Setting up a Hadoop Cluster with Docker

https://blog.csdn.net/qq_33530388/article/details/72811705

 

External access

To reach a service from outside when its port was not published with -p, add a DNAT rule to Docker's NAT chain manually (replace 172.17.0.3 with the Master container's IP):

iptables -t nat -A DOCKER -p tcp --dport 50070 -j DNAT --to-destination 172.17.0.3:50070

  

------------------------------------------------------------

1 Pull the Hadoop image

docker pull registry.cn-hangzhou.aliyuncs.com/kaibb/hadoop

2 Create the containers (one master, three slaves)

Ports to publish on the master:

    50070        // NameNode HTTP port
    50075        // DataNode HTTP port
    50090        // SecondaryNameNode HTTP port

    8020         // NameNode RPC port
    50010        // DataNode data-transfer port

docker run -i -t --name Master -h Master -p 8020:8020 -p 8485:8485 -p 50010:50010 -p 50020:50020 -p 50070:50070 -p 50075:50075 -p 50470:50470 -p 50475:50475 -p 50090:50090 registry.cn-hangzhou.aliyuncs.com/kaibb/hadoop /bin/bash

docker run -i -t --name Slave1 -h Slave1 registry.cn-hangzhou.aliyuncs.com/kaibb/hadoop /bin/bash

docker run -i -t --name Slave2 -h Slave2 registry.cn-hangzhou.aliyuncs.com/kaibb/hadoop /bin/bash

docker run -i -t --name Slave3 -h Slave3 registry.cn-hangzhou.aliyuncs.com/kaibb/hadoop /bin/bash


3 Start the four containers and attach to them

docker start <container_id>

docker exec -it <container_name> /bin/bash

4 Passwordless SSH between the four machines

On each container, start sshd and generate a key pair:

 /etc/init.d/ssh start
 ssh-keygen -t rsa


cd ~/.ssh
cat id_rsa.pub > authorized_keys
cat authorized_keys


Repeat this on all four machines. Then merge the contents of every machine's authorized_keys into one file and copy it back over authorized_keys on each machine, so that all four machines end up with an identical authorized_keys.
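The merge step can be scripted. Below is a self-contained sketch using throwaway files in a temp directory; in practice you would gather each node's real ~/.ssh/id_rsa.pub (e.g. via scp) instead of the fake keys generated here:

```shell
# Demo of merging per-node public keys into one authorized_keys.
# The *.pub files are stand-ins for each node's ~/.ssh/id_rsa.pub.
tmp=$(mktemp -d)
for h in Master Slave1 Slave2 Slave3; do
  printf 'ssh-rsa AAAA_fake_key_%s root@%s\n' "$h" "$h" > "$tmp/$h.pub"
done
# Concatenate, de-duplicate, and lock down permissions.
cat "$tmp"/*.pub | sort -u > "$tmp/authorized_keys"
chmod 600 "$tmp/authorized_keys"
wc -l < "$tmp/authorized_keys"    # one line per node: 4
```

The resulting file is what gets copied back over ~/.ssh/authorized_keys on all four machines.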


Edit the hosts file on all four machines. Use ip address to get each machine's IP, then add:

172.17.0.11    Master
172.17.0.15    Slave1
172.17.0.16    Slave2
172.17.0.17    Slave3
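The entries above can be appended with a single heredoc. The sketch below writes to a temp file so it can be run safely anywhere; on the real containers the target is /etc/hosts, and the 172.17.0.x addresses should be whatever ip address reported on your setup:

```shell
# Append the cluster name-to-IP mapping (IPs are this walkthrough's; adjust).
hosts_file=$(mktemp)          # stand-in for /etc/hosts
cat >> "$hosts_file" <<'EOF'
172.17.0.11    Master
172.17.0.15    Slave1
172.17.0.16    Slave2
172.17.0.17    Slave3
EOF
grep -c '^172\.17\.' "$hosts_file"   # 4 entries added
```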

 

5 Edit the slaves file (its contents are listed under step 6)


 

6 Edit the five configuration files under /opt/tools/hadoop/etc/hadoop:

cd /opt/tools/hadoop/etc/hadoop

Clear each file first, then edit it:

echo "" > core-site.xml

echo "" > yarn-site.xml

echo "" > hdfs-site.xml

echo "" > mapred-site.xml

vi core-site.xml

core-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://Master/</value>
  </property>
</configuration>
-----------------------------------------------

yarn-site.xml:
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>Master</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
 

-----------------------------------------------

hdfs-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
-----------------------------------------------

mapred-site.xml:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
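One caveat on mapred-site.xml as written: mapred.job.tracker is the old MRv1 JobTracker address, and Hadoop 2 ignores it once mapreduce.framework.name is set to yarn. A minimal file for this YARN setup (my assumption, not from the original walkthrough) could therefore be just:

```xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
```

Keeping the extra property is harmless; it is simply dead configuration under YARN.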

=====

 

hadoop-env.sh: point JAVA_HOME at the container's JDK

export JAVA_HOME=/opt/tools/jdk1.8.0_77

vi slaves

Slave1
Slave2
Slave3
 

================================================

7 拷贝到slave机器

scp core-site.xml hadoop-env.sh hdfs-site.xml mapred-site.xml yarn-site.xml slaves Slave1:/opt/tools/hadoop/etc/hadoop/

scp core-site.xml hadoop-env.sh hdfs-site.xml mapred-site.xml yarn-site.xml slaves Slave2:/opt/tools/hadoop/etc/hadoop/

scp core-site.xml hadoop-env.sh hdfs-site.xml mapred-site.xml yarn-site.xml slaves Slave3:/opt/tools/hadoop/etc/hadoop/
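The three copies above collapse into one loop. The sketch below is a dry run that only prints each command (remove the echo to actually copy; passwordless SSH from step 4 is assumed):

```shell
# Dry run: print the scp command for each slave instead of executing it.
# $files is deliberately unquoted so it word-splits into separate arguments.
files="core-site.xml hadoop-env.sh hdfs-site.xml mapred-site.xml yarn-site.xml slaves"
cmds=$(for h in Slave1 Slave2 Slave3; do
  echo scp $files "$h":/opt/tools/hadoop/etc/hadoop/
done)
printf '%s\n' "$cmds"
```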

 

8 Format, start, stop

Run under /opt/tools/hadoop/etc/hadoop/:

 hadoop namenode -format

Note: before formatting again, clear the tmp directory on every node. rm only operates on local paths, so clean the slaves over SSH:

rm -rf /tmp/*

ssh Slave1 'rm -rf /tmp/*'

ssh Slave2 'rm -rf /tmp/*'

ssh Slave3 'rm -rf /tmp/*'

 

cd ../../sbin

 

./stop-all.sh
./start-all.sh

 

9 Verify the daemons

jps

Typically the Master shows NameNode, SecondaryNameNode, and ResourceManager, and each slave shows DataNode and NodeManager.


Get a cluster status report:

hadoop dfsadmin -report

DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

Configured Capacity: 31304097792 (29.15 GB)
Present Capacity: 26686562304 (24.85 GB)
DFS Remaining: 26686476288 (24.85 GB)
DFS Used: 86016 (84 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (3):

Name: 172.17.0.15:50010 (Slave1)
Hostname: Slave1
Decommission Status : Normal
Configured Capacity: 10434699264 (9.72 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 1539166208 (1.43 GB)
DFS Remaining: 8895504384 (8.28 GB)
DFS Used%: 0.00%
DFS Remaining%: 85.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Oct 09 03:10:04 UTC 2018


Name: 172.17.0.17:50010 (Slave3)
Hostname: Slave3
Decommission Status : Normal
Configured Capacity: 10434699264 (9.72 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 1539203072 (1.43 GB)
DFS Remaining: 8895467520 (8.28 GB)
DFS Used%: 0.00%
DFS Remaining%: 85.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Oct 09 03:10:04 UTC 2018


Name: 172.17.0.16:50010 (Slave2)
Hostname: Slave2
Decommission Status : Normal
Configured Capacity: 10434699264 (9.72 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 1539166208 (1.43 GB)
DFS Remaining: 8895504384 (8.28 GB)
DFS Used%: 0.00%
DFS Remaining%: 85.25%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Oct 09 03:10:04 UTC 2018
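A quick sanity check can be scripted from this report: the sketch below extracts the live-DataNode count (fed a sample line here; in practice pipe the output of hdfs dfsadmin -report into it). A count of 3 means all three slaves registered.

```shell
# Extract the live DataNode count from a dfsadmin report.
sample='Live datanodes (3):'            # stand-in for the real report output
live=$(printf '%s\n' "$sample" | sed -n 's/^Live datanodes (\([0-9]*\)).*$/\1/p')
echo "$live"   # 3
```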