HDFS Distributed Cluster Setup
Required environment
Four hosts: one NameNode and three DataNodes
| Host  | NameNode | SecondaryNameNode | DataNode |
| ----- | -------- | ----------------- | -------- |
| node1 | 1        |                   |          |
| node2 |          | 1                 | 1        |
| node3 |          |                   | 1        |
| node4 |          |                   | 1        |
The steps below cover the NameNode host (node1); configure the remaining hosts the same way.
I. Change the hostname
(1) [root@node1 ~]# vim /etc/sysconfig/network
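On node1 the file might look like this (a sketch; the HOSTNAME line is the one that matters, and it should be adjusted on each host):
NETWORKING=yes
HOSTNAME=node1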
(2) Edit the /etc/hosts file on every host:
a) On the NameNode host, add the IP address and hostname of every node in the cluster.
b) On each DataNode, only its own entry and the NameNode's IP address and hostname are needed.
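A sketch of the NameNode's hosts file, assuming example addresses in 192.168.1.0/24 (substitute your real IPs):
192.168.1.101 node1
192.168.1.102 node2
192.168.1.103 node3
192.168.1.104 node4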
(3) Reboot the host: [root@node1 ~]# reboot
II. Time synchronization: ntpdate -u s1a.time.edu.cn
Disable the firewall: service iptables stop
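service iptables stop only disables the firewall until the next boot; to keep it off after the reboot above (an extra step not in the original notes, CentOS 6 style), you can also run:
[root@node1 ~]# chkconfig iptables off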
III. Passwordless SSH configuration
1. [root@node1 ~]# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
2. [root@node1 .ssh]# cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
3. Copy the id_dsa.pub file to every DataNode:
[root@node1 .ssh]# scp id_dsa.pub node2:/tmp/
[root@node1 .ssh]# scp id_dsa.pub node3:/tmp/
[root@node1 .ssh]# scp id_dsa.pub node4:/tmp/
4. Run the same steps on the other DataNodes; passwordless login is then fully configured.
node2:
[root@node2 ~]# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
[root@node2 ~]# cat /tmp/id_dsa.pub >> /root/.ssh/authorized_keys
node3:
[root@node3 ~]# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
[root@node3 ~]# cat /tmp/id_dsa.pub >> /root/.ssh/authorized_keys
node4:
[root@node4 ~]# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
[root@node4 ~]# cat /tmp/id_dsa.pub >> /root/.ssh/authorized_keys
5. From node1, ssh to each node to verify that no password is required:
[root@node1 ~]# ssh node1
[root@node1 ~]# ssh node2
[root@node1 ~]# ssh node3
[root@node1 ~]# ssh node4
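If ssh still prompts for a password, overly permissive modes on the .ssh directory are a common cause; a possible fix on the affected node (an extra troubleshooting step, not in the original notes):
[root@node2 ~]# chmod 700 /root/.ssh
[root@node2 ~]# chmod 600 /root/.ssh/authorized_keys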
IV. Install the JDK
jdk-8u131-linux-x64.rpm
1. Install the JDK
Here the JDK package is placed in the /home directory and installed with rpm:
[root@node1 home]# rpm -ivh /home/jdk*
2. Configure environment variables
[root@node1 ~]# vim /etc/profile and append the following at the end:
export JAVA_HOME=/usr/java/jdk1.8.0_131
export PATH=$PATH:$JAVA_HOME/bin
[root@node1 ~]# source /etc/profile
[root@node1 ~]# java -version
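If the JDK is on the PATH, java -version reports the installed version, roughly like this (exact build strings may differ):
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)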
3. Install the JDK on the other hosts in the same way.
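A minimal sketch of repeating the install remotely, assuming the same /home path on every host (the passwordless ssh from step III makes this non-interactive); repeat for node3 and node4, and add the same JAVA_HOME lines to each host's /etc/profile:
[root@node1 home]# scp jdk-8u131-linux-x64.rpm node2:/home/
[root@node1 home]# ssh node2 "rpm -ivh /home/jdk-8u131-linux-x64.rpm"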
V. Install Hadoop
1. Upload and extract the Hadoop package
hadoop-2.8.0.tar.gz; here the package is placed in the /home directory and extracted into the current directory:
[root@node1 home]# tar zxvf hadoop-2.8.0.tar.gz
2. Configure environment variables
[root@node1 hadoop-2.8.0]# vim /etc/profile and add the following two lines at the end:
export HADOOP_HOME=/home/hadoop-2.8.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
[root@node1 hadoop-2.8.0]# source /etc/profile
3. Check that the configuration works: the hdfs command should print its usage message, and typing start followed by Tab should complete to the start-dfs.sh script from $HADOOP_HOME/sbin.
[root@node1 hadoop-2.8.0]# hdfs
[root@node1 hadoop-2.8.0]# start    (press Tab to check completion)
VI. Edit the Hadoop configuration files on the NameNode first, then sync them to the other nodes
(1) Files to edit
hadoop-env.sh: set JAVA_HOME
core-site.xml
hdfs-site.xml
slaves: lists the DataNodes
masters (created manually): specifies the SecondaryNameNode (SNN)
1. Configure hadoop-env.sh
Change JAVA_HOME to the actual JDK path; the path can be checked with echo $JAVA_HOME (or :!echo $JAVA_HOME from inside vim).
[root@node1 hadoop]# vim hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.8.0_131
2. Configure core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://node1:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop</value>
</property>
</configuration>
3. Configure hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>node2:50090</value>
</property>
</configuration>
4. Configure masters: enter node2
[root@node1 hadoop]# vim masters
5. Configure slaves: enter the DataNode hostnames (node2, node3, node4)
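With the role layout from the table at the top, the two files end up looking like this:
[root@node1 hadoop]# cat masters
node2
[root@node1 hadoop]# cat slaves
node2
node3
node4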
(2) Sync the configuration files
Copy the hadoop-2.8.0 directory under /home to the other DataNodes:
[root@node1 home]# scp -r hadoop-2.8.0/ node2:/home/
[root@node1 home]# scp -r hadoop-2.8.0/ node3:/home/
[root@node1 home]# scp -r hadoop-2.8.0/ node4:/home/
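An equivalent loop form, if you prefer a single command (assumes the same /home layout on every node):
[root@node1 home]# for n in node2 node3 node4; do scp -r hadoop-2.8.0/ $n:/home/; done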
VII. Format the NameNode
[root@node1 home]# hdfs namenode -format
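If formatting succeeds, the output should contain a line saying the storage directory has been successfully formatted; with hadoop.tmp.dir set to /opt/hadoop, the NameNode metadata should land under /opt/hadoop/dfs/name (my reading of the defaults, worth double-checking on your build):
[root@node1 home]# ls /opt/hadoop/dfs/name/current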
VIII. Start HDFS
[root@node1 ~]# start-dfs.sh
(To stop: [root@node1 ~]# stop-dfs.sh)
Use the jps command to check whether the daemons started successfully:
[root@node1 hadoop]# jps
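Given the role table at the top, a successful start should show roughly these processes on each host (plus Jps itself; PIDs will differ):
node1: NameNode
node2: SecondaryNameNode, DataNode
node3: DataNode
node4: DataNode
The NameNode web UI should also be reachable at http://node1:50070.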