Hadoop Distributed Installation

I. Overview

  • hadoop-3.1.1.tar.gz
  • 172.16.233.137 (master node)
  • 172.16.233.138 (worker node)
  • 172.16.233.139 (worker node)

II. Environment Preparation (identical on all three machines)

1. Configure hosts

vi /etc/hosts

172.16.233.137 node1
172.16.233.138 node2
172.16.233.139 node3
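
Before moving on, it is worth confirming that each host resolves the names (getent reads /etc/hosts directly):

getent hosts node1 node2 node3
ping -c 1 node2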

2. Set up passwordless SSH between the three machines

# Generate a key pair on each of node1, node2, and node3 (the resulting files are in /root/.ssh/)
ssh-keygen -t rsa

# Copy id_rsa.pub from node2 and node3 to node1's /root/.ssh/ directory (as node2_id_rsa.pub and node3_id_rsa.pub), and delete authorized_keys on node2 and node3
# On node1, append id_rsa.pub, node2_id_rsa.pub, and node3_id_rsa.pub to authorized_keys
cat id_rsa.pub >> authorized_keys
cat node2_id_rsa.pub >> authorized_keys
cat node3_id_rsa.pub >> authorized_keys

# Copy authorized_keys from node1 to node2 and node3
scp authorized_keys root@node2:/root/.ssh/
scp authorized_keys root@node3:/root/.ssh/
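
With the keys distributed, each node should be able to reach the others without a password. A quick check from node1 (the very first connection may still ask to confirm the host fingerprint):

ssh root@node2 hostname    # should print "node2" with no password prompt
ssh root@node3 hostname    # should print "node3"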

III. Installation (identical on all three machines)

Unpack hadoop-3.1.1.tar.gz, then work from the extracted directory:

[root@node1 hadoop]# cd hadoop-3.1.1
[root@node1 hadoop-3.1.1]# ls
bin  etc  include  lib  libexec  LICENSE.txt  logs  NOTICE.txt  README.txt  sbin  share
[root@node1 hadoop-3.1.1]#

1. Configure hadoop-env.sh; add the following (the HDFS_*_USER variables are required in Hadoop 3 when running the daemons as root):

# export HDFS_NAMENODE_USER=hdfs
export JAVA_HOME=/data/local/jdk1.8.0_191
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
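
A mistyped JAVA_HOME is a common reason for the daemons failing to start later, so it is worth verifying the path before continuing:

ls /data/local/jdk1.8.0_191/bin/java
/data/local/jdk1.8.0_191/bin/java -version    # should report version 1.8.0_191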

2. Configure core-site.xml (master node settings); add:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://node1:9820</value>
    </property>
    <!-- Path is customizable: where the namenode and datanode store their data -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/data/local/hadoop/data</value>
    </property>
</configuration>
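
Once saved, hdfs getconf can read the value back, which catches XML typos early (run from the hadoop-3.1.1 directory):

./bin/hdfs getconf -confKey fs.defaultFS    # expected: hdfs://node1:9820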

3. Configure hdfs-site.xml (worker node settings); add:

<configuration>
    <!-- Replication factor: should be <= the number of datanodes -->
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <!-- Secondary namenode: merges the edit log into the fsimage -->
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>node2:9868</value>
    </property>
    <!-- Set dfs.namenode.datanode.registration.ip-hostname-check when the cluster is not configured with IP-to-hostname mappings -->
    <!--
    <property>
        <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
        <value>false</value>
    </property>
    -->
</configuration>
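
The same read-back check works for hdfs-site.xml:

./bin/hdfs getconf -confKey dfs.replication                         # expected: 2
./bin/hdfs getconf -confKey dfs.namenode.secondary.http-address    # expected: node2:9868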

4. Configure the worker nodes in the workers file; add:

node2
node3
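
All three machines must carry identical configuration files. One way to keep them in sync is to edit everything on node1 and push the config directory out; the target path below is an assumption, adjust it to wherever hadoop-3.1.1 was unpacked on node2 and node3:

scp etc/hadoop/* root@node2:/data/local/hadoop/hadoop-3.1.1/etc/hadoop/
scp etc/hadoop/* root@node3:/data/local/hadoop/hadoop-3.1.1/etc/hadoop/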

5. Start HDFS

./bin/hdfs namenode -format	# first startup only: generates the metadata and the clusterID
./sbin/start-dfs.sh
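
After start-dfs.sh returns, a small smoke test confirms that the DataNodes registered and that the filesystem accepts writes:

./bin/hdfs dfsadmin -report          # should list 2 live datanodes (node2 and node3)
./bin/hdfs dfs -mkdir /test
./bin/hdfs dfs -put README.txt /test
./bin/hdfs dfs -ls /test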

6. Check the processes

# node1
[root@node1 hadoop-3.1.1]# jps
10657 NameNode
11515 Jps
[root@node1 hadoop-3.1.1]#

# node2
[root@node2 hadoop-3.1.1]# jps
2660 SecondaryNameNode
4022 Jps
2584 DataNode
[root@node2 hadoop-3.1.1]#

# node3
[root@node3 hadoop-3.1.1]# jps
2184 DataNode
3131 Jps
[root@node3 hadoop-3.1.1]#

7. Visit node1:9870 (the NameNode web UI)
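
If the page does not load, first check from node1 that the web UI port is actually listening before suspecting the browser or a firewall:

curl -s -o /dev/null -w "%{http_code}\n" http://node1:9870/    # expected: 200
ss -lntp | grep 9870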
