【Spark in Practice】Installation guide: adding new nodes to the Spark big-data deployment platform
Plan for this case:
Existing cluster:
5 PC servers
OS: RedHat Enterprise Server 6.5
root user and password: password
hadoop user and password: password
New nodes to be added:
4 PC servers
OS: RedHat Enterprise Server 6.5
root user and password: password
hadoop user and password: password
Create the hadoop user on each server in the cluster:
[root@hadoop006 ~]# useradd hadoop
[root@hadoop006 ~]# passwd hadoop
Note: add the following on the new nodes (the existing nodes only need the four new entries appended).
[root@hadoop006 ~]# vim /etc/hosts
132.194.43.180 hadoop001
132.194.43.181 hadoop002
132.194.43.182 hadoop003
132.194.43.183 hadoop004
132.194.43.184 hadoop005 hivemysql
132.194.41.186 hadoop006
132.194.41.187 hadoop007
132.194.41.188 hadoop008
132.194.41.189 hadoop009
[root@hadoop006 ~]# vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=hadoop006
Note: set HOSTNAME to each new node's own name (hadoop006-hadoop009). Restart the network service to apply the change (service network restart); if it does not take effect, reboot the server.
Configure SSH mutual trust between the root users on all servers, and likewise between the hadoop users.
The following uses the root user as an example; configure the hadoop user the same way.
[root@hadoop001 ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): [Enter key]
Enter passphrase (empty for no passphrase): [Enter key]
Enter same passphrase again: [Enter key]
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
cd:5a:2a:bb:4a:49:97:8a:2d:70:19:18:60:56:9c:78 root@hadoop001
The key's randomart image is:
+--[ RSA 2048]----+
|+o+.. |
|o+ E |
|. o |
| o . o |
|. o . o S + |
| o + + + |
| o = . o |
| o o |
| ..o. |
+-----------------+
[root@hadoop001 ~]# ssh-copy-id root@132.194.41.186
root@132.194.41.186's password:
Now try logging into the machine, with "ssh 'root@132.194.41.186'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
[root@hadoop001 ~]# ssh 132.194.41.186
[root@hadoop006 ~]# exit
logout
Connection to 132.194.41.186 closed.
Note: repeat the above for each of the new nodes (132.194.41.186, 187, 188, 189).
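To cover all four new nodes in one pass, a minimal sketch of the same step as a loop (run from hadoop001; the full mesh of trust still requires repeating this from every node, and again for the hadoop user):
[root@hadoop001 ~]# for ip in 132.194.41.186 132.194.41.187 132.194.41.188 132.194.41.189; do ssh-copy-id root@$ip; done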
Note: configure the following on all new nodes.
[root@hadoop006 ~]# vim /etc/ntp.conf
[root@hadoop006 ~]# service ntpd start
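The document does not list the ntp.conf contents; a minimal sketch, assuming the new nodes sync time from hadoop001 (132.194.43.180) as the internal time source, so adjust to whatever NTP server the existing cluster already uses:
# /etc/ntp.conf on hadoop006-hadoop009 (assumed layout)
server 132.194.43.180 iburst
# fall back to the local clock if the time source is unreachable
server 127.127.1.0
fudge 127.127.1.0 stratum 10
driftfile /var/lib/ntp/drift
It is also worth enabling the service at boot with: chkconfig ntpd on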
Run this step as the root user.
Create these directories on every node; they hold HDFS data and temporary data.
The master nodes only need the directories created; the data nodes need the directories created and an independent disk mounted on each.
The number of directories depends on the number of mounted disks; in this example it is 3.
[root@hadoop001 ~]# mkdir -p /data/{disk1,disk2,disk3}
[root@hadoop001 ~]# chown -R hadoop:hadoop /data/{disk1,disk2,disk3}
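On the data nodes (including the new hadoop006-hadoop009) each /data/diskN should be the mount point of a separate disk; a minimal sketch, assuming the extra disks appear as /dev/sdb, /dev/sdc and /dev/sdd and are formatted ext4 (device names are assumptions, check with fdisk -l first):
[root@hadoop006 ~]# mkfs.ext4 /dev/sdb
[root@hadoop006 ~]# mount /dev/sdb /data/disk1
[root@hadoop006 ~]# echo "/dev/sdb /data/disk1 ext4 defaults,noatime 0 0" >> /etc/fstab
Repeat for /dev/sdc -> /data/disk2 and /dev/sdd -> /data/disk3, then re-run the chown above so the mounted directories belong to hadoop.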
[root@hadoop006 ~]# vim /etc/security/limits.conf, append at the end:
hadoop soft nofile 131072
hadoop hard nofile 131072
[root@hadoop006 ~]# vim /etc/security/limits.d/90-nproc.conf, append at the end:
hadoop soft nproc unlimited
hadoop hard nproc unlimited
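A quick check after the change: log in again as hadoop on the node and confirm the new limits are active (expected values follow the settings above):
[hadoop@hadoop006 ~]$ ulimit -n    # expect 131072
[hadoop@hadoop006 ~]$ ulimit -u    # expect unlimited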
7.1、Software package
Extract the beh software package into the /opt directory and change the ownership of /opt/beh to the hadoop user.
[root@hadoop001 ~]# scp /opt/beh.tar.gz root@132.194.41.186:/opt/
[root@hadoop006 ~]# tar -zxvf /opt/beh.tar.gz -C /opt/
[root@hadoop006 ~]# chown -R hadoop:hadoop /opt/beh
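The same has to happen on every new node; a minimal sketch that pushes and unpacks the package on all four in one loop (run from hadoop001, relying on the root SSH trust configured earlier):
[root@hadoop001 ~]# for ip in 132.194.41.186 132.194.41.187 132.194.41.188 132.194.41.189; do scp /opt/beh.tar.gz root@$ip:/opt/; ssh root@$ip "tar -zxf /opt/beh.tar.gz -C /opt/ && chown -R hadoop:hadoop /opt/beh"; done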
7.2、Modify environment variables
Modify the following file for the hadoop user (on the newly added nodes):
[root@hadoop006 ~]# vim /home/hadoop/.bash_profile, append:
source /opt/beh/conf/beh_env
source /opt/beh/conf/mysql_env
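A quick way to confirm the environment is picked up; HADOOP_HOME is an assumed variable name, check /opt/beh/conf/beh_env for the variables it actually exports:
[hadoop@hadoop006 ~]$ source ~/.bash_profile
[hadoop@hadoop006 ~]$ echo $HADOOP_HOME    # should point somewhere under /opt/beh if beh_env defines it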
7.3、Configuration files
7.3.1、spark
## This file lists all worker nodes; normally just add all the datanodes, one per line.
[root@hadoop001 ~]# vim /opt/beh/core/spark/conf/slaves
hadoop003
hadoop004
hadoop005
hadoop006
hadoop007
hadoop008
hadoop009
7.3.2、HDFS:
This step can be performed as the hadoop user. The host names listed below need to be adjusted to match your environment.
[root@hadoop001 ~]# vim /opt/beh/core/hadoop/etc/hadoop/slaves
## This file lists the hostnames of all datanode data nodes, one per line.
## In this example the datanodes are hadoop003~hadoop009.
hadoop003
hadoop004
hadoop005
hadoop006
hadoop007
hadoop008
hadoop009
7.3.3、yarn:
[root@hadoop001 ~]# vim /opt/beh/core/hadoop/etc/hadoop/yarn-site.xml
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/data/disk1/tmp/yarn/local,/data/disk2/tmp/yarn/local,/data/disk3/tmp/yarn/local</value>
  <final>false</final>
</property>
These paths must match the disks actually mounted on the node.
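The slaves files and yarn-site.xml edited above live under /opt/beh on each node, so the updated copies need to reach every node, including the four new ones; a minimal sketch pushing them out from hadoop001 as the hadoop user (host list and paths as used in this document):
[hadoop@hadoop001 ~]$ for h in hadoop002 hadoop003 hadoop004 hadoop005 hadoop006 hadoop007 hadoop008 hadoop009; do scp /opt/beh/core/spark/conf/slaves $h:/opt/beh/core/spark/conf/; scp /opt/beh/core/hadoop/etc/hadoop/slaves /opt/beh/core/hadoop/etc/hadoop/yarn-site.xml $h:/opt/beh/core/hadoop/etc/hadoop/; done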
7.4、Network performance test
The test tool is netperf. hadoop001 acts as the server; the other nodes act as clients and measure the transfer rate to hadoop001. The test commands are as follows.
On hadoop001:
Start the netserver service:
[root@hadoop001 ~]# /opt/beh/mon/netperf/bin/netserver
On hadoop002-hadoop009:
Run the test:
[root@hadoop006 ~]# /opt/beh/mon/netperf/bin/netperf -H 132.194.43.180 -t TCP_STREAM -l 10
[hadoop@hadoop001 ~]$ cd /opt/beh/core/hadoop/bin
8.1、YARN
Run on hadoop001:
[hadoop@hadoop001 ~]$ start-yarn.sh
Start the resourcemanager on hadoop002:
[hadoop@hadoop002 ~]$ yarn-daemon.sh start resourcemanager
Start the jobhistory server on hadoop001:
[hadoop@hadoop001 ~]$ mr-jobhistory-daemon.sh start historyserver
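Once the daemons are up, a quick sanity check that the four new nodes have actually joined the cluster (standard Hadoop commands, run as hadoop; the DataNodes will only show up once HDFS has been started on the new nodes):
[hadoop@hadoop001 ~]$ yarn node -list          # hadoop006-hadoop009 should appear as RUNNING NodeManagers
[hadoop@hadoop001 ~]$ hdfs dfsadmin -report    # the new DataNodes should be listed as live
[hadoop@hadoop006 ~]$ jps                      # DataNode / NodeManager processes on a new node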