YARN Setup

1. Cluster Layout

HA setup with QJM

hadooclu1, hadooclu2: NameNode (NN), ZKFC
hadooclu1, hadooclu2, hadooclu3: JournalNode (JN)
hadooclu2, hadooclu3, hadooclu4: DataNode (DN), NodeManager (NM)
hadooclu2, hadooclu3, hadooclu4: ZooKeeper (ZK)
hadooclu2, hadooclu4: ResourceManager (RM)

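Note: keep DataNodes and NodeManagers in a 1:1 relationship, i.e. run both on the same hosts, so that computation can move to where the data lives.
The NameNode needs a separate ZKFC role for high availability because the NameNode was not originally designed with HA in mind. The ResourceManager, by contrast, has its HA logic built in, so it needs no separate failover process.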
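
One way to keep the DN/NM pairing is the workers file: in Hadoop 3.x both start-dfs.sh and start-yarn.sh read etc/hadoop/workers to decide where to start DataNodes and NodeManagers. The sketch below writes that file for the layout above. The helper name write_workers is made up for illustration, and the file path in the usage comment assumes HADOOP_HOME is set.

```shell
#!/usr/bin/env bash
# Write one worker hostname per line into the given file.
# write_workers is a hypothetical helper, not part of Hadoop.
write_workers() {
    local workers_file="$1"; shift
    printf '%s\n' "$@" > "$workers_file"
}

# Usage (path assumes HADOOP_HOME is set; hosts come from the layout above):
#   write_workers "$HADOOP_HOME/etc/hadoop/workers" hadooclu2 hadooclu3 hadooclu4
```

Since both start scripts read the same list, every host in it gets a DataNode and a NodeManager.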

2. Official Documentation

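The official site provides the configuration reference and a description of each property.
The screenshots below show the basic configuration items: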
[Three screenshots of the official configuration page]
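The section shown in the screenshots above is a minimal configuration. To set other options, refer to the detailed descriptions on that page.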

3. Configuration Changes

3.1 mapred-site.xml

<configuration>
    <!-- Run MapReduce jobs on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <!-- Classpath for MapReduce applications -->
    <property>
        <name>mapreduce.application.classpath</name>
        <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
    </property>
</configuration>

3.2 yarn-site.xml

<configuration>
    <!-- Shuffle service required by MapReduce -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>

    <!-- ResourceManager HA -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>cluster1</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <!-- RM hosts: hadooclu2 and hadooclu4 in this cluster -->
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>hadooclu2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>hadooclu4</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>hadooclu2:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm2</name>
        <value>hadooclu4:8088</value>
    </property>
    <!-- ZooKeeper quorum: hadooclu2, hadooclu3, hadooclu4 in this cluster -->
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>hadooclu2:2181,hadooclu3:2181,hadooclu4:2181</value>
    </property>
</configuration>

Once the configuration is complete, distribute it to the other nodes.
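
The distribution step could be scripted roughly as follows. This is a sketch under several assumptions that the original does not state: the files were edited on hadooclu1, passwordless SSH is set up, and Hadoop is installed at the same path on every node (the /opt/hadoop fallback is purely hypothetical). By default the script only prints the scp commands; set DRY_RUN=0 to actually copy.

```shell
#!/usr/bin/env bash
# Copy the two edited config files to each target host.
# Assumption: same install path everywhere; /opt/hadoop is a hypothetical fallback.
CONF_DIR="${HADOOP_HOME:-/opt/hadoop}/etc/hadoop"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = run scp

distribute_conf() {
    local conf_dir="$1"; shift
    local host file
    for host in "$@"; do
        for file in mapred-site.xml yarn-site.xml; do
            if [ "$DRY_RUN" = "1" ]; then
                echo "scp $conf_dir/$file $host:$conf_dir/"
            else
                scp "$conf_dir/$file" "$host:$conf_dir/"
            fi
        done
    done
}

distribute_conf "$CONF_DIR" hadooclu2 hadooclu3 hadooclu4
```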

4. Starting YARN

start-yarn.sh
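If a ResourceManager did not start, start it manually on that node. The daemon name is lowercase; in Hadoop 3.x, yarn-daemon.sh is deprecated in favor of yarn --daemon:
yarn --daemon start resourcemanager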
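
To confirm that HA is working after startup, each ResourceManager can be asked for its state with yarn rmadmin -getServiceState. A minimal sketch, where check_rm_states is a made-up helper name and rm1/rm2 are the IDs from yarn.resourcemanager.ha.rm-ids:

```shell
#!/usr/bin/env bash
# Print the HA state reported by each ResourceManager ID.
# check_rm_states is a hypothetical helper, not part of Hadoop.
check_rm_states() {
    local id state
    for id in "$@"; do
        # Fall back to "unreachable" when the command fails.
        state="$(yarn rmadmin -getServiceState "$id" 2>/dev/null)" || state="unreachable"
        echo "$id: $state"
    done
}

check_rm_states rm1 rm2
```

In a healthy cluster, one ID should report active and the other standby.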

5. Running an Example

The output directory must be a new one that does not exist yet; Hadoop creates it itself.
hadoop jar hadoop/hadoop-3.0.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.3.jar wordcount /input /output
List the result files:
hdfs dfs -ls /input /output
Download the results:
hdfs dfs -get <result directory> <local target directory>
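
Because the job fails if the output directory already exists, a re-run needs the old directory removed first. The sketch below wraps that cleanup and a quick look at the results; both function names are made up for illustration, and /output is the directory used in the wordcount command above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers around the hdfs dfs commands; not part of Hadoop.

# Remove the output directory if it is left over from a previous run.
prepare_output_dir() {
    local out_dir="$1"
    # -test -d exits with 0 when the path exists and is a directory.
    if hdfs dfs -test -d "$out_dir" 2>/dev/null; then
        hdfs dfs -rm -r "$out_dir"
    fi
}

# Print the reducer output; wordcount writes part-r-NNNNN files.
show_result() {
    local out_dir="$1"
    # The glob is quoted so that HDFS, not the local shell, expands it.
    hdfs dfs -cat "$out_dir/part-r-*"
}

# Usage:
#   prepare_output_dir /output   # before re-running the job
#   show_result /output          # after the job finishes
```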