Setting up a Spark cluster alongside YARN, with a simple example
Upload the Spark tarball to the server
Extract it
tar -zxvf spark-2.2.1-bin-hadoop2.6.tgz
Change into the conf directory
cd /home/spark-2.2.1-bin-hadoop2.6/conf/
Edit the configuration file spark-env.sh
mv spark-env.sh.template spark-env.sh
vim spark-env.sh
JAVA_HOME=/usr/java/jdk1.8.0_202
SPARK_MASTER_HOST=node01
SPARK_MASTER_PORT=7077
SPARK_WORKER_CORES=2
SPARK_WORKER_MEMORY=1g
HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
JAVA_HOME: path to the Java installation
SPARK_MASTER_HOST: hostname (or IP) of the master node
SPARK_MASTER_PORT: port jobs are submitted to; defaults to 7077
SPARK_WORKER_CORES: number of cores each worker node may use
SPARK_WORKER_MEMORY: amount of memory each worker node may use
HADOOP_CONF_DIR: Hadoop configuration directory; needed so Spark can find YARN when submitting with --master yarn
Edit the slaves file (add the worker nodes)
mv slaves.template slaves
vim slaves
node02
node03
Sync the installation to the other nodes
scp -r /home/spark-2.2.1-bin-hadoop2.6/ node02:/home/
scp -r /home/spark-2.2.1-bin-hadoop2.6/ node03:/home/
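The two copies above can also be written as one loop over the worker list. This is a sketch that only prints the commands as a dry run (drop the `echo` to actually copy); it assumes passwordless SSH from node01 to the workers.

```shell
# Loop over the workers from conf/slaves and copy the Spark directory.
# `echo` makes this a dry run; remove it to perform the real copy.
SPARK_DIR=/home/spark-2.2.1-bin-hadoop2.6
for node in node02 node03; do
  echo scp -r "$SPARK_DIR/" "$node:/home/"
done
```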
Start the cluster
Change into the sbin directory
cd /home/spark-2.2.1-bin-hadoop2.6/sbin/
Start
./start-all.sh
Verify: open the master web UI on port 8080 (http://node01:8080)
Start ZooKeeper
Stop the firewall first
service iptables stop
zkServer.sh start
Start Hadoop 2.x (note this is Hadoop's start-all.sh on the PATH, not the Spark script used above)
start-all.sh
Simple example: computing π
Change into the bin directory
cd /home/spark-2.2.1-bin-hadoop2.6/bin/
Run in client mode (the driver runs locally inside spark-submit, so the result prints to the console)
./spark-submit --master yarn --class org.apache.spark.examples.SparkPi ../examples/jars/spark-examples_2.11-2.2.1.jar 100
Run in cluster mode (the driver runs inside a YARN container, so the result goes to that container's logs)
./spark-submit --master yarn --deploy-mode cluster --class org.apache.spark.examples.SparkPi ../examples/jars/spark-examples_2.11-2.2.1.jar 100
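After a cluster-mode run, the "Pi is roughly ..." line is not in the local console but in the driver container's logs. A way to retrieve it, assuming YARN log aggregation is enabled and with `<applicationId>` standing for the ID that spark-submit prints:

```shell
# Fetch the aggregated container logs for a finished cluster-mode run
# and pick out the driver's result line. <applicationId> is a placeholder.
yarn logs -applicationId <applicationId> | grep "Pi is roughly"
```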
Killing a running application
yarn application -kill <applicationId>
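If you don't have the application ID at hand, list the applications known to YARN first; the first column of the output is the ID to pass to `-kill`:

```shell
# List applications submitted to YARN; the first column is the application ID.
yarn application -list
# Then kill the one you want (placeholder ID):
yarn application -kill <applicationId>
```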