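ZooKeeper Cluster Configuration
Preface: ZooKeeper plays an important role in the big data ecosystem. It is an open-source distributed framework that provides basic services for coordinating distributed applications, exposing a common set of services to external applications: distributed synchronization, naming service, and group maintenance. Its purpose is to reduce the difficulty of coordinating and managing distributed applications and to provide high-performance distributed services. ZooKeeper can be installed and run in standalone mode, but its strength lies in running as a distributed ZooKeeper cluster (one leader, multiple followers), which relies on a set of policies to keep the cluster stable and available and so makes distributed applications reliable.
This article builds a three-node ZooKeeper cluster in which an election selects one leader and two followers to keep the cluster stable.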
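I. Preparation
Three virtual machines:
master: 192.168.163.145
worker1: 192.168.163.146
worker2: 192.168.163.147
Java: JDK 1.8.x
SSH connectivity between the three virtual machines
All of the above is covered in detail in the author's article on installing and configuring a Hadoop cluster: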
https://blog.****.net/yangang1223/article/details/79883113
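II. Installing ZooKeeper
1. Download the stable ZooKeeper 3.4.6 release from the official ZooKeeper website or from a mirror such as the Tsinghua University or Alibaba Cloud mirror site.
Upload it to /app/ on the master node and extract it. The paths used in the rest of this article assume the extracted directory ends up at /app/zookeeper/zookeeper-3.4.6.
$ tar -zxvf zookeeper-3.4.6.tar.gz
2. Go into ZooKeeper's conf directory and copy zoo_sample.cfg to zoo.cfg
$ cp zoo_sample.cfg zoo.cfg
Edit zoo.cfg as follows: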
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/app/zookeeper/zookeeper-3.4.6/data
dataLogDir=/app/zookeeper/zookeeper-3.4.6/log
# the port at which the clients will connect
clientPort=2181
server.1=master:2888:3888
server.2=worker1:2888:3888
server.3=worker2:2888:3888
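Then create the dataDir and dataLogDir directories configured in zoo.cfg,
and create a file named myid in dataDir containing "1". This number must match the N of the server.N line for this host in zoo.cfg (master is server.1).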
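The directory and myid setup described above can be sketched as a small helper. This is only an illustration: the function name init_zk_dirs is hypothetical, and it assumes dataDir and dataLogDir are the data and log subdirectories of the ZooKeeper home, as in the zoo.cfg shown earlier.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create dataDir/dataLogDir under a ZooKeeper home
# and write this server's id into dataDir/myid.
# Assumes dataDir=<home>/data and dataLogDir=<home>/log, as in the zoo.cfg above.
init_zk_dirs() {
  local zk_home="$1" id="$2"
  mkdir -p "$zk_home/data" "$zk_home/log"
  echo "$id" > "$zk_home/data/myid"
}

# On master (server.1) this would be called as:
#   init_zk_dirs /app/zookeeper/zookeeper-3.4.6 1
```

Passing the id as an argument lets the same helper be reused on worker1 and worker2 with 2 and 3.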
3. Distribute ZooKeeper to the other two nodes
[[email protected] app]$ scp -r zookeeper [email protected]:/app/
[[email protected] app]$ scp -r zookeeper [email protected]:/app/
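With more than a couple of nodes, the same copy can be written as a loop. A minimal sketch, assuming the source directory is /app/zookeeper, the target directory on every node is /app/, and scp logs in as the current user; the function name distribute_zk is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a ZooKeeper directory to /app/ on each listed host.
# Assumes scp can log in to every host as the current user.
distribute_zk() {
  local src="$1" host
  shift
  for host in "$@"; do
    scp -r "$src" "$host:/app/"
  done
}

# Run on master:
#   distribute_zk /app/zookeeper worker1 worker2
```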
4. On worker1 and worker2, change the content of the myid file in ZooKeeper's data directory to 2 and 3 respectively
[[email protected] zookeeper-3.4.6]$ cd data/
[[email protected] data]$ ls
myid
[[email protected] data]$ echo "2" >myid
[[email protected] data]$ cat myid
2
[[email protected] data]$
[[email protected] zookeeper-3.4.6]$ cd data/
[[email protected] data]$ ls
myid
[[email protected] data]$ echo "3" >myid
[[email protected] data]$ cat myid
3
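A mismatch between myid and the server.N entries in zoo.cfg is an easy mistake to make after copying the directory around. The check below is a rough sketch of how each node could be verified; check_myid is a hypothetical name, and the two file paths are passed in by the caller.

```shell
#!/usr/bin/env bash
# Hypothetical check: verify that the id stored in myid has a matching
# server.<id>= line in zoo.cfg. Returns non-zero when no entry is found.
check_myid() {
  local cfg="$1" myid_file="$2" id
  id="$(tr -d '[:space:]' < "$myid_file")"
  if grep -q "^server\.${id}=" "$cfg"; then
    echo "myid $id matches: $(grep "^server\.${id}=" "$cfg")"
  else
    echo "myid '$id' has no server.$id entry in $cfg" >&2
    return 1
  fi
}

# On any node:
#   check_myid /app/zookeeper/zookeeper-3.4.6/conf/zoo.cfg \
#              /app/zookeeper/zookeeper-3.4.6/data/myid
```

The check only confirms that an entry exists for the id; whether that entry names the current host still has to be compared by eye.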
For convenience, you can also add ZOOKEEPER_HOME and its bin directory to your environment variables; this step is optional.
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
JAVA_HOME=/app/java/jdk1.8.0_141
HADOOP_HOME=/app/hadoop/hadoop-2.7.3
SCALA_HOME=/app/scala/scala-2.11.8
SPARK_HOME=/app/spark/spark-2.1.1
ZOOKEEPER_HOME=/app/zookeeper/zookeeper-3.4.6
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$SCALA_HOME/bin:$SPARK_HOME/bin:$SPARK_HOME/sbin:$ZOOKEEPER_HOME/bin
export PATH
5. Distribute ~/.bash_profile to every node and run source ~/.bash_profile on each to complete the configuration
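Instead of copying the whole ~/.bash_profile, which would overwrite any per-node settings, the ZooKeeper lines alone could be appended on each node. A minimal sketch, with the hypothetical helper name add_zk_env; it does nothing if a ZOOKEEPER_HOME line is already present.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append ZOOKEEPER_HOME and its bin directory to a
# profile file, unless a ZOOKEEPER_HOME line already exists there.
add_zk_env() {
  local profile="$1" zk_home="$2"
  grep -q '^ZOOKEEPER_HOME=' "$profile" 2>/dev/null && return 0
  {
    echo "ZOOKEEPER_HOME=$zk_home"
    echo 'PATH=$PATH:$ZOOKEEPER_HOME/bin'
    echo 'export ZOOKEEPER_HOME PATH'
  } >> "$profile"
}

# On each node:
#   add_zk_env ~/.bash_profile /app/zookeeper/zookeeper-3.4.6
#   source ~/.bash_profile
```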
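III. Starting the ZooKeeper Cluster
Start the three nodes one after another, or write a shell script that starts all three machines with a single command.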
[[email protected] zookeeper-3.4.6]$ zkServer.sh start
JMX enabled by default
Using config: /app/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[[email protected] conf]$ zkServer.sh start
JMX enabled by default
Using config: /app/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[[email protected] conf]$ zkServer.sh start
JMX enabled by default
Using config: /app/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
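The one-command startup script mentioned at the beginning of this section could look roughly like the following. It is a sketch under several assumptions: SSH logins from the machine running it to all three hosts work without prompting, ~/.bash_profile on each node puts zkServer.sh on the PATH (the optional step above), and the names ZK_HOSTS and zk_all are made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run "zkServer.sh <action>" on every node over SSH.
# Assumes non-interactive SSH access and that ~/.bash_profile on each node
# adds $ZOOKEEPER_HOME/bin to PATH.
ZK_HOSTS="${ZK_HOSTS:-master worker1 worker2}"

zk_all() {
  local action="$1" host
  for host in $ZK_HOSTS; do
    echo "== $host: zkServer.sh $action"
    # A non-login SSH command does not read ~/.bash_profile, so source it explicitly.
    ssh "$host" "source ~/.bash_profile && zkServer.sh $action"
  done
}

# Examples:
#   zk_all start
#   zk_all status
#   zk_all stop
```

Sourcing the profile inside the remote command matters because a command passed to ssh does not run in a login shell, so JAVA_HOME and the PATH entries would otherwise be missing.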
Once all three nodes are started, we can check their status: worker1 was elected leader, and master and worker2 became followers.
[[email protected] zookeeper-3.4.6]$ zkServer.sh status
Check the processes with jps; a running ZooKeeper server shows up as a QuorumPeerMain process.
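As a complement to zkServer.sh status, each node's role can also be read from the client port with ZooKeeper's srvr four-letter command. The sketch below assumes nc (netcat) is installed and that every node listens on clientPort 2181 as configured above; zk_mode and zk_roles are hypothetical names.

```shell
#!/usr/bin/env bash
# Extract the value of the "Mode:" line (leader, follower or standalone)
# from "srvr" output read on stdin.
zk_mode() {
  awk -F': ' '/^Mode:/ { print $2 }'
}

# Hypothetical helper: ask each host for its role via the "srvr" command.
# Assumes nc is available and clientPort is 2181 on every node.
zk_roles() {
  local host mode
  for host in "$@"; do
    mode="$(echo srvr | nc -w 2 "$host" 2181 | zk_mode)"
    echo "$host: ${mode:-unreachable}"
  done
}

# Example:
#   zk_roles master worker1 worker2
```

A node that is down, or one that has not yet joined a quorum, produces no Mode line and is reported as unreachable.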
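IV. Notes
1. The clocks of the three machines must be kept in sync.
2. If you check the status right after starting only one machine, you may see an error. Each node is still trying to connect to the others, and a three-node cluster needs a majority (two nodes) running before a leader can be elected. Once all three are started, the status settles and reports correctly.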