Learning Distributed Architecture, Part 027: Installing, Configuring, and High-Availability Testing a ZooKeeper Cluster

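Dubbo recommends using ZooKeeper as the service registry.

A ZooKeeper cluster remains available to clients as long as more than half of its nodes are healthy. Because of this majority rule, a ZK cluster should have an odd number of nodes (2n+1; 3, 5, or 7 nodes are good choices): adding a fourth node to a three-node cluster still tolerates only one failure.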
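The majority rule above can be checked with a little arithmetic. The following is a minimal sketch; the function names zk_quorum and zk_tolerated_failures are made up for illustration and are not part of ZooKeeper.

```shell
# Hypothetical helpers for illustration only.
# Majority (quorum) size for an ensemble of n servers: floor(n/2) + 1.
zk_quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of servers that may fail while a majority is still alive.
zk_tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 4 5 6 7; do
  echo "servers=$n quorum=$(zk_quorum "$n") tolerated_failures=$(zk_tolerated_failures "$n")"
done
```

An even-sized ensemble tolerates no more failures than the odd-sized ensemble one node smaller, which is why odd sizes are preferred.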

ZooKeeper and Dubbo service cluster architecture

Server 1: 192.168.1.81, ports: 2181, 2881, 3881

Server 2: 192.168.1.82, ports: 2182, 2882, 3882

Server 3: 192.168.1.83, ports: 2183, 2883, 3883

I. Installing and configuring the cluster nodes

1. Edit the operating system's /etc/hosts file and add the IP-to-hostname mappings:

# zookeeper cluster servers
192.168.1.81 edu-zk-01
192.168.1.82 edu-zk-02
192.168.1.83 edu-zk-03

2. Download or upload zookeeper-3.4.6.tar.gz to the /home/wusc/zookeeper directory:

$ cd /home/wusc/zookeeper

$ wget http://apache.fayea.com/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz

3. Extract the ZooKeeper package and rename the extracted directory after the node number:

$ tar -zxvf zookeeper-3.4.6.tar.gz

Server 1:

$ mv zookeeper-3.4.6 node-01

Server 2:

$ mv zookeeper-3.4.6 node-02

Server 3:

$ mv zookeeper-3.4.6 node-03

4. Create the following directories under each ZooKeeper node directory:

$ cd /home/wusc/zookeeper/node-0X    # X is the node number (1, 2, or 3), here and below

$ mkdir data

$ mkdir logs

5. In the zookeeper/node-0X/conf directory, make a copy of zoo_sample.cfg and name it zoo.cfg:

$ cp zoo_sample.cfg zoo.cfg

6. Edit the zoo.cfg configuration file:

Configuration for zookeeper/node-01 (/home/wusc/zookeeper/node-01/conf/zoo.cfg):

tickTime=2000

initLimit=10

syncLimit=5

dataDir=/home/wusc/zookeeper/node-01/data

dataLogDir=/home/wusc/zookeeper/node-01/logs

clientPort=2181

server.1=edu-zk-01:2881:3881

server.2=edu-zk-02:2882:3882

server.3=edu-zk-03:2883:3883

Configuration for zookeeper/node-02 (/home/wusc/zookeeper/node-02/conf/zoo.cfg):

tickTime=2000

initLimit=10

syncLimit=5

dataDir=/home/wusc/zookeeper/node-02/data

dataLogDir=/home/wusc/zookeeper/node-02/logs

clientPort=2182

server.1=edu-zk-01:2881:3881

server.2=edu-zk-02:2882:3882

server.3=edu-zk-03:2883:3883

Configuration for zookeeper/node-03 (/home/wusc/zookeeper/node-03/conf/zoo.cfg):

tickTime=2000

initLimit=10

syncLimit=5

dataDir=/home/wusc/zookeeper/node-03/data

dataLogDir=/home/wusc/zookeeper/node-03/logs

clientPort=2183

server.1=edu-zk-01:2881:3881 

server.2=edu-zk-02:2882:3882 

server.3=edu-zk-03:2883:3883
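The three files above differ only in the node number embedded in dataDir, dataLogDir, and clientPort. As a rough sketch, a small shell function can print the file for a given node; render_zoo_cfg is a hypothetical helper name, not something shipped with ZooKeeper.

```shell
# render_zoo_cfg is a hypothetical helper, not part of ZooKeeper.
# It prints the zoo.cfg used in this article for node number $1 (1, 2, or 3).
render_zoo_cfg() {
  node=$1
  cat <<EOF
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/wusc/zookeeper/node-0$node/data
dataLogDir=/home/wusc/zookeeper/node-0$node/logs
clientPort=218$node
server.1=edu-zk-01:2881:3881
server.2=edu-zk-02:2882:3882
server.3=edu-zk-03:2883:3883
EOF
}

# Print the configuration for node 2.
render_zoo_cfg 2
```

On server 2 the output could then be redirected into /home/wusc/zookeeper/node-02/conf/zoo.cfg.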

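Parameter notes:

tickTime=2000

tickTime is the heartbeat interval, in milliseconds, between ZooKeeper servers and between clients and servers: a heartbeat is sent once every tickTime.

initLimit=10

initLimit is the maximum number of heartbeat intervals ZooKeeper tolerates while a client completes its initial connection. The client here is not a user client connecting to the ZooKeeper server, but a Follower server in the ensemble connecting to the Leader. If the ZooKeeper server has still not received a response after 10 heartbeat intervals (10 times tickTime), the connection is considered failed. The total time is 10*2000 ms = 20 seconds.

syncLimit=5

syncLimit is the maximum number of tickTime intervals allowed for one request-and-response exchange between the Leader and a Follower. The total time is 5*2000 ms = 10 seconds.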
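To make the arithmetic concrete, here is a small sketch that reads the three values from a copy of the configuration text and converts the two limits from ticks to milliseconds. The helper names cfg_value and limit_ms are assumptions made for this example.

```shell
# Values copied from the zoo.cfg shown in this article.
cfg='tickTime=2000
initLimit=10
syncLimit=5'

# Print the value of one key from the config text held in $cfg.
cfg_value() {
  printf '%s\n' "$cfg" | awk -F= -v key="$1" '$1 == key { print $2 }'
}

# Convert a limit expressed in ticks into milliseconds.
limit_ms() {
  echo $(( $(cfg_value "$1") * $(cfg_value tickTime) ))
}

echo "initLimit window: $(limit_ms initLimit) ms"
echo "syncLimit window: $(limit_ms syncLimit) ms"
```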

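dataDir=/home/wusc/zookeeper/node-01/data

dataDir is, as the name suggests, the directory where ZooKeeper stores its data. By default ZooKeeper also writes its transaction log files to this directory; in the configuration above, dataLogDir redirects them to the logs directory instead.

clientPort=2181

clientPort is the port that clients (applications) use to connect to the ZooKeeper server. ZooKeeper listens on this port for client requests.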

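server.A=B:C:D

server.1=edu-zk-01:2881:3881

server.2=edu-zk-02:2882:3882

server.3=edu-zk-03:2883:3883

A is a number identifying which server this is;

B is the server's IP address (or a hostname mapped to that IP address);

C is the first port, used for exchanging information between ensemble members: it is the port on which this server exchanges information with the Leader;

D is the port reserved for leader election, for example when the current Leader goes down.

Note: in a pseudo-cluster setup (several ZooKeeper instances on one machine), the instances cannot share communication ports, so each instance must be given its own port numbers.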
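The server.A=B:C:D layout described above can be illustrated by splitting one line into its four fields. This is only a sketch using plain shell parameter expansion; parse_server_line is a hypothetical name, and the sketch assumes the line has exactly the form shown in this article.

```shell
# parse_server_line is a hypothetical helper for illustration.
# Input:  one line of the form server.A=B:C:D
# Output: the four fields, labeled.
parse_server_line() {
  line=$1
  key=${line%%=*}            # server.A
  value=${line#*=}           # B:C:D
  id=${key#server.}          # A: server number
  host=${value%%:*}          # B: IP address or hostname
  ports=${value#*:}          # C:D
  peer_port=${ports%%:*}     # C: port for exchanging data with the Leader
  election_port=${ports#*:}  # D: leader election port
  echo "id=$id host=$host peer_port=$peer_port election_port=$election_port"
}

parse_server_line "server.2=edu-zk-02:2882:3882"
```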

7. Create the myid file under dataDir=/home/wusc/zookeeper/node-0X/data

Edit the myid file on each machine and enter the number assigned to that machine (the A in its server.A line). On node-01 the content of myid is 1, on node-02 it is 2, and on node-03 it is 3.

$ vi /home/wusc/zookeeper/node-01/data/myid    ## value: 1

$ vi /home/wusc/zookeeper/node-02/data/myid    ## value: 2

$ vi /home/wusc/zookeeper/node-03/data/myid    ## value: 3
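As a non-interactive alternative to vi, the file can also be written with a redirect. The sketch below assumes the directory layout used in this article; write_myid is a hypothetical helper, and its second argument exists only so the example can be tried in a scratch directory instead of the real /home/wusc/zookeeper.

```shell
# write_myid is a hypothetical helper for illustration.
# $1: node number (1, 2, or 3)
# $2: base directory (defaults to the path used in this article)
write_myid() {
  node_id=$1
  base=${2:-/home/wusc/zookeeper}
  data_dir="$base/node-0$node_id/data"
  mkdir -p "$data_dir"
  printf '%s\n' "$node_id" > "$data_dir/myid"
}

# Try it in a scratch directory; on a real cluster each server
# would only write its own myid file.
scratch=$(mktemp -d)
for id in 1 2 3; do
  write_myid "$id" "$scratch"
done
cat "$scratch/node-02/data/myid"
rm -rf "$scratch"
```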


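8. Open the required ports 218X, 288X, and 388X in the firewall

Switch to the root user and run the following commands:

# chkconfig iptables on

# service iptables start

Edit /etc/sysconfig/iptables:

# vi /etc/sysconfig/iptables

On server 01, for example, add the following lines: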

## zookeeper

-A INPUT -m state --state NEW -m tcp -p tcp --dport 2181 -j ACCEPT 

-A INPUT -m state --state NEW -m tcp -p tcp --dport 2881 -j ACCEPT 

-A INPUT -m state --state NEW -m tcp -p tcp --dport 3881 -j ACCEPT
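Servers 02 and 03 need the same three rules with their own port numbers. As a sketch, the lines for any node can be printed from the 218X/288X/388X pattern; zk_iptables_rules is a hypothetical helper name, and it only prints text rather than touching the firewall.

```shell
# zk_iptables_rules is a hypothetical helper for illustration.
# It prints the three rule lines for node number $1, following the
# 218X / 288X / 388X port scheme used in this article.
zk_iptables_rules() {
  node=$1
  for port in "218$node" "288$node" "388$node"; do
    echo "-A INPUT -m state --state NEW -m tcp -p tcp --dport $port -j ACCEPT"
  done
}

# Rules for server 02.
zk_iptables_rules 2
```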

Restart the firewall:

# service iptables restart

Check the firewall port status:

# service iptables status

9. Start and test ZooKeeper (start it as the wusc user, not as root):

(1) As the wusc user, run the start script from /home/wusc/zookeeper/node-0X/bin, each command on its own server:

$ /home/wusc/zookeeper/node-01/bin/zkServer.sh start

$ /home/wusc/zookeeper/node-02/bin/zkServer.sh start 

$ /home/wusc/zookeeper/node-03/bin/zkServer.sh start

(2) Run the jps command to check the process:

$ jps

1456 QuorumPeerMain

QuorumPeerMain is the ZooKeeper process; seeing it means the server started normally.

(3) Check the status:

$ /home/wusc/zookeeper/node-01/bin/zkServer.sh status

(4) View the ZooKeeper service output:

The service output is written to /home/wusc/zookeeper/node-0X/bin/zookeeper.out:

$ tail -500f zookeeper.out

10. Stop the ZooKeeper process:

$ zkServer.sh stop

11. Configure ZooKeeper to start at boot as the wusc user:

Edit the /etc/rc.local file on node-01, node-02, and node-03, adding the matching line on each:

su - wusc -c '/home/wusc/zookeeper/node-01/bin/zkServer.sh start' 

su - wusc -c '/home/wusc/zookeeper/node-02/bin/zkServer.sh start' 

su - wusc -c '/home/wusc/zookeeper/node-03/bin/zkServer.sh start'

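II. Installing the Dubbo admin console (mainly configuring the connection to the cluster)

The Dubbo admin console manages the service providers and consumers registered with the ZooKeeper registry. Whether the console is running has no effect on the Dubbo services themselves, and the console does not need to be highly available, so it can be deployed on a single node.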

IP: 192.168.1.81

Container: Tomcat7

Port: 8080

1. Download (or upload) the latest Tomcat7 release (apache-tomcat-7.0.57.tar.gz) to /home/wusc/

2. Extract it:

$ tar -zxvf apache-tomcat-7.0.57.tar.gz

$ mv apache-tomcat-7.0.57 dubbo-admin-tomcat

3. Remove everything under /home/wusc/dubbo-admin-tomcat/webapps:

$ cd /home/wusc/dubbo-admin-tomcat/webapps

$ rm -rf *

4. Upload the Dubbo admin console application dubbo-admin-2.5.3.war to /home/wusc/dubbo-admin-tomcat/webapps

5. Extract it into a directory named ROOT:

$ unzip dubbo-admin-2.5.3.war -d ROOT

Move dubbo-admin-2.5.3.war to the /home/wusc/tools directory as a backup:

$ mv dubbo-admin-2.5.3.war /home/wusc/tools

6. Configure dubbo.properties (in the ROOT/WEB-INF directory):

# vi dubbo.properties
# cat dubbo.properties | grep re
dubbo.registry.address=zookeeper://192.168.1.81:2181?backup=192.168.1.82:2182,192.168.1.83:2183
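The registry address names one primary server and lists the others under backup. As an illustration, the sketch below flattens such an address into a comma-separated host:port list covering all three nodes. registry_to_connect_string is a hypothetical helper, and the sketch assumes backup is the only query parameter, as in the line above.

```shell
# registry_to_connect_string is a hypothetical helper for illustration.
# It assumes the address has the form zookeeper://host:port?backup=h2:p2,h3:p3
# with backup as the only query parameter.
registry_to_connect_string() {
  url=$1
  rest=${url#zookeeper://}   # drop the scheme
  primary=${rest%%\?*}       # host:port before the query string
  case $rest in
    *\?backup=*)
      backup=${rest#*\?backup=}
      echo "$primary,$backup"
      ;;
    *)
      echo "$primary"
      ;;
  esac
}

registry_to_connect_string "zookeeper://192.168.1.81:2181?backup=192.168.1.82:2182,192.168.1.83:2183"
```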


III. Testing cluster high availability

Start edu-zk-03; it joins as a follower:

# /home/wusc/zookeeper/node-03/bin/zkServer.sh start
JMX enabled by default
Using config: /home/wusc/zookeeper/node-03/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
# /home/wusc/zookeeper/node-03/bin/zkServer.sh status
JMX enabled by default
Using config: /home/wusc/zookeeper/node-03/bin/../conf/zoo.cfg
Mode: follower


Next, stop edu-zk-02, which is currently the leader:

# /home/wusc/zookeeper/node-02/bin/zkServer.sh stop
JMX enabled by default
Using config: /home/wusc/zookeeper/node-02/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED


Then check the status of edu-zk-03 again:

# /home/wusc/zookeeper/node-03/bin/zkServer.sh status
JMX enabled by default
Using config: /home/wusc/zookeeper/node-03/bin/../conf/zoo.cfg
Mode: leader

edu-zk-03 has now become the leader.

Provider console output:

2016-04-15 16:53:15,516  WARN [ClientCnxn.java:1089] : Session 0x15418fd25210000 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2016-04-15 16:53:15,517 DEBUG [ClientCnxnSocketNIO.java:192] : Ignoring exception during shutdown input
java.nio.channels.ClosedChannelException
at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:798)
at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:426)
at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(ClientCnxnSocketNIO.java:189)
at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1157)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1097)
2016-04-15 16:53:15,518 DEBUG [ClientCnxnSocketNIO.java:199] : Ignoring exception during shutdown output
java.nio.channels.ClosedChannelException
at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:815)
at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:434)
at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(ClientCnxnSocketNIO.java:196)
at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1157)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1097)
2016-04-15 16:53:16,506  INFO [ClientCnxn.java:966] : Opening socket connection to server 192.168.1.83/192.168.1.83:2183. Will not attempt to authenticate using SASL (unknown error)
2016-04-15 16:53:16,507  INFO [ClientCnxn.java:849] : Socket connection established to 192.168.1.83/192.168.1.83:2183, initiating session
2016-04-15 16:53:16,510 DEBUG [ClientCnxn.java:889] : Session establishment request sent on 192.168.1.83/192.168.1.83:2183
2016-04-15 16:53:16,515  INFO [ClientCnxn.java:1207] : Session establishment complete on server 192.168.1.83/192.168.1.83:2183, sessionid = 0x15418fd25210000, negotiated timeout = 30000
2016-04-15 16:53:16,515 DEBUG [ZkClient.java:351] : Received event: WatchedEvent state:SyncConnected type:None path:null
2016-04-15 16:53:16,516  INFO [ZkClient.java:449] : zookeeper state changed (SyncConnected)
2016-04-15 16:53:16,516 DEBUG [ZkEventThread.java:88] : New event: ZkEvent[State changed to SyncConnected sent to com.alib[email protected]20f8f517]
2016-04-15 16:53:16,516 DEBUG [ZkClient.java:395] : Leaving process event
2016-04-15 16:53:16,516 DEBUG [ZkEventThread.java:69] : Delivering event #2 ZkEvent[State changed to SyncConnected sent to com.alib[email protected]20f8f517]
2016-04-15 16:53:16,517 DEBUG [ZkEventThread.java:79] : Delivering event #2 done

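Next, shut down the edu-zk-01 node as well.

The provider console now reports errors, because fewer than half of the cluster nodes are alive. The Dubbo admin console keeps running at this point, but the cluster is unusable and the provider services it shows come from its cache. If you restart the Dubbo admin console, it fails to start; as soon as one more node (edu-zk-01 or edu-zk-02) is started, the admin console can run again right away.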

java.net.SocketException: Connection reset by peer: shutdown
at sun.nio.ch.Net.shutdown(Native Method)
at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:819)
at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:434)
at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(ClientCnxnSocketNIO.java:196)
at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1157)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1097)
2016-04-15 16:58:09,848  INFO [ClientCnxn.java:966] : Opening socket connection to server 192.168.1.81/192.168.1.81:2181. Will not attempt to authenticate using SASL (unknown error)
2016-04-15 16:58:10,848  WARN [ClientCnxn.java:1089] : Session 0x15418fd25210000 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2016-04-15 16:58:10,848 DEBUG [ClientCnxnSocketNIO.java:192] : Ignoring exception during shutdown input
java.nio.channels.ClosedChannelException
at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:798)
at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:426)
at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(ClientCnxnSocketNIO.java:189)
at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1157)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1097)
2016-04-15 16:58:10,849 DEBUG [ClientCnxnSocketNIO.java:199] : Ignoring exception during shutdown output
java.nio.channels.ClosedChannelException
at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:815)
at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:434)
at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(ClientCnxnSocketNIO.java:196)
at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1157)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1097)
2016-04-15 16:58:11,552  INFO [ClientCnxn.java:966] : Opening socket connection to server 192.168.1.82/192.168.1.82:2182. Will not attempt to authenticate using SASL (unknown error)
2016-04-15 16:58:12,553  WARN [ClientCnxn.java:1089] : Session 0x15418fd25210000 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2016-04-15 16:58:12,554 DEBUG [ClientCnxnSocketNIO.java:192] : Ignoring exception during shutdown input
java.nio.channels.ClosedChannelException
at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:798)
at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:426)
at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(ClientCnxnSocketNIO.java:189)
at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1157)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1097)
2016-04-15 16:58:12,555 DEBUG [ClientCnxnSocketNIO.java:199] : Ignoring exception during shutdown output
java.nio.channels.ClosedChannelException
at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:815)
at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:434)
at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(ClientCnxnSocketNIO.java:196)
at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1157)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1097)
2016-04-15 16:58:13,909  INFO [ClientCnxn.java:966] : Opening socket connection to server 192.168.1.83/192.168.1.83:2183. Will not attempt to authenticate using SASL (unknown error)
2016-04-15 16:58:13,910  INFO [ClientCnxn.java:849] : Socket connection established to 192.168.1.83/192.168.1.83:2183, initiating session
2016-04-15 16:58:13,911 DEBUG [ClientCnxn.java:889] : Session establishment request sent on 192.168.1.83/192.168.1.83:2183
2016-04-15 16:58:13,914  INFO [ClientCnxn.java:1085] : Unable to read additional data from server sessionid 0x15418fd25210000, likely server has closed socket, closing socket connection and attempting reconnect
2016-04-15 16:58:13,915 DEBUG [ClientCnxnSocketNIO.java:199] : Ignoring exception during shutdown output


IV. Upgrading or migrating the registry


