Linux - Basic Deployment of a High-Availability Cluster
Deploying the High-Availability Cluster
Preparing the lab environment:
Prepare three RHEL 6.5 virtual machines, use the physical host for testing, and set up name resolution between them.
Name resolution
[root@server1 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.25.50.10 server1.example.com
172.25.50.20 server2.example.com
172.25.50.30 server3.example.com
172.25.50.250 real50.example.com
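A quick sanity check that the names resolve on every machine (getent reads /etc/hosts, so each host should print the expected address):
getent hosts server2.example.com    # should print 172.25.50.20 server2.example.com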
Yum repository configuration
[root@server1 ~]# cat /etc/yum.repos.d/redhat6.repo
[Server]
name=rhel6.5 Server
baseurl=http://172.25.50.250/rhel6.5
gpgcheck=0
[HighAvailability]
name=rhel6.5 HighAvailability
baseurl=http://172.25.50.250/rhel6.5/HighAvailability
gpgcheck=0
[LoadBalancer]
name=rhel6.5 LoadBalancer
baseurl=http://172.25.50.250/rhel6.5/LoadBalancer
gpgcheck=0
[ResilientStorage]
name=rhel6.5 ResilientStorage
baseurl=http://172.25.50.250/rhel6.5/ResilientStorage
gpgcheck=0
[ScalableFileSystem]
name=rhel6.5 ScalableFileSystem
baseurl=http://172.25.50.250/rhel6.5/ScalableFileSystem
gpgcheck=0
[root@server1 ~]# yum repolist
Loaded plugins: product-id, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
HighAvailability | 3.9 kB 00:00
HighAvailability/primary_db | 43 kB 00:00
LoadBalancer | 3.9 kB 00:00
LoadBalancer/primary_db | 7.0 kB 00:00
ResilientStorage | 3.9 kB 00:00
ResilientStorage/primary_db | 47 kB 00:00
ScalableFileSystem | 3.9 kB 00:00
ScalableFileSystem/primary_db | 6.8 kB 00:00
Server | 3.9 kB 00:00
Server/primary_db | 3.1 MB 00:00
repo id repo name status
HighAvailability rhel6.5 HighAvailability 56
LoadBalancer rhel6.5 LoadBalancer 4
ResilientStorage rhel6.5 ResilientStorage 62
ScalableFileSystem rhel6.5 ScalableFileSystem 7
Server rhel6.5 Server 3,690
repolist: 3,819
All three RHEL 6.5 virtual machines get the same configuration as above.
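To avoid editing every machine by hand, the two files above can simply be copied from server1 to the other nodes once ssh works (a sketch; repeat for server3.example.com):
scp /etc/hosts root@server2.example.com:/etc/hosts
scp /etc/yum.repos.d/redhat6.repo root@server2.example.com:/etc/yum.repos.d/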
Installing the software
On server1
[root@server1 yum.repos.d]# yum install ricci -y
[root@server1 yum.repos.d]# passwd ricci
Changing password for user ricci.
New password: #westos
BAD PASSWORD: it is based on a dictionary word
BAD PASSWORD: is too simple
Retype new password: #westos
passwd: all authentication tokens updated successfully.
Start the service:
[root@server1 yum.repos.d]# /etc/init.d/ricci start
Starting system message bus: [ OK ]
Starting oddjobd: [ OK ]
generating SSL certificates... done
Generating NSS database... done
Starting ricci:                                            [  OK  ]
[root@server1 yum.repos.d]#
Broadcast message from root@server1.example.com
(unknown) at 14:48 ...
The system is going down for reboot NOW!
Connection to 172.25.50.10 closed by remote host.
Connection to 172.25.50.10 closed.
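The reboot broadcast above is triggered later, when the cluster is created in luci (luci reboots the nodes as they join). To make sure ricci comes back up after that reboot, enable it at boot on both nodes (standard RHEL 6 service management):
chkconfig ricci on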
On server2
[root@server2 yum.repos.d]# yum install ricci -y
[root@server2 yum.repos.d]# passwd ricci
Changing password for user ricci.
New password: #westos
BAD PASSWORD: it is based on a dictionary word
BAD PASSWORD: is too simple
Retype new password: #westos
passwd: all authentication tokens updated successfully.
Start the service:
[root@server2 yum.repos.d]# /etc/init.d/ricci start
Starting system message bus: [ OK ]
Starting oddjobd: [ OK ]
generating SSL certificates... done
Generating NSS database... done
Starting ricci:                                            [  OK  ]
[root@server2 yum.repos.d]#
Broadcast message from root@server2.example.com
(unknown) at 14:48 ...
The system is going down for reboot NOW!
Connection to 172.25.50.20 closed by remote host.
Connection to 172.25.50.20 closed.
On server3
Install the luci package
[root@server3 ~]# yum install luci -y
[root@server3 ~]# /etc/init.d/luci start
Adding following auto-detected host IDs (IP addresses/domain names), corresponding to `server3.example.com' address, to the configuration of self-managed certificate `/var/lib/luci/etc/cacert.config' (you can change them by editing `/var/lib/luci/etc/cacert.config', removing the generated certificate `/var/lib/luci/certs/host.pem' and restarting luci):
(none suitable found, you can still do it manually as mentioned above)
Generating a 2048 bit RSA private key
writing new private key to '/var/lib/luci/certs/host.pem'
Starting saslauthd: [ OK ]
Start luci...                                              [  OK  ]
Point your web browser to https://server3.example.com:8084 (or equivalent) to access luci
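luci can likewise be enabled at boot on server3 so the web interface survives a reboot (optional):
chkconfig luci on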
In a browser on the physical host:
https://server3.example.com:8084
Log in with the root account and password of the host running luci --> Create --> enter the information for each cluster node.
The node password asked for here is the ricci user's password (westos).
The options are filled in as shown in the figure.
Glossary: cman -- the cluster manager
          rgmanager -- the resource group manager
          fence -- the power-fencing device (power control)
          corosync -- the cluster membership and messaging layer
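Once the cluster has been created and the nodes have come back from their reboot, membership can also be verified from the command line on either node (clustat ships with rgmanager and cman_tool with cman, both of which are pulled in during cluster creation if the download-packages option was chosen):
clustat
cman_tool status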
Click Fence Devices
then Add --> Fence virt (Multicast Mode) --> Name: vmfence --> Submit
In the web interface select Nodes --> select server1.example.com --> Add Fence Method --> Method Name: fence-1 --> look up the corresponding UUID and enter it in the first field    ## a UUID is used here because the physical host cannot resolve the guests' hostnames
In the web interface select Nodes --> select server2.example.com --> Add Fence Method --> Method Name: fence-2 --> look up the corresponding UUID and enter it in the first field    ## a UUID is used here because the physical host cannot resolve the guests' hostnames
The UUIDs to fill in (list them with: virsh list --uuid; they appear in the same order as the virtual machines in virt-manager)
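To see which UUID belongs to which guest, the domain names and UUIDs can be listed on the physical host with virsh ('server1' below stands for the libvirt domain name, which may differ from the guest's hostname):
virsh list --all --name
virsh domuuid server1    # hypothetical domain name; repeat for the second node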
Install the following packages on the physical host (the install command is shown after the list):
fence-virtd-multicast-0.3.2-1.el7.x86_64
fence-virtd-0.3.2-2.el7.x86_64
fence-virtd-libvirt-0.3.2-2.el7.x86_64
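On a RHEL 7 physical host these can be installed with yum (package names as listed above; exact versions will differ):
yum install -y fence-virtd fence-virtd-multicast fence-virtd-libvirt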
On the physical host
# mkdir /etc/cluster
[root@real50 cluster]# fence_virtd -c
Module search path [/usr/lib64/fence-virt]:
Available backends:
libvirt 0.1
Available listeners:
multicast 1.2
serial 0.4
Listener modules are responsible for accepting requests
from fencing clients.
Listener module [multicast]:
The multicast listener module is designed for use environments
where the guests and hosts may communicate over a network using
multicast.
The multicast address is the address that a client will use to
send fencing requests to fence_virtd.
Multicast IP Address [225.0.0.12]:
Using ipv4 as family.
Multicast IP Port [1229]:
Setting a preferred interface causes fence_virtd to listen only
on that interface. Normally, it listens on all interfaces.
In environments where the virtual machines are using the host
machine as a gateway, this *must* be set (typically to virbr0).
Set to 'none' for no interface.
Interface [br0]:    ## if the default offered here is not br0, enter br0 (the bridge the guests are attached to)
The key file is the shared key information which is used to
authenticate fencing requests. The contents of this file must
be distributed to each physical host and virtual machine within
a cluster.
Key File [/etc/cluster/fence_xvm.key]:
Backend modules are responsible for routing requests to
the appropriate hypervisor or management layer.
Backend module [libvirt]:
Configuration complete.
=== Begin Configuration ===
fence_virtd {
listener = "multicast";
backend = "libvirt";
module_path = "/usr/lib64/fence-virt";
}
listeners {
multicast {
key_file = "/etc/cluster/fence_xvm.key";
address = "225.0.0.12";
interface = "br0";
family = "ipv4";
port = "1229";
}
}
backends {
libvirt {
uri = "qemu:///system";
}
}
=== End Configuration ===
Replace /etc/fence_virt.conf with the above [y/N]? y
[root@real50 etc]# dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=128 count=1
1+0 records in
1+0 records out
128 bytes (128 B) copied, 0.000185659 s, 689 kB/s
# scp fence_xvm.key root@server1.example.com:/etc/cluster/
# scp fence_xvm.key root@server2.example.com:/etc/cluster/
If the /etc/cluster directory does not exist on server1 or server2, create it first.
Finally restart the service: systemctl restart fence_virtd
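To avoid having to start fence_virtd by hand after every reboot of the physical host, and to make sure the guests can reach the multicast port configured above, the service can also be enabled and port 1229/udp opened (a sketch for a systemd/firewalld host; skip the firewall lines if no firewall is running):
systemctl enable fence_virtd
firewall-cmd --permanent --add-port=1229/udp
firewall-cmd --reload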
Test:
On server1 run: fence_node server2.example.com    ## the domain name must be used here
or
On server2 run: fence_node server1.example.com    ## the domain name must be used here
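Before fencing a live node it is worth confirming that the key and the multicast path work end to end; if the fence-virt package is present on a node, the following should print the guests known to fence_virtd on the physical host:
fence_xvm -o list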
Configuring the High-Availability Cluster
# After the physical host is rebooted, its fence_virtd service must be started again first:
systemctl start fence_virtd
In the web interface select the Failover Domains tab
Click Add --> fill in the name (webfile is used below) --> tick all the checkboxes --> Create
Set a priority for each node; the lower the number, the higher the priority.
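These choices end up in /etc/cluster/cluster.conf on the nodes. A rough sketch of the resulting block, assuming the domain is named webfile and server1 is given the higher priority (the exact attributes depend on which boxes were ticked in luci):
<failoverdomains>
    <failoverdomain name="webfile" ordered="1" restricted="1" nofailback="0">
        <failoverdomainnode name="server1.example.com" priority="1"/>
        <failoverdomainnode name="server2.example.com" priority="2"/>
    </failoverdomain>
</failoverdomains>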
Then select the Resources tab --> Add --> IP Address --> virtual IP: 172.25.50.100 --> Netmask Bits: 24 --> tick Monitor Link --> the last field (the delay, in seconds): 5 --> Submit
Select Resources --> Add --> Script --> Name: httpd --> Full Path to Script File: /etc/init.d/httpd --> Submit
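The Script resource only points at the init script; httpd itself must already be installed on both nodes, and it should not be enabled at boot, because rgmanager starts and stops it as part of the service (run on server1 and server2):
yum install -y httpd
chkconfig httpd off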
Select Service Groups --> Add --> Name: apache --> tick all the checkboxes --> Failover Domain: webfile --> Recovery Policy: Relocate --> Add Resource --> add the IP Address first, then the Script --> Submit    ## Run Exclusive -- run exclusively (only this service may run on the node)
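Besides the failure tests below, the service can also be moved by hand with clusvcadm (shipped with rgmanager), which is a quick way to confirm that both nodes are able to run it:
clusvcadm -r apache -m server2.example.com    # relocate the apache service to server2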
Test:
[root@server1 cluster]# clustat
Cluster Status for lyitx @ Wed Feb 15 17:08:17 2017
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
server1.example.com 1 Online, Local, rgmanager
server2.example.com 2 Online, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:apache server1.example.com started
Test 1:
[root@server1 cluster]# /etc/init.d/httpd stop
Stopping httpd:                                            [  OK  ]
Then check with the clustat command on server2:
[root@server2 ~]# clustat
Cluster Status for lyitx @ Wed Feb 15 17:09:04 2017
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
server1.example.com 1 Online, rgmanager
server2.example.com 2 Online, Local, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:apache server1.example.com started
[root@server2 ~]# clustat
Cluster Status for lyitx @ Wed Feb 15 17:09:10 2017
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
server1.example.com 1 Online, rgmanager
server2.example.com 2 Online, Local, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:apache none recovering
[root@server2 ~]# clustat
Cluster Status for lyitx @ Wed Feb 15 17:09:11 2017
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
server1.example.com 1 Online, rgmanager
server2.example.com 2 Online, Local, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:apache server2.example.com starting
[root@server2 ~]# clustat
Cluster Status for lyitx @ Wed Feb 15 17:09:12 2017
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
server1.example.com 1 Online, rgmanager
server2.example.com 2 Online, Local, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:apache server2.example.com starting
Watching the output, the service fails over from server1 to server2.
Test 2:
On server2: ip link set down eth0    # take the network interface down
Now on server1 (server2 gets fenced and rebooted by fence_virtd, so it first shows up as Offline and then rejoins as Online):
[root@server1 cluster]# clustat
Cluster Status for lyitx @ Wed Feb 15 17:21:05 2017
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
server1.example.com 1 Online, Local, rgmanager
server2.example.com 2 Offline
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:apache server1.example.com started
[root@server1 cluster]# clustat
Cluster Status for lyitx @ Wed Feb 15 17:21:11 2017
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
server1.example.com 1 Online, Local, rgmanager
server2.example.com 2 Online
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:apache server1.example.com started
Test 3:
[root@server1 cluster]# echo c > /proc/sysrq-trigger    # crash the kernel on server1
Check on server2:
[root@server2 ~]# clustat
Cluster Status for lyitx @ Wed Feb 15 17:30:31 2017
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
server1.example.com 1 Offline
server2.example.com 2 Online, Local, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:apache server1.example.com started
[root@server2 ~]# clustat
Cluster Status for lyitx @ Wed Feb 15 17:30:34 2017
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
server1.example.com 1 Offline
server2.example.com 2 Online, Local, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:apache server2.example.com starting
Reposted from: https://blog.51cto.com/12150355/1905719