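This article is from the “朱超博” blog. Please keep this attribution: http://zhuchaobo.blog.51cto.com/4393935/885665
Implementing MySQL high availability with drbd+corosync+pacemaker (Part 2)
16. Define the cluster services and resources (node1)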
[root@node1 ~]# drbdadm primary mysql
[root@node2 ~]# drbdadm secondary mysql
1. Check the current cluster configuration and make sure the global properties suitable for a two-node cluster are already set:
[root@node1 ~]# crm configure show
node node1.abc.com
node node2.abc.com
property $id="cib-bootstrap-options" \
dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false" \
no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
resource-stickiness="100"
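2. Define the already configured DRBD device /dev/drbd0 as a cluster service. The cluster will start and stop drbd from now on, so first stop the drbd service on both nodes and disable it at boot:
[root@node1 ~]# service drbd stop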
[root@node1 ~]# chkconfig drbd off
[root@node1 ~]# ssh node2.abc.com "service drbd stop"
[root@node1 ~]# ssh node2.abc.com "chkconfig drbd off"
[root@node1 ~]# drbd-overview
drbd not loaded
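3. Configure drbd as a cluster resource:
The RA for drbd is currently classified under the OCF provider linbit, and its path is /usr/lib/ocf/resource.d/linbit/drbd. The following commands show this RA and its metadata: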
node1:
[root@node1 ~]# drbd-overview
drbd not loaded
[root@node1 ~]# crm ra classes
heartbeat
lsb
ocf / heartbeat linbit pacemaker
stonith
[root@node1 ~]# crm ra list ocf linbit
drbd
4. View the details of the drbd resource agent:
node1:
[root@node1 ~]# crm ra info ocf:linbit:drbd
This resource agent manages a DRBD resource
as a master/slave resource. DRBD is a shared-nothing replicated storage
device. (ocf:linbit:drbd)
Master/Slave OCF Resource Agent for DRBD
Parameters (* denotes required, [] the default):
drbd_resource* (string): drbd resource name
The name of the drbd resource from the drbd.conf file.
drbdconf (string, [/etc/drbd.conf]): Path to drbd.conf
Full path to the drbd.conf file.
Operations' defaults (advisory minimum):
start timeout=240
promote timeout=90
demote timeout=90
notify timeout=90
stop timeout=100
monitor_Slave interval=20 timeout=20 start-delay=1m
monitor_Master interval=10 timeout=20 start-delay=1m
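5. drbd must run on both nodes at the same time, but only one node can be Master (primary/secondary model) while the other is Slave. It is therefore a special kind of cluster resource: a multi-state clone, in which the nodes are distinguished as Master and Slave, and both nodes must be in the Slave state when the service first starts.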
[root@node1 ~]# crm
crm(live)# configure
crm(live)configure# primitive mysqldrbd ocf:heartbeat:drbd params drbd_resource="mysql" op monitor role="Master" interval="30s" op monitor role="Slave" interval="31s" op start timeout="240s" op stop timeout="100s"
crm(live)configure# ms MS_mysqldrbd mysqldrbd meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify="true"
crm(live)configure# show mysqldrbd
primitive mysqldrbd ocf:heartbeat:drbd \
params drbd_resource="mysql" \
op monitor interval="30s" role="Master" \
op monitor interval="31s" role="Slave" \
op start interval="0" timeout="240s" \
op stop interval="0" timeout="100s"
crm(live)configure# show MS_mysqldrbd
ms MS_mysqldrbd mysqldrbd \
meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
Once everything looks correct, commit:
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# exit
bye
6. Check the current cluster status:
[root@node1 ~]# crm status
============
Last updated: Tue Apr 3 16:09:32 2012
Stack: openais
Current DC: node1.abc.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
1 Resources configured.
============
Online: [ node1.abc.com node2.abc.com ]
Master/Slave Set: MS_mysqldrbd [mysqldrbd]
Masters: [ node1.abc.com ]
Slaves: [ node2.abc.com ]
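The output above shows that the Primary node of the drbd service is currently node1.abc.com and the Secondary node is node2.abc.com. You can also run the following command on node1 to verify that this host has become the Primary node for the mysql resource: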
[root@node1 ~]# drbdadm role mysql
Primary/Secondary
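The check above can be scripted. The following is a minimal sketch, assuming the `local/peer` output format shown in the transcript; the function name `is_drbd_primary` is hypothetical and not part of DRBD.

```shell
#!/bin/sh
# is_drbd_primary ROLE_OUTPUT
# Hypothetical helper: succeeds when the local role (the part before the
# slash in "local/peer") is Primary.
is_drbd_primary() {
    case "$1" in
        Primary/*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Sample value copied from the transcript above. On a live node one would
# pass "$(drbdadm role mysql)" instead.
if is_drbd_primary "Primary/Secondary"; then
    echo "local node is Primary"
else
    echo "local node is not Primary"
fi
```

Passing the role string as an argument keeps the helper usable on saved output as well as on a running node.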
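Next, have the cluster mount the drbd device automatically on the /mysqldata directory. This mount resource must run on the Master node of the drbd service, and it can start only after drbd has promoted that node to Primary.
Make sure the device is unmounted on both nodes: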
[root@node1 ~]# umount /dev/drbd0
[root@node2 ~]# umount /dev/drbd0
The following steps are again performed on node1:
[root@node1 ~]# crm
crm(live)# configure
crm(live)configure# primitive MysqlFS ocf:heartbeat:Filesystem params device="/dev/drbd0" directory="/mysqldata" fstype="ext3" op start timeout=60s op stop timeout=60s
crm(live)configure# commit
crm(live)configure# exit
bye
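7. Define the mysql resources (on node1)
First create an IP address resource for the mysql cluster. The cluster serves requests on this address, so it is the IP address that clients use to reach the mysql server: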
[root@node1 ~]# crm configure primitive myip ocf:heartbeat:IPaddr params ip=192.168.1.50
Configure the mysqld service as a highly available resource:
[root@node1 ~]# crm configure primitive mysqlserver lsb:mysqld
[root@node1 ~]# crm status
============
Last updated: Tue Apr 3 16:27:13 2012
Stack: openais
Current DC: node1.abc.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
4 Resources configured.
============
Online: [ node1.abc.com node2.abc.com ]
Master/Slave Set: MS_mysqldrbd [mysqldrbd]
Masters: [ node1.abc.com ]
Slaves: [ node2.abc.com ]
MysqlFS (ocf::heartbeat:Filesystem): Started node1.abc.com
myip (ocf::heartbeat:IPaddr): Started node2.abc.com
mysqlserver (lsb:mysqld): Started node1.abc.com
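8. Configure the resource constraints:
The cluster now has all the resources it needs, but it may not yet handle them correctly. Resource constraints specify which cluster nodes a resource may run on, the order in which resources are started, and which other resources a given resource depends on. pacemaker provides three kinds of resource constraints:
1) Resource Location: defines the nodes on which a resource can, cannot, or should preferably run
2) Resource Colocation: defines whether cluster resources may or may not run together on the same node
3) Resource Order: defines the order in which cluster resources start on a node.
When defining constraints you also need to specify scores. Scores of all kinds are central to how the cluster works: everything from migrating a resource to deciding which resources to stop in a degraded cluster is achieved by modifying scores in some way. Scores are calculated per resource, and any node with a negative score for a resource cannot run that resource. After calculating the scores for a resource, the cluster picks the node with the highest score. INFINITY is currently defined as 1,000,000. Adding and subtracting infinity follows these 3 basic rules:
1) any value + INFINITY = INFINITY
2) any value - INFINITY = -INFINITY
3) INFINITY - INFINITY = -INFINITY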
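The three rules can be worked through in a few lines of shell. This is an illustrative sketch of the arithmetic only, not pacemaker's own code; the helper names `to_num` and `score_add` are hypothetical.

```shell
#!/bin/sh
# INFINITY as stated in the text above.
INF=1000000

# to_num SCORE: map the symbolic forms to numbers, pass integers through.
to_num() {
    case "$1" in
        INFINITY|+INFINITY) printf '%s\n' "$INF" ;;
        -INFINITY)          printf '%s\n' "-$INF" ;;
        *)                  printf '%s\n' "$1" ;;
    esac
}

# score_add A B: add two scores following the three rules.
# -INFINITY wins over everything, then +INFINITY, then the plain sum.
score_add() {
    a=$(to_num "$1")
    b=$(to_num "$2")
    if [ "$a" -le "-$INF" ] || [ "$b" -le "-$INF" ]; then
        printf '%s\n' "-INFINITY"
    elif [ "$a" -ge "$INF" ] || [ "$b" -ge "$INF" ]; then
        printf '%s\n' "INFINITY"
    else
        sum=$((a + b))
        # Clamp ordinary sums that reach the INFINITY bounds.
        if [ "$sum" -ge "$INF" ]; then
            printf '%s\n' "INFINITY"
        elif [ "$sum" -le "-$INF" ]; then
            printf '%s\n' "-INFINITY"
        else
            printf '%s\n' "$sum"
        fi
    fi
}

score_add 100 INFINITY          # rule 1
score_add 100 -INFINITY         # rule 2
score_add INFINITY -INFINITY    # rule 3
score_add 100 50                # ordinary scores simply add up
```

Subtraction is expressed here as adding a negative score, so rule 3 is `score_add INFINITY -INFINITY`.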
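When defining a resource constraint you can also give each constraint its own score. The score is the value assigned to that constraint. Constraints with higher scores are applied before those with lower scores. By creating several location constraints with different scores for a given resource, you can specify the order of the nodes that the resource fails over to.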
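As a sketch of the last point, the snippet below prints one `crm configure location` command per node instead of running it, so the result can be reviewed first. The helper `location_cmds`, the constraint ids, and the scores 200 and 100 are assumptions made for illustration; they are not part of the configuration used in this article.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) one location constraint per node.
# Arguments: resource name, then "node score" pairs in order of preference.
location_cmds() {
    rsc=$1
    shift
    while [ "$#" -ge 2 ]; do
        node=$1
        score=$2
        shift 2
        # The constraint id is built from the resource and the short host name.
        printf 'crm configure location %s_on_%s %s %s: %s\n' \
            "$rsc" "${node%%.*}" "$rsc" "$score" "$node"
    done
}

# node1 gets the higher score (200), node2 the lower one (100).
location_cmds MysqlFS node1.abc.com 200 node2.abc.com 100
```

How these preferences interact with the resource-stickiness value already set in this cluster depends on the total scores, so test any such constraint before relying on it.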
We need to define the following constraints:
[root@node1 ~]# crm
crm(live)# configure
crm(live)configure# colocation MysqlFS_with_mysqldrbd inf: MysqlFS MS_mysqldrbd:Master myip mysqlserver
crm(live)configure# order MysqlFS_after_mysqldrbd inf: MS_mysqldrbd:promote MysqlFS:start
crm(live)configure# order myip_after_MysqlFS mandatory: MysqlFS myip
crm(live)configure# order mysqlserver_after_myip mandatory: myip mysqlserver
Check for errors:
crm(live)configure# verify
Commit:
crm(live)configure# commit
crm(live)configure# exit
View the configuration:
[root@node1 ~]# crm configure show
node node1.abc.com
node node2.abc.com
primitive MysqlFS ocf:heartbeat:Filesystem \
params device="/dev/drbd0" directory="/mysqldata" fstype="ext3" \
op start interval="0" timeout="60s" \
op stop interval="0" timeout="60s"
primitive myip ocf:heartbeat:IPaddr \
params ip="192.168.1.50"
primitive mysqldrbd ocf:heartbeat:drbd \
params drbd_resource="mysql" \
op monitor interval="30s" role="Master" \
op monitor interval="31s" role="Slave" \
op start interval="0" timeout="240s" \
op stop interval="0" timeout="100s"
primitive mysqlserver lsb:mysqld
ms MS_mysqldrbd mysqldrbd \
meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
colocation MysqlFS_with_mysqldrbd inf: MysqlFS MS_mysqldrbd:Master myip mysqlserver
order MysqlFS_after_mysqldrbd inf: MS_mysqldrbd:promote MysqlFS:start
order myip_after_MysqlFS inf: MysqlFS myip
order mysqlserver_after_myip inf: myip mysqlserver
property $id="cib-bootstrap-options" \
dc-version="1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
cluster-infrastructure="openais" \
expected-quorum-votes="2" \
stonith-enabled="false" \
no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
resource-stickiness="100"
Check the running status:
[root@node1 ~]# crm status
============
Last updated: Tue Apr 3 16:38:08 2012
Stack: openais
Current DC: node1.abc.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
4 Resources configured.
============
Online: [ node1.abc.com node2.abc.com ]
Master/Slave Set: MS_mysqldrbd [mysqldrbd]
Masters: [ node1.abc.com ]
Slaves: [ node2.abc.com ]
MysqlFS (ocf::heartbeat:Filesystem): Started node1.abc.com
myip (ocf::heartbeat:IPaddr): Started node1.abc.com
mysqlserver (lsb:mysqld): Started node1.abc.com
As shown, the services are now running normally on node1.
On node1, check the status of mysql:
[root@node1 ~]# service mysqld status
MySQL running (4624) [ OK ]
Check that the device was mounted automatically:
[root@node1 ~]# mount
/dev/sda2 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/sda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
/dev/hdc on /mnt/cdrom type iso9660 (ro)
/dev/drbd0 on /mysqldata type ext3 (rw) # mounted
List the directory:
[root@node1 ~]# ls /mysqldata/
data lost+found node1
Check the state of the VIP:
[root@node1 ~]# ifconfig
eth0 Link encap:Ethernet HWaddr 00:0C:29:81:AC:41
inet addr:192.168.1.10 Bcast:192.168.1.255 Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fe81:ac41/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:990984 errors:0 dropped:0 overruns:0 frame:0
TX packets:980573 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1250514600 (1.1 GiB) TX bytes:1164228497 (1.0 GiB)
Interrupt:67 Base address:0x2000
eth0:0 Link encap:Ethernet HWaddr 00:0C:29:81:AC:41
inet addr:192.168.1.50 Bcast:192.168.1.255 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:67 Base address:0x2000
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:675 errors:0 dropped:0 overruns:0 frame:0
TX packets:675 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:66714 (65.1 KiB) TX bytes:66714 (65.1 KiB)
Continue testing:
On node1, put node1 into standby:
[root@node1 ~]# crm node standby
Check the cluster status:
[root@node1 ~]# crm status
============
Last updated: Tue Apr 3 16:43:31 2012
Stack: openais
Current DC: node1.abc.com - partition with quorum
Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, 2 expected votes
4 Resources configured.
============
Node node1.abc.com: standby
Online: [ node2.abc.com ]
Master/Slave Set: MS_mysqldrbd [mysqldrbd]
Masters: [ node2.abc.com ]
Stopped: [ mysqldrbd:0 ]
MysqlFS (ocf::heartbeat:Filesystem): Started node2.abc.com
myip (ocf::heartbeat:IPaddr): Started node2.abc.com
mysqlserver (lsb:mysqld): Started node2.abc.com
As shown, all of our resources have switched over to node2.
Check the running status on node2:
[root@node2 ~]# service mysqld status
MySQL running (4396) [ OK ]
List the directory:
[root@node2 ~]# ll /mysqldata/
total 20
drwxr-xr-x 5 mysql mysql 4096 Apr 3 16:44 data
drwx------ 2 root root 16384 Apr 3 13:00 lost+found
-rw-r--r-- 1 root root 0 Apr 3 13:02 node1
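The failover check above can also be done by parsing the `crm status` text. The following is a minimal sketch, assuming the line layout shown in this article's transcripts (resource name first, `Started` and the node name last); `resource_node` is a hypothetical helper and the sample lines are copied from the output above.

```shell
#!/bin/sh
# resource_node NAME
# Hypothetical helper: read `crm status` text on stdin and print the node
# on which the primitive NAME is reported as Started.
resource_node() {
    awk -v rsc="$1" '$1 == rsc && $(NF-1) == "Started" { print $NF }'
}

# Sample lines copied from the status output after the standby test.
# On a live node one would pipe `crm status` into the function instead.
sample_status='MysqlFS (ocf::heartbeat:Filesystem): Started node2.abc.com
myip (ocf::heartbeat:IPaddr): Started node2.abc.com
mysqlserver (lsb:mysqld): Started node2.abc.com'

for rsc in MysqlFS myip mysqlserver; do
    node=$(printf '%s\n' "$sample_status" | resource_node "$rsc")
    echo "$rsc -> $node"
done
```

If the three resources do not all report the same node, the colocation constraint is not taking effect and the configuration is worth rechecking.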
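Everything is working now, so we can verify that the mysql service is reachable.
First, create a test user on node2 with the password 123456.
We access the mysql service through the VIP 192.168.1.50, so create an account on node2 that hosts in a given subnet are allowed to use (this data is replicated to node1 through the drbd device):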
[root@node2 ~]# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.5.15-log MySQL Community Server (GPL)
Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> grant all on *.* to test@'192.168.%.%' identified by '123456';
Query OK, 0 rows affected (0.22 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.04 sec)
mysql> exit;
Bye
Then access it from node1:
[root@node1 ~]# mysql -u test -h 192.168.1.50
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.5.15-log MySQL Community Server (GPL)
Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| test |
+--------------------+
2 rows in set (0.17 sec)
mysql>
At this point the highly available mysql cluster is up and running.