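Lab environment
Part 1: Set up master-master replication
1.1 Lab environment
Both machines already have a single MySQL instance installed.
IP: 10.192.203.201 10.192.203.202
Both instances listen on port 3307.
The two instances must use the same port number; otherwise clients cannot keep connecting to the VIP on one port after it moves to the other machine.
1.2 Procedure
1.2.1 Edit the configuration files
On master1:
Add the following under [mysqld]: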
server-id = 1
relay-log=/data/server/mysql_3307/binlog/ZabbixServer-relay-bin
relay-log-index=/data/server/mysql_3307/binlog/ZabbixServer-relay-bin.index
auto-increment-offset= 1
auto-increment-increment= 2
log-slave-updates=true
On master2:
Add the following under [mysqld]:
server-id = 3
relay-log=/data/server/mysql/binlog/single-relay-bin
relay-log-index=/data/server/mysql/binlog/single-relay-bin.index
auto-increment-offset= 2
auto-increment-increment= 2
log-slave-updates=true
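The two auto-increment settings prevent primary-key conflicts on INSERT: with an increment of 2 and offsets of 1 and 2, master1 generates odd AUTO_INCREMENT values and master2 generates even ones.
Restart MySQL after editing the configuration.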
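To make the effect of the two settings concrete, the sketch below lists the values each master would hand out on an empty table. The function name auto_increment_ids is made up for this illustration; it only reproduces the arithmetic offset + n * increment and does not talk to MySQL.

```shell
# Print the first COUNT auto-increment values for a given offset and increment.
# Hypothetical helper for illustration only; it mirrors offset + n * increment.
auto_increment_ids() {
    local offset="$1" increment="$2" count="$3" n=0
    while [ "$n" -lt "$count" ]; do
        echo $((offset + n * increment))
        n=$((n + 1))
    done
}

# master1 uses offset 1, master2 uses offset 2, both use increment 2.
echo "master1: $(auto_increment_ids 1 2 5 | tr '\n' ' ')"
echo "master2: $(auto_increment_ids 2 2 5 | tr '\n' ' ')"
```

The two sequences never overlap, which is why both masters can accept INSERTs at the same time without generating the same key.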
1.2.2 Create the replication user
Run the following on both MySQL servers:
GRANT REPLICATION SLAVE ON *.* TO 'RepUser'@'%' IDENTIFIED BY 'beijing';
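1.2.3 Point each server at its master
Both servers are newly built and have received no other writes, so each one only needs to record its own current binary log file and position, which the other server then uses as its replication starting point. Otherwise, first back up one master and restore it on the other so that the data is consistent, and only then point each server at its master.
Master1:
mysql> show master status;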
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 |      302 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
Master2:
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 |      120 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
# Point Master1 at Master2
CHANGE MASTER TO MASTER_USER='RepUser',MASTER_HOST='10.192.203.202',MASTER_PASSWORD='beijing',MASTER_PORT=3307,MASTER_LOG_FILE='mysql-bin.000001',MASTER_LOG_POS=120;
# Point Master2 at Master1
CHANGE MASTER TO MASTER_USER='RepUser',MASTER_HOST='10.192.203.201',MASTER_PASSWORD='beijing',MASTER_PORT=3307,MASTER_LOG_FILE='mysql-bin.000001',MASTER_LOG_POS=302;
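Copying the file name and position by hand is easy to get wrong, so here is a minimal sketch of assembling the statement from the peer's coordinates. build_change_master is a hypothetical helper, not something MySQL ships; on real hosts the two coordinates could come from mysql -N -B -e 'SHOW MASTER STATUS' on the peer, which prints them tab-separated.

```shell
# Build a CHANGE MASTER TO statement from the peer's binlog coordinates.
# Hypothetical helper for illustration; it only formats text and runs no SQL.
build_change_master() {
    local host="$1" port="$2" user="$3" password="$4" log_file="$5" log_pos="$6"
    printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_PORT=%s, MASTER_USER='%s', MASTER_PASSWORD='%s', MASTER_LOG_FILE='%s', MASTER_LOG_POS=%s;\n" \
        "$host" "$port" "$user" "$password" "$log_file" "$log_pos"
}

# The statement master1 would run, using master2's coordinates from above.
build_change_master 10.192.203.202 3307 RepUser beijing mysql-bin.000001 120
```

The printed statement can be reviewed first and then pasted into the mysql client on the server that should start replicating.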
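1.2.4 Start the slave threads on both servers
start slave;
Confirm that show slave status reports:
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Then test that changes replicate in both directions (omitted here).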
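The two status fields can also be checked from a script. The sketch below reads the text of SHOW SLAVE STATUS\G on standard input and succeeds only when both threads report Yes; replication_running is a hypothetical name chosen for this example.

```shell
# Read "SHOW SLAVE STATUS\G" output on stdin.
# Exit status 0 only if both Slave_IO_Running and Slave_SQL_Running are Yes.
# Hypothetical helper for illustration.
replication_running() {
    awk -F': *' '
        /^[ \t]*Slave_IO_Running:/  { io  = $2 }
        /^[ \t]*Slave_SQL_Running:/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Demo on sample text; on a real server the input would come from
#   mysql -e 'SHOW SLAVE STATUS\G'
if printf '  Slave_IO_Running: Yes\n  Slave_SQL_Running: Yes\n' | replication_running; then
    echo "replication threads are running"
fi
```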
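Part 2: Configure the heartbeat link
Each host has two Ethernet cards: one for regular network traffic and one dedicated to the heartbeat.
This lab runs in Oracle VirtualBox virtual machines, so an extra NIC on an internal network was added to each VM for the heartbeat. For details see: http://blog.csdn.net/yabingshi_tech/article/details/51445006
Part 3: Install and deploy heartbeat
Perform the following steps on both machines:
3.1 Install the dependencies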
yum install PyXML cluster-glue cluster-glue-libs resource-agents -y
3.2 Install heartbeat
wget http://dl.fedoraproject.org/pub/epel/6/x86_64/heartbeat-3.0.4-2.el6.x86_64.rpm
wget http://dl.fedoraproject.org/pub/epel/6/x86_64/heartbeat-libs-3.0.4-2.el6.x86_64.rpm
rpm -ivh heartbeat-*
3.3 Configure heartbeat
Copy the sample configuration files:
cp /usr/share/doc/heartbeat-3.0.4/authkeys /etc/ha.d/
cp /usr/share/doc/heartbeat-3.0.4/haresources /etc/ha.d/
cp /usr/share/doc/heartbeat-3.0.4/ha.cf /etc/ha.d/
3.3.1 Configure heartbeat authentication: authkeys
vi /etc/ha.d/authkeys
# With a direct cable (twisted pair) between the two machines, the following is sufficient:
auth 1
1 crc
# Save and exit, then:
chmod 600 /etc/ha.d/authkeys
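/*
Background:
The third file to configure, authkeys, determines your authentication keys. There are three authentication methods: crc, md5, and sha1. Which one should you use? In short: if heartbeat runs over a secure network, such as the crossover cable in this example, you can use crc, which is the cheapest method in terms of resources. If the network is not secure but you still want to keep CPU usage down, use md5. Finally, if you want the strongest authentication regardless of CPU usage, use sha1, which is the hardest of the three to break.
The file format is:
auth <number>
<number> <authmethod> [<authkey>]
So for sha1, a sample /etc/ha.d/authkeys might be:
auth 1
1 sha1 key-for-sha1-any-text-you-want
For md5, just replace sha1 with md5 in the example above. For crc, you can configure:
auth 2
2 crc
Whatever index you specify after the auth keyword must appear again as the key of a later line. If you specify "auth 4", there must be a later line that starts with "4 <authmethod>".
*/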
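The index rule described above is easy to break when editing the file by hand. The following sketch checks only that rule; authkeys_consistent is a hypothetical helper, and it does not validate the method name or the key itself.

```shell
# Succeed if the index on the "auth N" line has a matching "N <method> ..." line.
# Hypothetical helper for illustration; method and key are not validated.
authkeys_consistent() {
    awk '
        /^[ \t]*#/   { next }                 # skip comment lines
        $1 == "auth" { wanted = $2; next }    # remember the selected index
        NF >= 2      { defined[$1] = 1 }      # record every key line index
        END { exit !(wanted != "" && (wanted in defined)) }
    ' "$1"
}

# Demo on a throwaway file so that nothing under /etc/ha.d is touched.
tmp=$(mktemp)
printf 'auth 1\n1 sha1 key-for-sha1-any-text-you-want\n' > "$tmp"
authkeys_consistent "$tmp" && echo "auth index has a matching key line"
rm -f "$tmp"
```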
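3.3.2 Configure the managed resources: haresources
vi /etc/ha.d/haresources
# This file must be identical on both hosts.
Add:
PC IPaddr::10.192.203.203
# Note: PC is the hostname of your master, and the value after IPaddr:: is your VIP.
heartbeat can also manage other resources or services: place a service start script (for example, mysql) in the resource script directory and append the script name to the line in /etc/ha.d/haresources, and heartbeat will start that script when it starts.
For example: PC IPaddr::10.192.203.203 mysql
However, heartbeat would then also stop mysql whenever heartbeat itself is stopped, so I did not add it here.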
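3.3.3 Configure the main heartbeat file: ha.cf
The file is identical on both machines except for the ucast eth1 10.0.0.2 line, which must point at the other node's heartbeat address.
vi /etc/ha.d/ha.cf
Add:
logfile /var/log/ha_log/ha-log.log ## Location of the HA log file. Create the directory manually if it does not exist.
bcast eth1 ## Use eth1 for heartbeat monitoring
ucast eth1 10.0.0.2 ## Heartbeat address of the peer, reached through the heartbeat NIC
keepalive 2 ## Send a heartbeat every 2 seconds
warntime 10
deadtime 30
initdead 120
hopfudge 1
udpport 694 ## Use UDP port 694 for heartbeat traffic
auto_failback off
node PC ## Node 1; must match the output of uname -n.
node slave2 ## Node 2
ping 10.192.203.254 ## Ping the gateway to check network connectivity.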
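Because each node line has to match uname -n on the corresponding machine, a small pre-flight check before starting heartbeat can catch a mismatch early. This is only a sketch: node_listed is a hypothetical helper, and it does a strict string comparison against the node entries after stripping trailing comments.

```shell
# Succeed if NAME (default: this machine's uname -n) appears on a "node" line.
# Hypothetical helper for illustration; the comparison is an exact string match.
node_listed() {
    local cf="$1" name="${2:-$(uname -n)}"
    awk -v want="$name" '
        { sub(/#.*/, "") }                    # drop trailing comments
        $1 == "node" { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
        END { exit !found }
    ' "$cf"
}

# Demo on a throwaway copy of the two node lines from the example above.
tmp=$(mktemp)
printf 'node PC ## Node 1\nnode slave2 ## Node 2\n' > "$tmp"
node_listed "$tmp" slave2 && echo "slave2 is listed in ha.cf"
rm -f "$tmp"
```

On a real node the call would be node_listed /etc/ha.d/ha.cf, which falls back to the local uname -n output.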
3.3.4 Create the log directory
mkdir -p /var/log/ha_log
chmod 777 /var/log/ha_log/
3.4 Open the firewall port
heartbeat uses UDP port 694 for heartbeat traffic by default. If the system uses iptables as its firewall, remember to open this port.
vi /etc/sysconfig/iptables
Add: -A INPUT -p udp --dport 694 -j ACCEPT
service iptables restart
3.5 Starting, stopping, and testing the HA service
Start HA: service heartbeat start
Start heartbeat on both machines:
[root@PC init.d]# service heartbeat start
Starting High-Availability services: INFO: Resource is stopped
Done.
[root@PC ha_log]# service heartbeat status
heartbeat OK [pid 17943 et al] is running on pc [pc]...
[root@slave2 ha_log]# service heartbeat status
heartbeat OK [pid 6536 et al] is running on slave2 [slave2]...
The virtual IP now appears on the master:
[root@PC ha_log]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:04:05:16 brd ff:ff:ff:ff:ff:ff
inet 10.192.203.201/24 brd 10.192.203.255 scope global eth0
inet 10.192.203.203/24 brd 10.192.203.255 scope global secondary eth0
inet6 fe80::a00:27ff:fe04:516/64 scope link tentative dadfailed
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:3a:ec:3c brd ff:ff:ff:ff:ff:ff
inet 10.0.0.1/24 brd 10.0.0.255 scope global eth1
inet6 fe80::a00:27ff:fe3a:ec3c/64 scope link tentative dadfailed
valid_lft forever preferred_lft forever
The details can be seen either in the log file under /var/log/ha_log or in /var/log/messages.
[root@PC network-scripts]# tail -f /var/log/messages
May 19 01:34:59 PC ResourceManager(default)[17985]: info: Running /etc/ha.d/resource.d/IPaddr 10.192.203.203 start
May 19 01:35:00 PC IPaddr(IPaddr_10.192.203.203)[18103]: INFO: Adding inet address 10.192.203.203/24 with broadcast address 10.192.203.255 to device eth0
May 19 01:35:00 PC IPaddr(IPaddr_10.192.203.203)[18103]: INFO: Bringing device eth0 up
May 19 01:35:00 PC IPaddr(IPaddr_10.192.203.203)[18103]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-10.192.203.203 eth0 10.192.203.203 auto not_used not_used
May 19 01:35:00 PC /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_10.192.203.203)[18089]: INFO: Success
May 19 01:35:00 PC ResourceManager(default)[17985]: info: Running /etc/init.d/mysql start
May 19 01:35:03 PC heartbeat: [17972]: info: local HA resource acquisition completed (standby).
May 19 01:35:03 PC heartbeat: [17943]: info: Standby resource acquisition done [foreign].
May 19 01:35:03 PC heartbeat: [17943]: info: Initial resource acquisition complete (auto_failback)
May 19 01:35:03 PC heartbeat: [17943]: info: remote resource transition completed.
Test:
Stop heartbeat on the master (201):
[root@PC ha_log]# service heartbeat stop
Stopping High-Availability services: Done.
Check the log:
May 19 01:46:57 PC heartbeat: [18561]: info: Giving up all HA resources.
May 19 01:46:58 PC ResourceManager(default)[18574]: info: Releasing resource group: pc IPaddr::10.192.203.203 mysql
May 19 01:46:58 PC ResourceManager(default)[18574]: info: Running /etc/init.d/mysql stop
May 19 01:46:59 PC ResourceManager(default)[18574]: info: Running /etc/ha.d/resource.d/IPaddr 10.192.203.203 stop
May 19 01:46:59 PC IPaddr(IPaddr_10.192.203.203)[18652]: INFO: IP status = ok, IP_CIP=
May 19 01:46:59 PC /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_10.192.203.203)[18638]: INFO: Success
May 19 01:46:59 PC heartbeat: [18561]: info: All HA resources relinquished.
May 19 01:47:00 PC heartbeat: [17943]: WARN: 1 lost packet(s) for [slave2] [2777:2779]
May 19 01:47:00 PC heartbeat: [17943]: info: No pkts missing from slave2!
May 19 01:47:01 PC heartbeat: [17943]: info: killing HBWRITE process 17949 with signal 15
May 19 01:47:01 PC heartbeat: [17943]: info: killing HBREAD process 17950 with signal 15
May 19 01:47:01 PC heartbeat: [17943]: info: killing HBWRITE process 17951 with signal 15
May 19 01:47:01 PC heartbeat: [17943]: info: killing HBREAD process 17952 with signal 15
May 19 01:47:01 PC heartbeat: [17943]: info: killing HBFIFO process 17946 with signal 15
May 19 01:47:01 PC heartbeat: [17943]: info: killing HBWRITE process 17947 with signal 15
May 19 01:47:01 PC heartbeat: [17943]: info: killing HBREAD process 17948 with signal 15
May 19 01:47:01 PC heartbeat: [17943]: info: Core process 17951 exited. 7 remaining
May 19 01:47:02 PC heartbeat: [17943]: info: Core process 17946 exited. 6 remaining
May 19 01:47:02 PC heartbeat: [17943]: info: Core process 17947 exited. 5 remaining
May 19 01:47:02 PC heartbeat: [17943]: info: Core process 17948 exited. 4 remaining
May 19 01:47:02 PC heartbeat: [17943]: info: Core process 17949 exited. 3 remaining
May 19 01:47:02 PC heartbeat: [17943]: info: Core process 17950 exited. 2 remaining
May 19 01:47:02 PC heartbeat: [17943]: info: Core process 17952 exited. 1 remaining
May 19 01:47:02 PC heartbeat: [17943]: info: pc Heartbeat shutdown complete.
Check the log on the standby (202):
harc(default)[8578]: 2016/05/19_01:47:00 info: Running /etc/ha.d//rc.d/status status
mach_down(default)[8595]: 2016/05/19_01:47:00 info: Taking over resource group IPaddr::10.192.203.203
ResourceManager(default)[8622]: 2016/05/19_01:47:00 info: Acquiring resource group: pc IPaddr::10.192.203.203 mysql
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_10.192.203.203)[8650]: 2016/05/19_01:47:01 INFO: Resource is stopped
ResourceManager(default)[8622]: 2016/05/19_01:47:01 info: Running /etc/ha.d/resource.d/IPaddr 10.192.203.203 start
IPaddr(IPaddr_10.192.203.203)[8746]: 2016/05/19_01:47:01 INFO: Adding inet address 10.192.203.203/24 with broadcast address 10.192.203.255 to device eth0
IPaddr(IPaddr_10.192.203.203)[8746]: 2016/05/19_01:47:01 INFO: Bringing device eth0 up
IPaddr(IPaddr_10.192.203.203)[8746]: 2016/05/19_01:47:01 INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-10.192.203.203 eth0 10.192.203.203 auto not_used not_used
/usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_10.192.203.203)[8732]: 2016/05/19_01:47:01 INFO: Success
ResourceManager(default)[8622]: 2016/05/19_01:47:02 info: Running /etc/init.d/mysql start
mach_down(default)[8595]: 2016/05/19_01:47:05 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
mach_down(default)[8595]: 2016/05/19_01:47:05 info: mach_down takeover complete for node pc.
May 19 01:47:05 slave2 heartbeat: [6536]: info: mach_down takeover complete.
May 19 01:47:31 slave2 heartbeat: [6536]: WARN: node pc: is dead
May 19 01:47:31 slave2 heartbeat: [6536]: info: Dead node pc gave up resources.
May 19 01:47:31 slave2 heartbeat: [6536]: info: Link pc:eth1 dead.
The log shows that 202 took over successfully.
The VIP has moved to 202:
[root@slave2 ha_log]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:04:05:16 brd ff:ff:ff:ff:ff:ff
inet 10.192.203.202/24 brd 10.192.203.255 scope global eth0
inet 10.192.203.203/24 brd 10.192.203.255 scope global secondary eth0
inet6 fe80::a00:27ff:fe04:516/64 scope link
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:3a:ec:3c brd ff:ff:ff:ff:ff:ff
inet 10.0.0.2/24 brd 10.0.0.255 scope global eth1
inet6 fe80::a00:27ff:fe3a:ec3c/64 scope link
valid_lft forever preferred_lft forever
201 no longer holds the VIP:
[root@PC ha_log]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:04:05:16 brd ff:ff:ff:ff:ff:ff
inet 10.192.203.201/24 brd 10.192.203.255 scope global eth0
inet6 fe80::a00:27ff:fe04:516/64 scope link tentative dadfailed
valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 08:00:27:3a:ec:3c brd ff:ff:ff:ff:ff:ff
inet 10.0.0.1/24 brd 10.0.0.255 scope global eth1
inet6 fe80::a00:27ff:fe3a:ec3c/64 scope link tentative dadfailed
valid_lft forever preferred_lft forever
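Reading through the full ip addr output on both machines gets tedious when the failover test is repeated. As a rough sketch, the check can be reduced to one yes/no answer per node; holds_vip is a hypothetical helper that reads ip addr output on standard input and looks for an inet entry with exactly the given address.

```shell
# Read "ip addr" output on stdin; succeed if some "inet" entry equals the VIP.
# Hypothetical helper for illustration.
holds_vip() {
    awk -v vip="$1" '
        $1 == "inet" { split($2, parts, "/"); if (parts[1] == vip) found = 1 }
        END { exit !found }
    '
}

# Demo on sample text taken from the output above; on a real node use:
#   ip addr | holds_vip 10.192.203.203
sample='    inet 10.192.203.202/24 brd 10.192.203.255 scope global eth0
    inet 10.192.203.203/24 brd 10.192.203.255 scope global secondary eth0'
if printf '%s\n' "$sample" | holds_vip 10.192.203.203; then
    echo "this node holds the VIP"
else
    echo "this node does not hold the VIP"
fi
```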
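Part 4: MySQL high availability with heartbeat
heartbeat only monitors the heartbeat link, that is, whether the machine itself is down; it does not monitor the MySQL service. So we also need a script that checks MySQL and, if the mysql service is down, stops heartbeat to trigger a failover (the same idea as nginx + keepalived). The script is as follows:
Create the MySQL check script on both master1 and master2:
vi /root/check_mysql.sh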
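#!/bin/bash
# Stop heartbeat when MySQL no longer accepts logins, so that the VIP moves to the peer.
MYSQL=/usr/local/mysql/bin/mysql
MYSQL_HOST=localhost
MYSQL_USER=root
MYSQL_PASSWORD=system@123
# A successful "show status" means the server accepts connections and answers queries.
"$MYSQL" -h "$MYSQL_HOST" -u "$MYSQL_USER" -p"$MYSQL_PASSWORD" -e "show status;" >/dev/null 2>&1
#$mysqlclient --host=$host --port=$port --user=$user --password=$password -e "show databases;" > /dev/null 2>&1
if [ $? -eq 0 ]
then
echo " $MYSQL_HOST mysql login successfully "
exit 0
else
#echo " $MYSQL_HOST mysql login failed"
/etc/init.d/heartbeat stop
exit 2
fi
Email notification still needs to be added to this script.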
chmod +x /root/check_mysql.sh
Schedule it as a cron job that runs once a minute:
*/1 * * * * /root/check_mysql.sh >>/root/check_mysql.log
Stop mysql on the current master and verify that the VIP moves to the standby.
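A single failed login stops heartbeat immediately, so a brief hiccup would already trigger a failover. One possible refinement, which is not part of the original setup, is to retry the check a few times first. In the sketch below, check_with_retries and the probe script name in the comment are hypothetical.

```shell
# Run a check command up to ATTEMPTS times, sleeping DELAY seconds in between.
# Succeeds as soon as one attempt succeeds. Hypothetical helper for illustration.
check_with_retries() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Demo with commands that always succeed or always fail.
check_with_retries 3 0 true  && echo "check passed"
check_with_retries 2 0 false || echo "check failed after all retries"

# In the monitoring script this could look like (probe script name is assumed):
#   check_with_retries 3 5 /root/mysql_probe.sh || /etc/init.d/heartbeat stop
```

The trade-off is a slower failover: with 3 attempts and a 5 second delay, a real outage is acted on roughly 10 seconds later than with a single check.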
Ready-made shell scripts:
-- Install heartbeat
[root@slave2 shell_script]# cat install_heartbeat.sh
# Run this after the heartbeat link between the two servers has been configured.
# First upload the required packages to /download: PyXML-0.8.4-19.el6.x86_64.rpm, cluster-glue-libs-1.0.5-6.el6.x86_64.rpm, cluster-glue-1.0.5-6.el6.x86_64.rpm, resource-agents-3.9.5-24.el6_7.1.x86_64.rpm,
# heartbeat-libs-3.0.4-2.el6.x86_64.rpm, heartbeat-3.0.4-2.el6.x86_64.rpm
# Pass the chosen virtual IP and the MySQL root password as arguments, for example: sh install_heartbeat.sh '10.192.203.203' '1234'
# Note: after this script finishes, edit /etc/ha.d/ha.cf by hand, start heartbeat manually with service heartbeat start, then use ip addr on both machines to check that the VIP is configured and that automatic failover works.
# Read the arguments
arg1="$1"
arg2="$2"
if [ "$#" -ne 2 ]; then
echo "Error:"
echo " You provided $# parameters, but 2 are required."
echo ' Please provide the VIP and the password of the MySQL root account!'
exit 1
fi
# Install the required packages
mkdir -p /download
cd /download
rpm -ivh PyXML-0.8.4-19.el6.x86_64.rpm
rpm -ivh cluster-glue-libs-1.0.5-6.el6.x86_64.rpm
rpm -ivh cluster-glue-1.0.5-6.el6.x86_64.rpm
rpm -ivh resource-agents-3.9.5-24.el6_7.1.x86_64.rpm
rpm -ivh heartbeat-libs-3.0.4-2.el6.x86_64.rpm
rpm -ivh heartbeat-3.0.4-2.el6.x86_64.rpm
# Copy the sample configuration files
cp /usr/share/doc/heartbeat-3.0.4/authkeys /etc/ha.d/
cp /usr/share/doc/heartbeat-3.0.4/haresources /etc/ha.d/
cp /usr/share/doc/heartbeat-3.0.4/ha.cf /etc/ha.d/
# Configure heartbeat authentication
echo -e 'auth 2\n2 sha1 hi!' >> /etc/ha.d/authkeys
chmod 600 /etc/ha.d/authkeys
# Configure the managed resources
echo "PC IPaddr::$arg1" >> /etc/ha.d/haresources
: << !
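vi /etc/ha.d/ha.cf
The contents of this file depend on your environment, so they are not written here; edit the file by hand. For example:
logfile /var/log/ha_log/ha-log.log ## Location of the HA log file. Create the directory manually if it does not exist.
bcast eth1 ## Use eth1 for heartbeat monitoring
ucast eth1 10.0.0.2 ## Heartbeat address of the peer, reached through the heartbeat NIC
keepalive 2 ## Send a heartbeat every 2 seconds
warntime 10
deadtime 30
initdead 120
hopfudge 1
udpport 694 ## Use UDP port 694 for heartbeat traffic
auto_failback off
node PC ## Node 1; must match the output of uname -n.
node slave2 ## Node 2
ping 10.192.203.254 ## Ping the gateway to check network connectivity.
Or: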
logfile /var/log/ha-log
keepalive 2
deadtime 30
warntime 10
initdead 120
udpport 6941
mcast bond0 239.0.1.1 2694 1 0
bcast bond0 # Linux
auto_failback off
node basicOS-01
node basicOS-06
!
# Create the log directory
mkdir -p /var/log/ha_log
chmod 777 /var/log/ha_log/
# Create the MySQL monitoring script: when MySQL on the active master goes down, it stops heartbeat so that failover happens
echo -e 'MYSQL=/usr/local/mysql/bin/mysql\nMYSQL_HOST=localhost\nMYSQL_USER=root\nMYSQL_PASSWORD='"$arg2"'\n\n$MYSQL -h $MYSQL_HOST -u $MYSQL_USER -p$MYSQL_PASSWORD -e "show status;" >/dev/null 2>&1\nif [ $? -eq 0 ]\nthen\n exit 0\nelse\n /etc/init.d/heartbeat stop\n exit 2\nfi' >> /root/check_mysql.sh
chmod +x /root/check_mysql.sh
# Add the cron job
echo -e '*/1 * * * * /root/check_mysql.sh >>/root/check_mysql.log' >> /var/spool/cron/root
echo -e 'Please edit the heartbeat configuration file /etc/ha.d/ha.cf by hand; see this shell script for what to put in it.\nThen start heartbeat manually with service heartbeat start, and use ip addr on both machines to check that the VIP is configured and that automatic MySQL failover works.'
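Because the install script appends with >>, running it twice would leave two identical cron entries. A minimal sketch of an idempotent append follows; add_cron_line_once is a hypothetical helper, and the demo writes to a temporary file instead of /var/spool/cron/root.

```shell
# Append LINE to FILE only if an identical line is not already present.
# Hypothetical helper for illustration.
add_cron_line_once() {
    local file="$1" line="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || echo "$line" >> "$file"
}

# Demo on a throwaway file: calling it twice still leaves a single entry.
tmp=$(mktemp)
entry='*/1 * * * * /root/check_mysql.sh >>/root/check_mysql.log'
add_cron_line_once "$tmp" "$entry"
add_cron_line_once "$tmp" "$entry"
cat "$tmp"
rm -f "$tmp"
```

grep -x matches whole lines and -F treats the entry as a fixed string, so the asterisks in the schedule are not interpreted as a pattern.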
-- The matching heartbeat uninstall script:
[root@slave2 shell_script]# cat deinstall_heartbeat.sh
service heartbeat stop
cd /etc/ha.d
rm -rf authkeys
rm -rf haresources
rm -rf ha.cf
rm -rf /var/log/ha_log
rm -rf /root/check_mysql.sh
rm -rf /root/check_mysql.log
# Remove the cron job added earlier
sed -i /check_mysql.sh/d '/var/spool/cron/root'
This article draws on the following articles:
http://www.linuxidc.com/Linux/2011-11/46764.htm
http://www.codesky.net/article/201111/173710.html
http://blog.chinaunix.net/uid-20639775-id-3337481.html
http://www.oschina.net/question/163914_31896
https://www.linuxzen.com/heartbeatshi-xian-mysqlshuang-ji-gao-ke-yong.html