环境搭建:
A-B双主,A主B备主(最好开启版同步复制确保数据基本完整性)
keepalived 已安装
keepalived 配置文件
A(keepalived.conf) B只修改priority 80即可
vrrp_script vs_mysql_161 { script "/data/scripts/keepalived/check_mysql.sh" interval 15 timeout 10 fall 3 rise 2 } vrrp_instance VI_161 { state BACKUP nopreempt interface eth0 virtual_router_id 161 priority 90 advert_int 1 authentication { auth_type PASS auth_pass 1111 } track_script { vs_mysql_161 } virtual_ipaddress { 192.168.56.161 } notify_master "/data/scripts/keepalived/notify.sh master" notify_backup "/data/scripts/keepalived/notify.sh backup" notify_fault "/data/scripts/keepalived/notify.sh fault" }
MySQL检查脚本
check_mysql.sh
#!/bin/bash mysql_conn="/usr/local/mysql/bin/mysql -uroot -proot -h127.0.0.1 -P3306" errFile=/tmp/keepalived.err ip=`/sbin/ifconfig | grep "192.168.56.255" | awk -F "[: ]+" '{print $4}'` call(){ for phone in xxx do #curl "http:/sendSMS.json?phone={$phone}&msg={$ip}+{$MYSQL_PORT}+{keepalived_is_change}" echo "[ `date +"%Y-%m-%d %H:%M:%S"` ] {$ip}+{$MYSQL_PORT}+{keepalived_is_change}"| tee -a $errFile done } ##检查逻辑i:查询成功 j:进程在----查询成功,进程在OK!查询失败,进程在,5s后再次查询,成功则OK,不成功则监本检测失败!进程不存在,脚本检查失败! while true do $mysql_conn -e "SELECT SYSDATE();" >/dev/null 2>&1 i=$? ps aux | grep mysqld | grep -v grep > /dev/null 2>&1 j=$? if [ $i = 0 ] && [ $j = 0 ] then exit 0 else if [ $i = 1 ] && [ $j = 0 ] then echo "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@"| tee -a $errFile echo "[ `date +"%Y-%m-%d %H:%M:%S"` ] MySQL query error,sleep 5s,try again!"| tee -a $errFile sleep 5 $mysql_conn -e "SELECT SYSDATE();" >/dev/null 2>&1 i1=$? if [ $i1 = 0 ] then echo "[ `date +"%Y-%m-%d %H:%M:%S"` ] MySQL again query OK!"| tee -a $errFile exit 0 else echo "[ `date +"%Y-%m-%d %H:%M:%S"` ] MySQL again query error!"| tee -a $errFile break fi else break fi fi done #此处发邮件发短信报警 call exit 1
keepalived 状态切换脚本
notify.sh
#!/bin/bash mysql_conn="/usr/local/mysql/bin/mysql -uroot -proot -h127.0.0.1 -P3306" ip=`/sbin/ifconfig | grep "192.168.56.255" | awk -F "[: ]+" '{print $4}'` errFile=/tmp/keepalived.change stat=$1 call(){ for phone in xxx do #curl "http://msg/sendSMS.json?phone={$phone}&msg={$ip}+{keepalived_is_change}+{$stat}" echo "[ `date +"%Y-%m-%d %H:%M:%S"` ] {$ip}+{keepalived_is_change}+{$stat}"| tee -a $errFile done } ###keepalived进入master状态,设置数据库只读,检查现在备主是否执行完所有relay log,执行完,取消只读,如30s还未执行完,reset slave all;强制切换成主库! ###keepalived进入backup状态,设置数据库只读! ###keepalived进入fault状态,脚本检测失败,停库,停keepalived! if [ $stat = 'master' ];then echo "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@" | tee -a $errFile i=1 $mysql_conn -e "set global read_only=1;" echo "[ `date +"%Y-%m-%d %H:%M:%S"` ] MySQL change master,set db read_only" | tee -a $errFile while [ $i -lt 10 ] do pos=`$mysql_conn -e "show slave status\G;"|egrep "Master_Log" |awk '{printf ("%s",$NF "\t")}'` read_file=`echo $pos|awk '{print $1}'` read_pos=`echo $pos|awk '{print $2}'` exec_file=`echo $pos|awk '{print $3}'` exec_pos=`echo $pos|awk '{print $4}'` echo $read_file $exec_file $read_pos $exec_pos sleep 3 let i++ if [ $read_pos = $exec_pos ] && [ $read_pos = $exec_pos ];then $mysql_conn -e "set global read_only=0;" echo "[ `date +"%Y-%m-%d %H:%M:%S"` ] MySQL slave has relay all relay log,cancel db read_only!"| tee -a $errFile break fi done if [ $i -eq 10 ];then $mysql_conn -e "stop slave;reset slave all;set global read_only=0;" echo "[ `date +"%Y-%m-%d %H:%M:%S"` ][Waring]MySQL slave not relay all relay log,cancel db read_only!"| tee -a $errFile fi elif [ $stat = 'backup' ];then $mysql_conn -e "set global read_only=1;" echo "[ `date +"%Y-%m-%d %H:%M:%S"` ] MySQL slave set db read_only!"| tee -a $errFile elif [ $stat = 'fault' ];then echo "[ `date +"%Y-%m-%d %H:%M:%S"` ] keepalived Entering FAULT STATE,stop MySQL and keepalived!"| tee -a $errFile /etc/init.d/mysql3306 stop /etc/init.d/keepalived stop fi call
测试1
主检测脚本2次脚本失败,在第三次5s后检测成功(fall 3起作用) A: cat /tmp/keepalived.err @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ [ 2016-08-04 17:22:58 ] MySQL query error,sleep 5s,try again! [ 2016-08-04 17:23:03 ] MySQL again query error! [ 2016-08-04 17:23:03 ] {192.168.56.159}+{}+{keepalived_is_change} @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ [ 2016-08-04 17:23:13 ] MySQL query error,sleep 5s,try again! [ 2016-08-04 17:23:18 ] MySQL again query error! [ 2016-08-04 17:23:18 ] {192.168.56.159}+{}+{keepalived_is_change} @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ [ 2016-08-04 17:23:28 ] MySQL query error,sleep 5s,try again! [ 2016-08-04 17:23:33 ] MySQL again query OK! cat /var/log/messages Aug 4 17:23:03 Zabbix Keepalived_vrrp[15588]: pid 17479 exited with status 1 Aug 4 17:23:18 Zabbix Keepalived_vrrp[15588]: pid 17512 exited with status 1
测试2
设置max_user_connections=2,登录2个客户端,模拟检查脚本执行错误 A: [root@192.168.56.159 keepalived]cat /tmp/keepalived.err @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ [ 2016-08-04 17:40:59 ] MySQL query error,sleep 5s,try again! [ 2016-08-04 17:41:04 ] MySQL again query error! [ 2016-08-04 17:41:04 ] {192.168.56.159}+{}+{keepalived_is_change} @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ [ 2016-08-04 17:41:14 ] MySQL query error,sleep 5s,try again! [ 2016-08-04 17:41:19 ] MySQL again query error! [ 2016-08-04 17:41:19 ] {192.168.56.159}+{}+{keepalived_is_change} @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ [ 2016-08-04 17:41:29 ] MySQL query error,sleep 5s,try again! [ 2016-08-04 17:41:34 ] MySQL again query error! [ 2016-08-04 17:41:34 ] {192.168.56.159}+{}+{keepalived_is_change} @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ [ 2016-08-04 17:41:44 ] MySQL query error,sleep 5s,try again! [ 2016-08-04 17:41:49 ] MySQL again query error! [ 2016-08-04 17:41:49 ] {192.168.56.159}+{}+{keepalived_is_change} cat /var/log/messages Aug 4 17:41:04 Zabbix Keepalived_vrrp[18675]: pid 18766 exited with status 1 Aug 4 17:41:19 Zabbix Keepalived_vrrp[18675]: pid 18795 exited with status 1 Aug 4 17:41:34 Zabbix Keepalived_vrrp[18675]: pid 18818 exited with status 1 Aug 4 17:41:34 Zabbix Keepalived_vrrp[18675]: VRRP_Script(vs_mysql_161) failed Aug 4 17:41:34 Zabbix Keepalived_vrrp[18675]: VRRP_Instance(VI_161) Entering FAULT STATE Aug 4 17:41:34 Zabbix Keepalived_vrrp[18675]: VRRP_Instance(VI_161) removing protocol VIPs. Aug 4 17:41:34 Zabbix Keepalived_vrrp[18675]: VRRP_Instance(VI_161) Now in FAULT state Aug 4 17:41:34 Zabbix Keepalived_healthcheckers[18674]: Netlink reflector reports IP 192.168.56.161 removed Aug 4 17:41:36 Zabbix ntpd[1591]: Deleting interface #17 eth0, 192.168.56.161#123, interface stats: received=0, sent=0, dropped=0, active_time=72 secs Aug 4 17:41:46 Zabbix Keepalived[18672]: Stopping Aug 4 17:41:46 Zabbix Keepalived_healthcheckers[18674]: Stopped Aug 4 17:41:47 Zabbix Keepalived_vrrp[18675]: Stopped Aug 4 17:41:47 Zabbix Keepalived[18672]: Stopped Keepalived v1.2.21 (08/03,2016) B: [root@192.168.56.160 keepalived]cat /tmp/keepalived.change @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ [ 2016-08-04 17:41:37 ] MySQL change master,set db read_only [ 2016-08-04 17:41:40 ] MySQL slave has relay all relay log,cancel db read_only! [ 2016-08-04 17:41:40 ] {192.168.56.160}+{keepalived_is_change}+{master} cat /var/log/messages Aug 4 17:41:35 redis Keepalived_vrrp[11028]: VRRP_Instance(VI_161) Transition to MASTER STATE Aug 4 17:41:37 redis Keepalived_vrrp[11028]: VRRP_Instance(VI_161) Entering MASTER STATE Aug 4 17:41:37 redis Keepalived_vrrp[11028]: VRRP_Instance(VI_161) setting protocol VIPs. Aug 4 17:41:37 redis Keepalived_vrrp[11028]: Sending gratuitous ARP on eth0 for 192.168.56.161 Aug 4 17:41:37 redis Keepalived_vrrp[11028]: VRRP_Instance(VI_161) Sending/queueing gratuitous ARPs on eth0 for 192.168.56.161 Aug 4 17:41:37 redis Keepalived_vrrp[11028]: Sending gratuitous ARP on eth0 for 192.168.56.161 Aug 4 17:41:37 redis Keepalived_vrrp[11028]: Sending gratuitous ARP on eth0 for 192.168.56.161 Aug 4 17:41:37 redis Keepalived_vrrp[11028]: Sending gratuitous ARP on eth0 for 192.168.56.161 Aug 4 17:41:37 redis Keepalived_vrrp[11028]: Sending gratuitous ARP on eth0 for 192.168.56.161 Aug 4 17:41:37 redis Keepalived_healthcheckers[11027]: Netlink reflector reports IP 192.168.56.161 added Aug 4 17:41:39 redis ntpd[1534]: Listening on interface #14 eth0, 192.168.56.161#123 Enabled Aug 4 17:41:42 redis Keepalived_vrrp[11028]: Sending gratuitous ARP on eth0 for 192.168.56.161 Aug 4 17:41:42 redis Keepalived_vrrp[11028]: VRRP_Instance(VI_161) Sending/queueing gratuitous ARPs on eth0 for 192.168.56.161 Aug 4 17:41:42 redis Keepalived_vrrp[11028]: Sending gratuitous ARP on eth0 for 192.168.56.161 Aug 4 17:41:42 redis Keepalived_vrrp[11028]: Sending gratuitous ARP on eth0 for 192.168.56.161 Aug 4 17:41:42 redis Keepalived_vrrp[11028]: Sending gratuitous ARP on eth0 for 192.168.56.161 Aug 4 17:41:42 redis Keepalived_vrrp[11028]: Sending gratuitous ARP on eth0 for 192.168.56.161
测试3
设置max_user_connections=2,登录2个客户端,模拟检查脚本执行错误 此时B可以执行 FLUSH TABLES WITH READ LOCK; 模拟从库延时 Seconds_Behind_Master: 53 B: cat /tmp/keepalived.change @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ [ 2016-08-04 17:51:00 ] MySQL change master,set db read_only [ 2016-08-04 17:51:27 ][Waring]MySQL slave not relay all relay log,cancel db read_only! [ 2016-08-04 17:51:27 ] {192.168.56.160}+{keepalived_is_change}+{master} notify.sh脚本中,30s如备主还延迟则 "stop slave;reset slave all;set global read_only=0;" !!!慎用慎用!!!