==========================================================================================
一、基础介绍
==========================================================================================
1、背景描述
目前我们的高可用DB的代理层采用的是360开源的Atlas,从上线以来,已稳定运行2个多月。无论是从性能上,还是稳定性上,相比其他开源组件(amoeba、cobar、MaxScale、MySQL-Proxy等),还是很出色的。
当初我们之所以选择Atlas,主要看中它有以下优点:
(1)、基于mysql-proxy-0.8.2进行修改,代码完全开源;
(2)、比较轻量级,部署配置也比较简单;
(3)、支持DB读写分离;
(4)、支持从DB读负载均衡,并自动剔除故障从DB;
(5)、支持平滑上下线DB;
(6)、具备较好的安全机制(IP过滤、账号认证);
(7)、版本更新、问题跟进、交流圈子都比较活跃。
在测试期间以及线上问题排查过程中,得到了360 Atlas作者朱超的热心解答,在此表示感谢。有关更多Atlas的介绍,我就不一一例举,可以参考以下链接:
https://github.com/Qihoo360/Atlas/blob/master/README_ZH.md
2、总体架构图
3、系统环境
CentOS 6.3 x86_64
==========================================================================================
二、安装部署
==========================================================================================
1、需注意的地方
(1)、本次安装不使用系统默认的glib库,之前的yum安装只是为了先解决依赖库的问题;
(2)、LUA库的版本不能太高,为5.1.x即可;
(3)、glib库的版本也不能太高,为glib-2.32.x即可;
(4)、对于编译不成功的情况,注意查看下面的说明。
2、GLIB依赖的基础库安装
# yum -y install *glib*
3、LUA库安装
http://ftp.gnu.org/pub/gnu/ncurses/ncurses-5.9.tar.gz
# tar xvzf ncurses-5.9.tar.gz
# cd ncurses-5.9
# ./configure --prefix=/usr/local
# make && make install
ftp://ftp.gnu.org/gnu/readline/readline-6.2.tar.gz
# tar xvzf readline-6.2.tar.gz
# cd readline-6.2
# ./configure --prefix=/usr/local
# make && make install
http://www.lua.org/ftp/lua-5.1.5.tar.gz
# tar xvzf lua-5.1.5.tar.gz
# cd lua-5.1.5
# make linux install
注意:
修改当前目录下的“Makefile”中的 INSTALL_TOP= /usr/local为 INSTALL_TOP= /usr/local/lua
主要是为了避免与系统自带的lua库发生冲突的可能
在“src/Makefile”文件中加入“-lncurses”,完整内容如下:
linux:
$(MAKE) $(ALL) SYSCFLAGS="-DLUA_USE_LINUX" SYSLIBS="-Wl,-E -ldl -lncurses -lreadline"
4、GLIB库安装
ftp://sourceware.org/pub/libffi/libffi-3.0.13.tar.gz
# tar xvzf libffi-3.0.13.tar.gz
# cd libffi-3.0.13
# ./configure --prefix=/usr/local
# make && make install
http://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.14.tar.gz
# tar xvzf libiconv-1.14.tar.gz
# cd libiconv-1.14
# ./configure --prefix=/usr/local
# make && make install
http://tukaani.org/xz/xz-5.0.5.tar.gz
# tar xvzf xz-5.0.5.tar.gz
# cd xz-5.0.5
# ./configure --prefix=/usr/local
# make && make install
# /sbin/ldconfig
http://ftp.gnome.org/pub/gnome/sources/glib/2.32/glib-2.32.4.tar.xz
# xz -d glib-2.32.4.tar.xz
# tar -xvf glib-2.32.4.tar
# cd glib-2.32.4
# ./configure --prefix=/usr/local/glib-2.32.4 \
--with-libiconv=/usr/local \
LIBFFI_CFLAGS="-I/usr/local/include" \
LIBFFI_LIBS="-L/usr/local/lib -lffi"
# make && make install
注意:编译报错处理
(1)、configure阶段
# vim ./glib/gconvert.c
注释掉第26、28行的内容
注释掉从61行到67行的内容
# vim ./configure
在7880行之上添加如下内容:
found_iconv=yes
(2)、make阶段
# ln -s /usr/local/lib/libffi-3.0.13/include/ffi.h /usr/local/include
# ln -s /usr/local/lib/libffi-3.0.13/include/ffitarget.h /usr/local/include
glib库需要安装在单独的目录“/usr/local/glib-2.32.4”,也是为了避免与系统自带的glib库发生冲突的可能
5、Atlas安装
(1)、其他基础组件安装
https://github.com/downloads/libevent/libevent/libevent-2.0.21-stable.tar.gz
# tar xvzf libevent-2.0.21-stable.tar.gz
# cd libevent-2.0.21-stable
# ./configure --prefix=/usr/local
# make && make install
http://www.openssl.org/source/openssl-1.0.1h.tar.gz
# tar xvzf openssl-1.0.1h.tar.gz
# cd openssl-1.0.1h
# ./config shared --prefix=/usr/local
# make && make install
(2)、MySQL安装(无需启动)
http://wwwNaNake.org/files/v2.8/cmake-2.8.10.2.tar.gz
# tar -xvzf cmake-2.8.10.2.tar.gz
# cd cmake-2.8.10.2
# ./bootstrap --prefix=/usr/local
# gmake --jobs=`grep processor /proc/cpuinfo | wc -l`
# gmake install
http://downloads.mysql.com/archives/get/file/mysql-5.5.24.tar.gz
# tar -xvzf mysql-5.5.24.tar.gz
# cd mysql-5.5.24
# rm-f CMakeCache.txt
# cmake -DCMAKE_INSTALL_PREFIX=/usr/local/mysql \
-DMYSQL_UNIX_ADDR=/var/run/mysql/mysql.sock \
-DDEFAULT_CHARSET=utf8 \
-DDEFAULT_COLLATION=utf8_general_ci \
-DEXTRA_CHARSETS=all \
-DWITH_MYISAM_STORAGE_ENGINE=1 \
-DWITH_INNOBASE_STORAGE_ENGINE=1 \
-DWITH_READLINE=1 \
-DENABLED_LOCAL_INFILE=1 \
-DWITH_EMBEDDED_SERVER=1 \
-DMYSQL_DATADIR=/data/dbdata/data \
-DMYSQL_TCP_PORT=3306
# make --jobs=`grep processor /proc/cpuinfo | wc -l`
# make install
(3)、DB中间件安装
https://github.com/Qihoo360/Atlas/archive/2.2.1.tar.gz
# tar xvzf Atlas-2.2.1.tar.gz
# cd Atlas-2.2.1
# ./configure --prefix=/usr/local/mysql-proxy \
--with-lua=/usr/local/lua \
--with-mysql=/usr/local/mysql \
GLIB_CFLAGS="-I/usr/local/glib-2.32.4/include/glib-2.0" \
GLIB_LIBS="-L/usr/local/glib-2.32.4/lib/glib-2.0 -lglib-2.0" \
GMODULE_CFLAGS="-I/usr/local/glib-2.32.4/include" \
GMODULE_LIBS="-L/usr/local/glib-2.32.4/lib -lgmodule-2.0" \
GTHREAD_CFLAGS="-I/usr/local/glib-2.32.4/include" \
GTHREAD_LIBS="-L/usr/local/glib-2.32.4/lib -lgthread-2.0" \
LUA_CFLAGS="-I/usr/local/lua/include" \
LUA_LIBS="-L/usr/local/lua/lib -llua-5.1" \
CFLAGS="-DHAVE_LUA_H -O2" \
LDFLAGS="-L/usr/local/lib -L/usr/local/lib64 -lm -ldl -lcrypto"
# make && make install
注意:
编译报错处理
# ln -s /usr/local/glib-2.32.4/lib/glib-2.0/include/glibconfig.h /usr/local/glib-2.32.4/include/glib-2.0
# cd /usr/local
# mv mysql-proxy atlas-2.2.1 && ln -s atlas-2.2.1 mysql-proxy
6、DB中间层配置
(1)、主配置
# vim /usr/local/mysql-proxy/conf/mysql-proxy.cnf
[mysql-proxy]admin-username = sysadmin
admin-password = admin2356!@()
proxy-backend-addresses = 10.222.5.224:3306
proxy-read-only-backend-addresses = 10.240.95.107:3306,10.240.95.108:3306
pwds = health_check1:/iZxz+0GRoA=,health_check2:/iZxz+0GRoA=
daemon = true
keepalive = true
event-threads = 16
log-level = message
log-path = /usr/local/mysql-proxy/log
sql-log = ON
proxy-address = 0.0.0.0:3306
admin-address = 10.209.6.101:3307
charset = utf8
(2)、启动脚本
# vim /etc/init.d/mysql-proxy
#!/bin/sh## mysql-proxy This script starts and stops the mysql-proxy daemon## chkconfig: - 78 30# processname: mysql-proxy# description: mysql-proxy is a proxy daemon to mysql# config: /usr/local/mysql-proxy/conf/mysql-proxy.cnf# pidfile: /usr/local/mysql-proxy/log/mysql-proxy.pid#PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin DAEMON="/usr/local/mysql-proxy/bin/mysql-proxy"CONFIGFILE="/usr/local/mysql-proxy/conf/mysql-proxy.cnf"PIDFILE="/usr/local/mysql-proxy/log/mysql-proxy.pid"LOCKFILE="/var/lock/subsys/mysql-proxy"PROG=`basename $DAEMON` RETVAL=0 start() { echo -n $"Starting ${PROG}......" [ -x $DAEMON ] || exit 5 [ -f $CONFIGFILE ] || exit 6 ${DAEMON} --defaults-file=${CONFIGFILE} || echo -n "${PROG} already running" RETVAL=$? echo [[ $RETVAL -eq 0 ]] && touch $LOCKFILE return $RETVAL} stop() { echo -n $"Stopping ${PROG}......" if [[ `ps aux | grep bin/mysql-proxy | grep -v grep | wc -l` -gt 0 ]]; then kill -TERM `ps -A -oppid,pid,cmd | grep bin/mysql-proxy | grep -v grep | awk '{print $2}'` fi RETVAL=$? echo [[ $RETVAL -eq 0 ]] && rm -f $LOCKFILE $PIDFILE return $RETVAL} restart() { stop sleep 1 start} case "$1" instart) start ;; stop) stop ;; restart) restart ;; condrestart) [[ -e $LOCKFILE ]] && restart ;; *) echo "Usage: $0 {start|stop|restart|condrestart}" RETVAL=1 ;;esac exit $RETVAL
# chmod +x /etc/init.d/mysql-proxy
# chmod 0660 /usr/local/mysql-proxy/conf/mysql-proxy.cnf
# service mysql-proxy start
# ps aux | grep mysql-prox[y]
7、Atlas高可用【Keepalived】环境安装
http://rpm5.org/files/popt/popt-1.14.tar.gz
# tar xvzf popt-1.14.tar.gz
# cd popt-1.14
# ./configure --prefix=/usr/local
# make && make install
http://www.carisma.slowglass.com/~tgr/libnl/files/libnl-3.2.24.tar.gz
# tar xvzf libnl-3.2.24.tar.gz
# cd libnl-3.2.24
# ./configure --prefix=/usr/local
# make && make install
# ln -s /usr/local/include/libnl3/netlink /usr/local/include
# /sbin/ldconfig
http://www.keepalived.org/software/keepalived-1.2.10.tar.gz
# tar xvzf keepalived-1.2.10.tar.gz
# cd keepalived-1.2.10
# ./configure --prefix=/usr/local/keepalived
# make && make install
# ln -s /usr/local/keepalived/sbin/keepalived /usr/sbin
# ln -s /usr/local/keepalived/etc/sysconfig/keepalived /etc/sysconfig
# ln -s /usr/local/keepalived/etc/rc.d/init.d/keepalived /etc/init.d
# chkconfig --add keepalived
8、Atlas高可用【Keepalived】配置
# mkdir �Cp /etc/keepalived /data/scripts
(1)、主节点配置
# vim /etc/keepalived/keepalived.conf
global_defs { notification_email { lovezym5@126.com } notification_email_from lovezym5@126.com smtp_server 127.0.0.1 smtp_connect_timeout 30 router_id dbproxy1} vrrp_script chk_mysql_proxy_health { script "/data/scripts/keepalived_check_mysql_proxy.sh" interval 1 weight -2} vrrp_instance VI_1 { state MASTER interface eth1 virtual_router_id 51 priority 100 advert_int 1 smtp_alert authentication { auth_type PASS auth_pass 123456 } virtual_ipaddress { 10.209.6.115 } track_script { chk_mysql_proxy_health } notify_master "/data/scripts/notify.sh master" notify_bakcup "/data/scripts/notify.sh backup" notify_fault "/data/scripts/notify.sh fault"}
(2)、备用节点配置
# vim /etc/keepalived/keepalived.conf
global_defs { notification_email { lovezym5@126.com } notification_email_from lovezym5@126.com smtp_server 127.0.0.1 smtp_connect_timeout 30 router_id dbproxy2} vrrp_script chk_mysql_proxy_health { script "/data/scripts/keepalived_check_mysql_proxy.sh" interval 1 weight -2} vrrp_instance VI_1 { state BACKUP interface eth1 virtual_router_id 51 priority 90 advert_int 1 smtp_alert authentication { auth_type PASS auth_pass 123456 } virtual_ipaddress { 10.209.6.115 } track_script { chk_mysql_proxy_health } notify_master "/data/scripts/notify.sh master" notify_bakcup "/data/scripts/notify.sh backup" notify_fault "/data/scripts/notify.sh fault"}
(3)、VIP切换通知脚本
# vim /data/scripts/notify.sh
#!/bin/shPATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin:/usr/local/sbin KEEPALIVE_CONF="/etc/keepalived/keepalived.conf" VIP=`grep -A 1 virtual_ipaddress ${KEEPALIVE_CONF} | tail -1 | sed 's/\t//g; s/ //g'`ETH1_ADDR=`/sbin/ifconfig eth1 | awk '/inet addr:/{print $2}' | awk -F: '{print $2}'` MONITOR="/usr/local/oms/agent/alarm/BusMonitorAgent"TOKEN="ha_monitor" function notify() { TITLE="$ETH1_ADDR to be $1: $VIP floating" CONTENT="vrrp transition, $ETH1_ADDR changed to be $1" ${MONITOR} -c 2 -f ${TOKEN} -t "${TITLE}" -i "${CONTENT}"} case "$1" inmaster) notify master exit 0 ;; backup) notify backup exit 0 ;; fault) notify fault exit 0 ;; *) echo 'Usage: `basename $0` {master|backup|fault}' exit 1 ;;esac
(4)、DB中间层进程检查脚本
# vim /data/scripts/keepalived_check_mysql_proxy.sh
#!/bin/shPATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin:/usr/local/sbin if [[ `pgrep mysql-proxy | wc -l` -eq 0 ]]; then /sbin/service mysql-proxy start && sleep 5 [[ -z `pgrep mysql-proxy` ]] && /sbin/service keepalived stopfi
# chmod +x /data/scripts/*.sh
# service keepalived start
# ip addr show eth1
# ps aux | grep keepalive[d]
==========================================================================================
三、其他设置
==========================================================================================
1、Atlas服务监控
# vim /usr/local/mysql-proxy/bin/check_service.sh
#!/bin/shPATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin:/usr/local/sbin [[ $# -ne 3 ]] && echo "$0 端口号 协议类型 服务名" && exit 1 SRV_PORT=$1 ## 端口号SRV_PROT=$2 ## 协议类型SRV_NAME=$3 ## 服务名 MONITOR="/usr/local/oms/agent/alarm/BusMonitorAgent"TOKEN="ha_monitor" TITLE="${SRV_NAME}服务异常监控"CONTENT="${SRV_NAME}服务发生异常,已自动拉起!" ## 是否已正确扫描SCAN_FLAG=0 function RESTART_SRV_AND_ALERT() { local CUR_SRV_NAME [[ $# -ne 1 ]] && exit 1 CUR_SRV_NAME=$1 TMP_SRV_NAME=`echo ${CUR_SRV_NAME} | tr '[A-Z]' '[a-z]'` [[ ! -f /etc/init.d/${TMP_SRV_NAME} ]] && TMP_SRV_NAME="${TMP_SRV_NAME}d" killall -9 ${TMP_SRV_NAME} if [[ -z `ps aux | grep ${TMP_SRV_NAME} | grep -v grep` ]]; then /sbin/service ${TMP_SRV_NAME} start >/dev/null 2>&1 fi ${MONITOR} -c 2 -f ${TOKEN} -t "${TITLE}" -i "${CONTENT}" rm -f `pwd`/connect_error.log} ETH1_ADDR=`/sbin/ifconfig eth1 | awk -F ':' '/inet addr/{print $2}' | sed 's/[a-zA-Z ]//g'`TMP_SRV_PROT=`echo ${SRV_PROT} | tr '[A-Z]' '[a-z]'`if [[ "${TMP_SRV_PROT}" == "tcp" ]]; then PROT_OPT="S"elif [[ "${TMP_SRV_PROT}" == "udp" ]]; then PROT_OPT="U"else echo "未知的协议类型!" && exit 1fi ## 最多扫描3次,成功一次即可,以避免网络抖动而导致误判for ((i=0; i<3; i++)); do RETVAL=`/usr/bin/nmap -n -s${PROT_OPT} -p ${SRV_PORT} ${ETH1_ADDR} | grep open` [[ -n "${RETVAL}" ]] && SCAN_FLAG=1;break || sleep 10done ## 1、针对Atlas服务端口不通的情况,也就是服务彻底挂掉[[ ${SCAN_FLAG} -ne 1 ]] && RESTART_SRV_AND_ALERT ${SRV_NAME} ## 2、检查Atlas服务是否正常工作,也就是服务端口正常,但访问异常的情况【高权限DB用户】mysqladmin -h${ETH1_ADDR} -uhealth_check1 -p123456 --connect-timeout=15 --shutdown-timeout=15 ping[[ $? -ne 0 ]] && RESTART_SRV_AND_ALERT ${SRV_NAME} ## 3、检查Atlas服务是否正常工作,也就是服务端口正常,高权限DB用户访问也正常,但低权限## DB用户访问异常的情况【低权限DB用户】mysqladmin -h${ETH1_ADDR} -uhealth_check2 -p123456 --connect-timeout=15 --shutdown-timeout=15 ping[[ $? -ne 0 ]] && RESTART_SRV_AND_ALERT ${SRV_NAME}
2、Atlas访问日志切割
# vim /data/scripts/cut_and_clear_access_log.sh
#!/bin/sh# 切割Atlas的访问日志,同时清理15天之前的日志#PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin:/usr/local/sbin ## mysql-proxy日志路径LOGPATH="/usr/local/mysql-proxy/log" [[ `/sbin/ip addr show eth1 | grep inet | wc -l` -eq 2 ]] || exit 1 cd ${LOGPATH} ## 日志切割HISTORY_LOG_PATH=`date -d '-1 hour' +"%Y-%m-%d/sql_mysql-proxy_%H.log"`[[ -d `dirname ${HISTORY_LOG_PATH}` ]] || mkdir -p `dirname ${HISTORY_LOG_PATH}`cp -a sql_mysql-proxy.log ${HISTORY_LOG_PATH} echo > sql_mysql-proxy.log ## 日志清理HISTORY_LOG_PATH=`date -d '15 days ago' +'%Y-%m-%d'`[[ -d ${HISTORY_LOG_PATH} ]] && rm -rf ${HISTORY_LOG_PATH}
3、crontab内容添加
# touch /var/lock/check_service.lock
# echo 'touch /var/lock/check_service.lock' >> /etc/rc.d/rc.local
# crontab -uroot -e
* * * * * (flock --timeout=0 /var/lock/check_service.lock /usr/local/mysql-proxy/bin/check_service.sh 3306 tcp mysql-proxy >/dev/null 2>&1)00 * * * * /data/scripts/cut_and_clear_access_log.sh >/dev/null 2>&1
4、平滑设置功能
# mysql -h10.209.6.101 -P3307 -usysadmin -p'admin2356!@()'
本文出自 “人生理想在于坚持不懈” 博客,请务必保留此出处http://sofar.blog.51cto.com/353572/1601552