This article describes a high-availability (HA) deployment based on CentOS 7 and OpenStack Juno.
Installing and configuring the HA components
Preparation
First make sure the clocks on the two machines are synchronized and that SSH access between them is set up.
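A minimal sketch of these prerequisites, assuming chrony for time synchronization and controllerv as the peer hostname:
# yum -y install chrony
# systemctl enable chronyd && systemctl start chronyd
# chronyc sources                    # both nodes should report a synchronized time source
# ssh-keygen -t rsa                  # accept the defaults
# ssh-copy-id root@controllerv       # passwordless SSH to the peer node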
Installing the components
Add a yum repository that provides crmsh, resource-agents and related packages:
[haclustering]
name=HA Clustering
baseurl=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/
enabled=1
gpgcheck=0
Install the required packages:
# yum install pacemaker corosync resource-agents crmsh pcs
Installing DRBD
Method 1: install from the yum repository:
# rpm --import http://elrepo.org/RPM-GPG-KEY-elrepo.org
# rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
# yum -y install drbd84-utils kmod-drbd84
Method 2: build from source:
# yum install docbook-style-xsl      # needed when compiling drbd
# mkdir -p /tmp/drbdinst
# /usr/bin/wget --directory-prefix=/tmp/drbdinst/ http://oss.linbit.com/drbd/8.4/drbd-8.4.5.tar.gz
# cd /tmp/drbdinst
# tar -zxfp drbd-8.4.5.tar.gz
# cd drbd-8.4.5
# /usr/bin/yum -y install flex gcc make
# make
# make install
# /usr/bin/yum -y install libxslt
# /usr/bin/wget --directory-prefix=/tmp/drbdinst/ http://oss.linbit.com/drbd/drbd-utils-8.9.1.tar.gz
# cd /tmp/drbdinst
# tar -zxfp drbd-utils-8.9.1.tar.gz
# cd drbd-utils-8.9.1
# ./configure --prefix=/usr --localstatedir=/var --sysconfdir=/etc
# make
# make install
# cp /lib/udev/65-drbd.rules /lib/udev/rules.d/
# /bin/rm -rf /tmp/drbdinst
Configuration
Corosync configuration
Copy the sample file /etc/corosync/corosync.conf.example.udpu to /etc/corosync/corosync.conf and edit it to match your environment:
compatibility: whitetank

service {
    ver: 1
    name: pacemaker
    use_logd: yes
}

logging {
    fileline: off
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_stderr: no
    debug: off
    timestamp: on
    to_syslog: yes
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}

totem {
    version: 2
    token: 3000
    secauth: on
    rrp_mode: active
    interface {
        ringnumber: 0
        bindnetaddr: 10.0.0.0
        mcastaddr: 226.94.1.1
        mcastport: 5405
    }
}

quorum {
    provider: corosync_votequorum
    expected_votes: 2
}
If secauth is set to on, an encryption key must be generated for cluster communication:
# corosync-keygen
Once generated, copy it to the other node:
# scp -p /etc/corosync/authkey controllerv:/etc/corosync/
Start the services on all nodes:
# systemctl start corosync pacemaker
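If the services should also come back after a reboot, enabling them is a one-liner:
# systemctl enable corosync pacemaker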
Check the configuration, the membership and the quorum APIs:
# corosync-cfgtool -s
# corosync-cmapctl | grep members
# corosync-quorumtool -l             # or: pcs status corosync
Pacemaker configuration
First verify that Pacemaker is running:
# ps axf
49091 ?        Ss     0:00 /usr/sbin/pacemakerd -f
49092 ?        Ss     0:00  \_ /usr/libexec/pacemaker/cib
49093 ?        Ss     0:00  \_ /usr/libexec/pacemaker/stonithd
49094 ?        Ss     0:00  \_ /usr/libexec/pacemaker/lrmd
49095 ?        Ss     0:00  \_ /usr/libexec/pacemaker/attrd
49096 ?        Ss     0:00  \_ /usr/libexec/pacemaker/pengine
49097 ?        Ss     0:00  \_ /usr/libexec/pacemaker/crmd
Check the cluster status:
# crm status
Last updated: Tue Dec 2 23:04:29 2014
Last change: Tue Dec 2 22:54:01 2014 via crmd on node1
Stack: corosync
Current DC: NONE
2 Nodes configured
0 Resources configured

Online: [ controller controllerv ]
Show the configuration:
# crm configure show
node 167772171: controller
node 167772172: controllerv
property cib-bootstrap-options: \
        dc-version=1.1.10-32.el7_0.1-368c726 \
        cluster-infrastructure=corosync
Use cibadmin --query --local or pcs cluster cib to view the CIB in XML form.
The following command validates the configuration and reveals problems:
# crm_verify -L -V
   error: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
   error: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
   error: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
Since this test setup has no STONITH device, disable STONITH (STONITH will be covered in detail in a separate post):
# crm configure property stonith-enabled=false
With only two nodes, ignore quorum:
# crm configure property no-quorum-policy=ignore
After the changes, show the configuration and verify again:
# crm configure show
node 167772171: controller
node 167772172: controllerv
property cib-bootstrap-options: \
        dc-version=1.1.10-32.el7_0.1-368c726 \
        cluster-infrastructure=corosync \
        stonith-enabled=false \
        no-quorum-policy=ignore
# crm_verify -L
#
Configuring DRBD
MARIADB
MySQL/DRBD/Pacemaker/Corosync Stack
Edit /etc/drbd.conf:
# You can find an example in /usr/share/doc/drbd.../drbd.conf.example
include "drbd.d/global_common.conf";
include "drbd.d/*.res";
The shared settings go into global_common.conf; RabbitMQ will be configured against it later as well.
global {
    usage-count no;
}
common {
    protocol C;                  # C synchronous, A asynchronous, B semi-synchronous
    handlers {
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
    }
    startup {
        wfc-timeout 30;          # wait for the DRBD resource connection
        degr-wfc-timeout 30;     # wait when the node is part of a degraded cluster
    }
    disk {
        on-io-error detach;
        fencing resource-only;
    }
    net {
        cram-hmac-alg "sha1";    # message digest used for peer authentication
        shared-secret "mydrbd";
    }
    syncer {
        rate 100M;               # cap the resynchronization bandwidth
    }
}
Create /etc/drbd.d/mariadb.res:
resource mariadb {
    device    /dev/drbd0;
    disk      /dev/mapper/data_vg-mariadb;
    meta-disk internal;          # adjust to your actual situation; see the note below
    on controller {
        address 10.0.0.11:7789;
    }
    on controllerv {
        address 10.0.0.12:7789;
    }
}
Note: for some caveats about DRBD metadata, see my other blog post.
Because my data already exists, back it up first to guard against data loss.
# dd if=/dev/data_vg/mariadb of=/root/back bs=1M count=150
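A sketch of the corresponding restore, writing the same 150 MB back onto the LV should it ever be needed:
# dd if=/root/back of=/dev/data_vg/mariadb bs=1M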
On the primary node, unmount the corresponding filesystem first and then create the metadata.
# drbdadm create-md mariadb
md_offset 1077932032
al_offset 1077899264
bm_offset 1077862400

Found ext2 filesystem        # a bit odd: the filesystem is ext4, yet ext2 is reported
  1048576 kB data area apparently used
  1052600 kB left usable by current configuration

Even though it looks like this would place the new meta data into
unused space, you still need to confirm, as this is only a guess.

Do you want to proceed?
[need to type 'yes' to confirm] yes

initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
# drbdadm up mariadb
# cat /proc/drbd
version: 8.4.5 (api:1/proto:86-101)
GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by mockbuild@, 2014-08-17 22:54:26
 0: cs:WFConnection ro:Secondary/Unknown ds:Inconsistent/DUnknown C r----s
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:1052600
On node 2 (an LV and filesystem identical to the primary's were created in advance):
# drbdadm create-md mariadb
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
# drbdadm up mariadb
# cat /proc/drbd
version: 8.4.5 (api:1/proto:86-101)
GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by mockbuild@, 2014-08-17 22:54:26
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:1052600
On the primary node, run:
# drbdadm -- --force primary mariadb
# cat /proc/drbd
version: 8.4.5 (api:1/proto:86-101)
GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by mockbuild@, 2014-08-17 22:54:26
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:0 nr:0 dw:0 dr:728 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
Make sure the correct mount point exists on both nodes, then mount the DRBD device on the primary and confirm that the earlier data is still there.
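Creating the mount point, if it does not exist yet, is a one-liner (run it on both nodes; the path matches the mount command below):
# mkdir -p /data/mariadb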
# mount /dev/drbd0 /data/mariadb/
Once confirmed, unmount the DRBD filesystem (so the DRBD primary releases the resource):
# umount /dev/drbd0
# drbdadm secondary mariadb
RABBITMQ
Create /etc/drbd.d/rabbitmq.res:
resource rabbitmq {
    device    /dev/drbd1;
    disk      /dev/data_vg/rabbitmq;
    meta-disk internal;
    on controller {
        address 10.0.0.11:7790;
    }
    on controllerv {
        address 10.0.0.12:7790;
    }
}
Create an LV of the same size on both nodes, as sketched below, and set up the DRBD metadata.
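A minimal sketch for the LV, assuming the data_vg volume group from the .res file and an example size of 1 GB (run on both nodes):
# lvcreate -L 1G -n rabbitmq data_vg     # size is an example; both nodes must match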
# drbdadm create-md rabbitmq
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
Attach the DRBD resource on both nodes:
# drbdadm up rabbitmq
Run the following on the primary node; /proc/drbd now shows two resources.
# drbdadm -- --force primary rabbitmq
# cat /proc/drbd
version: 8.4.5 (api:1/proto:86-101)
GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by mockbuild@, 2014-08-17 22:54:26
 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
    ns:4112 nr:8 dw:24 dr:6722 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:557020 nr:0 dw:0 dr:559832 al:0 bm:0 lo:0 pe:1 ua:3 ap:0 ep:1 wo:f oos:492476
        [=========>..........] sync'ed: 53.2% (492476/1048508)K
        finish: 0:00:12 speed: 39,716 (39,716) K/sec
Create the filesystem; XFS is recommended:
# mkfs -t xfs /dev/drbd1
Then, as with mariadb, switch back to the secondary role:
# drbdadm secondary rabbitmq
Preparation
Make sure the RabbitMQ .erlang.cookie file is identical on all nodes.
# scp -p /var/lib/rabbitmq/.erlang.cookie controllerv:/var/lib/rabbitmq/
Also copy this file onto the DRBD-backed filesystem:
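The DRBD device has to be mounted somewhere first; a minimal sketch, assuming the rabbitmq resource can be promoted on this node and /mnt is free:
# drbdadm primary rabbitmq
# mount /dev/drbd1 /mnt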
# cp -a /var/lib/rabbitmq/.erlang.cookie /mnt
# umount /mnt
Cluster resource configuration
MARIADB
Define a resource:
# crm configure
crm(live)configure# primitive p_drbd_mariadb ocf:linbit:drbd params drbd_resource="mariadb" op monitor interval=15s
A few notes on the ocf:linbit:drbd notation used above:
# pcs resource standards             # list the available resource standards
ocf
lsb
service
systemd
stonith
# pcs resource providers             # list the available OCF resource providers
heartbeat
linbit
pacemaker
rabbitmq
# pcs resource agents ocf:linbit     # list the available resource agents
drbd
Define a master/slave resource ms_drbd_mariadb:
crm(live)configure# ms ms_drbd_mariadb p_drbd_mariadb meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
Create the filesystem and VIP resources, then the service resource with its command-line options, and combine them into a resource group (systemd is used because this is MariaDB on CentOS 7).
crm(live)configure# primitive p_fs_mariadb ocf:heartbeat:Filesystem params device="/dev/drbd0" directory="/data/mariadb/dbdata" fstype="ext4"
crm(live)configure# primitive p_ip_mariadb ocf:heartbeat:IPaddr2 params ip="10.0.0.10" cidr_netmask="24" nic="eth0"
crm(live)configure# primitive p_mariadb systemd:mariadb op start timeout=120s op stop timeout=120s op monitor interval=20s timeout=30s
crm(live)configure# group g_mariadb p_fs_mariadb p_ip_mariadb p_mariadb
The MariaDB service group has to run on the DRBD master node; a colocation constraint and an ordering constraint enforce this:
crm(live)configure# colocation c_mariadb_on_drbd inf: g_mariadb ms_drbd_mariadb:Master
crm(live)configure# order o_drbd_before_mariadb inf: ms_drbd_mariadb:promote g_mariadb:start
crm(live)configure# commit
WARNING: p_fs_mariadb: default timeout 20s for start is smaller than the advised 60
WARNING: p_fs_mariadb: default timeout 20s for stop is smaller than the advised 60
WARNING: p_drbd_mariadb: default timeout 20s for start is smaller than the advised 240
WARNING: p_drbd_mariadb: default timeout 20s for stop is smaller than the advised 100
WARNING: p_drbd_mariadb: action monitor not advertised in meta-data, it may not be supported by the RA
Verify with the following command:
# crm_mon -1
Last updated: Sat Dec 6 19:18:36 2014
Last change: Sat Dec 6 19:17:38 2014 via cibadmin on controller
Stack: corosync
Current DC: controller (167772171) - partition with quorum
Version: 1.1.10-32.el7_0.1-368c726
2 Nodes configured
5 Resources configured

Online: [ controller controllerv ]

 Master/Slave Set: ms_drbd_mariadb [p_drbd_mariadb]
     Masters: [ controller ]
     Slaves: [ controllerv ]
 Resource Group: g_mariadb
     p_fs_mariadb   (ocf::heartbeat:Filesystem):    Started controller
     p_ip_mariadb   (ocf::heartbeat:IPaddr2):       Started controller
     p_mariadb      (systemd:mariadb):              Started controller
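As an extra check, the database should answer through the VIP; a sketch, assuming the MariaDB root credentials are at hand:
# mysql -h 10.0.0.10 -u root -p -e "SELECT VERSION();"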
RABBITMQ
Configure the RabbitMQ resources:
# crm configure
crm(live)configure# primitive p_drbd_rabbitmq ocf:linbit:drbd params drbd_resource="rabbitmq" op start timeout="90s" op stop timeout="180s" \
    op promote timeout="180s" op demote timeout="180s" op monitor interval="30s" role="Slave" op monitor interval="29s" role="Master"
crm(live)configure# ms ms_drbd_rabbitmq p_drbd_rabbitmq meta notify="true" master-max="1" clone-max="2"
crm(live)configure# primitive p_ip_rabbitmq ocf:heartbeat:IPaddr2 params ip="10.0.0.9" cidr_netmask="24" op monitor interval="10s"
crm(live)configure# primitive p_fs_rabbitmq ocf:heartbeat:Filesystem params device="/dev/drbd1" directory="/var/lib/rabbitmq" fstype="xfs" \
    options="relatime" op start timeout="60s" op stop timeout="180s" op monitor interval="60s" timeout="60s"
crm(live)configure# primitive p_rabbitmq ocf:rabbitmq:rabbitmq-server params nodename="rabbit@localhost" mnesia_base="/var/lib/rabbitmq" \
    op monitor interval="20s" timeout="10s"
crm(live)configure# group g_rabbitmq p_ip_rabbitmq p_fs_rabbitmq p_rabbitmq
crm(live)configure# colocation c_rabbitmq_on_drbd inf: g_rabbitmq ms_drbd_rabbitmq:Master
crm(live)configure# order o_drbd_before_rabbitmq inf: ms_drbd_rabbitmq:promote g_rabbitmq:start
crm(live)configure# verify
WARNING: p_drbd_mariadb: default timeout 20s for start is smaller than the advised 240
WARNING: p_drbd_mariadb: default timeout 20s for stop is smaller than the advised 100
WARNING: p_drbd_mariadb: action monitor not advertised in meta-data, it may not be supported by the RA
WARNING: p_fs_mariadb: default timeout 20s for start is smaller than the advised 60
WARNING: p_fs_mariadb: default timeout 20s for stop is smaller than the advised 60
WARNING: p_drbd_rabbitmq: specified timeout 90s for start is smaller than the advised 240
WARNING: p_rabbitmq: default timeout 20s for start is smaller than the advised 600
WARNING: p_rabbitmq: default timeout 20s for stop is smaller than the advised 120
WARNING: p_rabbitmq: specified timeout 10s for monitor is smaller than the advised 20
crm(live)configure# commit
WARNING: p_drbd_rabbitmq: specified timeout 90s for start is smaller than the advised 240
WARNING: p_rabbitmq: default timeout 20s for start is smaller than the advised 600
WARNING: p_rabbitmq: default timeout 20s for stop is smaller than the advised 120
WARNING: p_rabbitmq: specified timeout 10s for monitor is smaller than the advised 20
crm(live)configure#
Check the status with crm_mon -1.
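A quick sanity check on the node that currently runs the group (a sketch): the broker should be listening on the RabbitMQ VIP.
# crm_mon -1
# ss -tln | grep 5672                # expect the AMQP port in LISTEN state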
Configuring the OpenStack APIs
Only keystone, glance, neutron and nova are covered here; the other components are configured in a similar way.
Configure the VIP
# crm configure primitive p_ip_osapi ocf:heartbeat:IPaddr2 params ip="10.0.0.8" cidr_netmask="24" op monitor interval="30s"
Keystone setup
# cd /usr/lib/ocf/resource.d/
# mkdir openstack
# cd openstack
# wget https://raw.github.com/madkiss/openstack-resource-agents/master/ocf/keystone
# ls -l
total 12
-rw-r--r--. 1 root root 10993 Dec  6 23:10 keystone
[root@controller openstack]# chmod a+rx *
The file must be copied to the standby node as well.
Add the resource:
# crm configure primitive p_keystone ocf:openstack:keystone params config="/etc/keystone/keystone.conf" os_password="admin" \
os_username="admin" os_tenant_name="admin" os_auth_url="http://10.0.0.8:5000/v2.0/" op monitor interval="30s" timeout="30s"
Make sure all Keystone data is stored in the database:
# openstack-config --set /etc/keystone/keystone.conf catalog driver keystone.catalog.backends.sql.Catalog
# openstack-config --set /etc/keystone/keystone.conf identity driver keystone.identity.backends.sql.Identity
After these changes, update the keystone-related sections in the other services' configuration files, including OPENSTACK_HOST in the dashboard's local_settings file.
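A sketch of what that looks like for nova and the dashboard, assuming the Juno-style keystone_authtoken options and the CentOS 7 local_settings path:
# openstack-config --set /etc/nova/nova.conf keystone_authtoken auth_uri http://10.0.0.8:5000/v2.0
# openstack-config --set /etc/nova/nova.conf keystone_authtoken identity_uri http://10.0.0.8:35357
# sed -i 's/^OPENSTACK_HOST = .*/OPENSTACK_HOST = "10.0.0.8"/' /etc/openstack-dashboard/local_settings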
Glance API setup
# wget https://raw.github.com/madkiss/openstack-resource-agents/master/ocf/glance-api
# wget https://raw.github.com/madkiss/openstack-resource-agents/master/ocf/glance-registry
# chmod a+rx glance-api
Edit /etc/glance/glance-api.conf:
connection=mysql://glance:glance@10.0.0.10/glance
bind_host = 10.0.0.8
registry_host = 10.0.0.8
notifier_strategy = rabbit
rabbit_host = 10.0.0.9
Edit /etc/glance/glance-registry.conf and set bind_host = 10.0.0.8.
Add the resources:
# crm configure primitive p_glance-api ocf:openstack:glance-api params config="/etc/glance/glance-api.conf" os_password="admin" \
    os_username="admin" os_tenant_name="admin" os_auth_url="http://10.0.0.8:5000/v2.0/" op monitor interval="30s" timeout="30s"
# crm configure primitive p_glance-registry ocf:openstack:glance-registry op monitor interval="30s" timeout="30s"
Nova setup
As above, download the agent scripts with wget or git from the corresponding URLs; the nova-conductor agent comes from https://github.com/tcatyb/vagrant_openstack_mysql_drbd_pacemaker and needs some manual adjustments before it can be used.
Edit /etc/nova/nova.conf:
rabbit_host = 10.0.0.9
my_ip = 10.0.0.8
vncserver_listen = 10.0.0.8
vncserver_proxyclient_address = 10.0.0.8
connection=mysql://nova:nova@10.0.0.10/nova
Add the resources:
# crm configure primitive p_nova-api ocf:openstack:nova-api params config="/etc/nova/nova.conf" os_password="admin" os_username="admin" \
    os_tenant_name="admin" keystone_get_token_url="http://10.0.0.8:5000/v2.0/tokens" op monitor interval="30s" timeout="30s"
# crm configure primitive p_nova-cert ocf:openstack:nova-cert op monitor interval="30s" timeout="30s"
# crm configure primitive p_nova-conductor ocf:openstack:nova-conductor op monitor interval="30s" timeout="30s"
# crm configure primitive p_nova-consoleauth ocf:openstack:nova-consoleauth op monitor interval="30s" timeout="30s"
# crm configure primitive p_nova-novncproxy ocf:openstack:nova-novnc op monitor interval="30s" timeout="30s"
# crm configure primitive p_nova-scheduler ocf:openstack:nova-scheduler op monitor interval="30s" timeout="30s"
Neutron-server setup
# wget https://raw.github.com/madkiss/openstack-resource-agents/master/ocf/neutron-server
# chmod a+rx neutron-server
Edit /etc/neutron/neutron.conf:
bind_host = 10.0.0.8
notifier_strategy = rabbit
rabbit_host = 10.0.0.9

[database]
connection = mysql://neutron:neutron@10.0.0.10/neutron
Edit nova.conf and set url=http://10.0.0.8:9696 in the [neutron] section. Also adjust the related settings in the neutron-server agent script downloaded above.
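With openstack-config the nova.conf change is a one-liner (a sketch; the section and key follow the Juno nova.conf layout):
# openstack-config --set /etc/nova/nova.conf neutron url http://10.0.0.8:9696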
Add the resource:
# crm configure primitive p_neutron-server ocf:openstack:neutron-server params os_password="admin" os_username="admin" os_tenant_name="admin" \
keystone_get_token_url="http://10.0.0.8:5000/v2.0/tokens" op monitor interval="30s" timeout="30s"
Configure the resource group
There is an implied start order here; it may be cleaner to put the nova services into a group of their own.
# crm configure group g_services_api p_ip_osapi p_keystone p_glance-registry p_glance-api p_neutron-server p_nova-api p_nova-cert p_nova-consoleauth p_nova-scheduler p_nova-conductor p_nova-novncproxy
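If the API group should also wait for the database and message queue, optional ordering constraints can express that (a sketch using the same crm syntax as above, not part of the original configuration):
# crm configure order o_mariadb_before_api inf: g_mariadb g_services_api
# crm configure order o_rabbitmq_before_api inf: g_rabbitmq g_services_api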
Neutron Agent
# cd /usr/lib/ocf/resource.d/openstack
# wget https://raw.github.com/madkiss/openstack-resource-agents/master/ocf/neutron-agent-l3
# wget https://raw.github.com/madkiss/openstack-resource-agents/master/ocf/neutron-agent-dhcp
# wget https://raw.github.com/madkiss/openstack-resource-agents/master/ocf/neutron-metadata-agent
# chmod a+rx neutron-*
Add the resources and group them:
# crm configure
crm(live)configure# primitive p_neutron-l3-agent ocf:openstack:neutron-agent-l3 params config="/etc/neutron/neutron.conf" \
> plugin_config="/etc/neutron/l3_agent.ini" op monitor interval="30s" timeout="30s"
crm(live)configure# primitive p_neutron-dhcp-agent ocf:openstack:neutron-agent-dhcp params config="/etc/neutron/neutron.conf" \
> plugin_config="/etc/neutron/dhcp_agent.ini" op monitor interval="30s" timeout="30s"
crm(live)configure# primitive p_neutron-metadata-agent ocf:openstack:neutron-metadata-agent params config="/etc/neutron/neutron.conf" \
> agent_config="/etc/neutron/metadata_agent.ini" op monitor interval="30s" timeout="30s"
crm(live)configure# group g_services_neutron p_neutron-l3-agent p_neutron-dhcp-agent p_neutron-metadata-agent
crm(live)configure# commit
Openvswitch Agent
Start the neutron-openvswitch-agent service on each of the two controller nodes; before starting, confirm that local_ip in plugin.ini is set to the correct address.
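A sketch of that check and start sequence, assuming the RDO service name and the /etc/neutron/plugin.ini path mentioned above:
# grep ^local_ip /etc/neutron/plugin.ini
# systemctl enable neutron-openvswitch-agent
# systemctl start neutron-openvswitch-agent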
Verification
First, boot an instance:
# nova boot --flavor m1.tiny --image cirros-0.3.3-x86_64 --nic net-id=afe44e34-bb35-4268-bacd-01670bef984c --security-group default --key-name demo-key demo-instance1
Once the instance has been created successfully, run the usual OpenStack installation verification steps.
Now let's test failover.
On the primary node, run:
# ifdown eth0
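A sketch of what to check afterwards on the surviving node:
# crm_mon -1                         # all groups should now be running here
# ip addr | grep -E '10\.0\.0\.(8|9|10)'   # the VIPs should have moved to this node
# nova list                          # the API should still answer through the VIP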