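I. DRBD Configuration
Distributed Replicated Block Device (DRBD) is a software-based, shared-nothing storage replication solution that mirrors the contents of block devices between servers.
It can be thought of as network RAID 1: if one of the two servers loses power or crashes, the other still holds a complete copy of the data. Real hot failover can be handled with Heartbeat, with no manual intervention.
II. Environment
OS version: CentOS 6.6 x64 (kernel 2.6.32-504.16.2.el6.x86_64)
DRBD version: DRBD-8.4.3
node1 (primary node) IP: 192.168.0.191, hostname: drbd1.corp.com
node2 (secondary node) IP: 192.168.0.192, hostname: drbd2.corp.com
(node1) marks a step performed on the primary node only
(node2) marks a step performed on the secondary node only
(node1,node2) marks a step performed on both nodes
III. Pre-installation preparation: (node1,node2)
1. Disable iptables and SELinux to avoid errors during installation.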
# service iptables stop
# chkconfig iptables off
# setenforce 0
# vi /etc/selinux/config
---------------
SELINUX=disabled
---------------
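The SELinux change can also be made without opening an editor. The following is a minimal sketch, assuming GNU sed (as shipped with CentOS); the function name set_selinux_mode is made up for this example.

```shell
#!/bin/bash
# Set SELINUX=<mode> in an SELinux config file.
# $1: mode (disabled|permissive|enforcing)
# $2: config file path (defaults to /etc/selinux/config)
set_selinux_mode() {
    local mode="$1" file="${2:-/etc/selinux/config}"
    # Only the line starting with "SELINUX=" is rewritten;
    # "SELINUXTYPE=" does not match this pattern and is left alone.
    sed -i "s/^SELINUX=.*/SELINUX=${mode}/" "$file"
}

# Usage on each node (as root):
#   set_selinux_mode disabled
```

The change in the config file takes effect at the next boot; setenforce 0 covers the running system until then.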
2. Configure the hosts file
# vi /etc/hosts
192.168.0.191 drbd1.corp.com
192.168.0.192 drbd2.corp.com
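If the hosts entries are added by a script rather than by hand, it helps to make the step safe to run twice. A minimal sketch, with a hypothetical helper name add_host_entry:

```shell
#!/bin/bash
# Append "IP hostname" to a hosts file only if the hostname is not listed yet.
# $1: IP address, $2: hostname, $3: hosts file (defaults to /etc/hosts)
add_host_entry() {
    local ip="$1" name="$2" file="${3:-/etc/hosts}"
    # -F treats the name as a fixed string, -w requires a whole-word match
    if ! grep -qwF -- "$name" "$file"; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Usage on both nodes (as root):
#   add_host_entry 192.168.0.191 drbd1.corp.com
#   add_host_entry 192.168.0.192 drbd2.corp.com
```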
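3. On each of the two virtual machines, add a 10 GB disk and create a partition on it to serve as the DRBD backing device (sdb1, 10 GB, on both nodes), then create the /store directory on the local system. Do not mount anything yet.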
# fdisk /dev/sdb
----------------
n-p-1-1-"+10G"-w
----------------
# mkdir /store
4. Time synchronization:
# ntpdate -u asia.pool.ntp.org
IV. Installing and configuring DRBD:
1. Install the dependencies: (node1,node2)
# yum install gcc gcc-c++ make glibc flex kernel-devel kernel-headers
2. Install DRBD: (node1,node2)
# wget http://oss.linbit.com/drbd/8.4/drbd-8.4.3.tar.gz
# tar zxvf drbd-8.4.3.tar.gz
# cd drbd-8.4.3
# ./configure --prefix=/usr/local/drbd --with-km
# make KDIR=/usr/src/kernels/2.6.32-504.16.2.el6.x86_64/
# make install
# mkdir -p /usr/local/drbd/var/run/drbd
# cp /usr/local/drbd/etc/rc.d/init.d/drbd /etc/rc.d/init.d
# chkconfig --add drbd
# chkconfig drbd on
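The KDIR path in the make command above is hard-coded for one kernel release. On CentOS the kernel-devel package installs under /usr/src/kernels/ in a directory named after the release string that uname -r prints, so the path can be derived instead. A small sketch under that assumption (the helper name kdir_for is hypothetical):

```shell
#!/bin/bash
# Print the kernel source directory that matches a kernel release.
# $1: kernel release (defaults to the running kernel, as printed by uname -r)
kdir_for() {
    local release="${1:-$(uname -r)}"
    printf '/usr/src/kernels/%s\n' "$release"
}

# Usage:
#   KDIR=$(kdir_for)
#   [ -d "$KDIR" ] || echo "kernel-devel for $(uname -r) is not installed" >&2
#   make KDIR="$KDIR"
```

If the directory is missing, yum has most likely installed kernel-devel for a newer kernel than the one that is running.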
3. Load the DRBD module: (node1,node2)
# modprobe drbd
Check whether the DRBD module has been loaded into the kernel:
# lsmod |grep drbd
drbd 310172 4
libcrc32c 1246 1 drbd
4. Configuration parameters: (node1,node2)
# vi /usr/local/drbd/etc/drbd.conf
Clear the file contents and add the following configuration:
resource r0 {
    protocol C;
    startup { wfc-timeout 0; degr-wfc-timeout 120; }
    disk { on-io-error detach; }
    net {
        timeout 60;
        connect-int 10;
        ping-int 10;
        max-buffers 2048;
        max-epoch-size 2048;
    }
    syncer { rate 200M; }
    on drbd1.corp.com {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.0.191:7788;
        meta-disk internal;
    }
    on drbd2.corp.com {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.0.192:7788;
        meta-disk internal;
    }
}
Note: change the hostnames, IP addresses and disk in the configuration above to match your own setup.
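Since only the hostnames, IP addresses and backing disk differ between installations, the host-specific part of the resource can be generated from variables. This is a sketch only: gen_drbd_resource is a made-up helper, and it leaves out the startup, disk, net and syncer sections shown above for brevity.

```shell
#!/bin/bash
# Print a two-node DRBD resource definition.
# Arguments: resource name, host1, ip1, host2, ip2, backing disk
gen_drbd_resource() {
    local res="$1" h1="$2" ip1="$3" h2="$4" ip2="$5" disk="$6"
    cat <<EOF
resource ${res} {
    protocol C;
    on ${h1} {
        device /dev/drbd0;
        disk ${disk};
        address ${ip1}:7788;
        meta-disk internal;
    }
    on ${h2} {
        device /dev/drbd0;
        disk ${disk};
        address ${ip2}:7788;
        meta-disk internal;
    }
}
EOF
}

# Usage (prints the values used in this article):
#   gen_drbd_resource r0 drbd1.corp.com 192.168.0.191 \
#       drbd2.corp.com 192.168.0.192 /dev/sdb1
```

The names after "on" have to match what uname -n prints on each node, otherwise drbdadm ignores the resource on that host.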
5. Create the DRBD device and the metadata for the r0 resource: (node1,node2)
# mknod /dev/drbd0 b 147 0
# drbdadm create-md r0
Wait a moment; "success" at the end means the DRBD metadata block was created:
Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.
--== Creating metadata ==--
As with nodes, we count the total number of devices mirrored by DRBD
at http://usage.drbd.org.
The counter works anonymously. It creates a random number to identify
the device and sends that random number, along with the kernel and
DRBD version, to usage.drbd.org.
http://usage.drbd.org/cgi-bin/insert_usage.pl?
nu=716310175600466686&ru=15741444353112217792&rs=1085704704
* If you wish to opt out entirely, simply enter 'no'.
* To continue, just press [RETURN]
success
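Run the command again:
# drbdadm create-md r0
Metadata already exists now, so type yes at the prompt to overwrite it. This only recreates the metadata; the r0 resource itself is brought up when the DRBD service starts in the next step.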
[need to type 'yes' to confirm] yes
Writing meta data...
initializing activity log
NOT initializing bitmap
New drbd meta data block successfully created.
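6. Start the DRBD service: (node1,node2)
# service drbd start
Note: the service must be started on both the primary and the secondary node to take effect. With wfc-timeout 0, the node started first waits until its peer connects.
7. Check the status: (node1,node2)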
# service drbd status
drbd driver loaded OK; device status:
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd1.corp.com, 2015-05-12 21:05:41
m:res cs ro ds p mounted fstype
0:r0 Connected Secondary/Secondary Inconsistent/Inconsistent C
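Here ro: Secondary/Secondary means that both hosts are in the secondary (standby) role. ds is the disk state; it shows Inconsistent/Inconsistent because DRBD cannot yet tell which side is the primary and whose disk data should be treated as authoritative. In each pair, the value before the slash describes the local node and the value after it describes the peer.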
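For scripted checks, the cs, ro and ds columns can be pulled out of a status line like the one above. A minimal sketch, assuming the column layout shown in this article (minor:resource, cs, ro, ds, protocol); drbd_field is a hypothetical helper name.

```shell
#!/bin/bash
# Extract one field from a "service drbd status" resource line such as
#   0:r0 Connected Secondary/Secondary Inconsistent/Inconsistent C
# $1: field name (cs|ro|ds), $2: the status line
drbd_field() {
    local field="$1" line="$2"
    # Re-split the line on whitespace into positional parameters:
    # $1=minor:resource $2=cs $3=ro $4=ds $5=protocol
    set -- $line
    case "$field" in
        cs) echo "$2" ;;
        ro) echo "$3" ;;
        ds) echo "$4" ;;
        *)  return 1 ;;
    esac
}

# Usage:
#   line=$(service drbd status | grep '^ *0:r0')
#   drbd_field ds "$line"
```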
8. Make drbd1.corp.com the primary node: (node1)
# drbdsetup /dev/drbd0 primary --force
Check the DRBD status on the primary and on the secondary node:
(node1)
# service drbd status
drbd driver loaded OK; device status:
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd1.corp.com, 2015-05-12 21:05:41
m:res cs ro ds p mounted fstype
0:r0 Connected Primary/Secondary UpToDate/UpToDate C
(node2)
# service drbd status
drbd driver loaded OK; device status:
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd2.corp.com, 2015-05-12 21:05:46
m:res cs ro ds p mounted fstype
0:r0 Connected Secondary/Primary UpToDate/UpToDate C
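On the primary and the secondary server, ro shows Primary/Secondary and Secondary/Primary respectively.
Once the initial synchronization has finished, ds shows UpToDate/UpToDate,
which means the primary/secondary setup succeeded.
9. Mount DRBD: (node1)
The status output above shows that the mounted and fstype columns are still empty, so this step creates a filesystem on the DRBD device and mounts it on the system directory /store.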
# mkfs.ext4 /dev/drbd0
# mount /dev/drbd0 /store
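Note: no operation on the DRBD device, including mounting, is allowed on the Secondary node; all reads and writes must happen on the Primary node. Only when the Primary node fails can the Secondary node be promoted to Primary and mount the DRBD device to carry on. Heartbeat automates that promotion and mount, as configured in the sections that follow.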
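A mount script can enforce this rule by checking the local role first. The sketch below relies on "drbdadm role", which the drbddisk script later in this article also uses and which prints the local and peer roles separated by a slash; the helper name is_local_primary is hypothetical.

```shell
#!/bin/bash
# Succeed only if the local role in a value such as "Primary/Secondary"
# is Primary. The part before the slash is the local node.
is_local_primary() {
    local ro="$1"
    [ "${ro%%/*}" = "Primary" ]
}

# Usage:
#   if is_local_primary "$(drbdadm role r0)"; then
#       mount /dev/drbd0 /store
#   else
#       echo "r0 is not Primary on this node; refusing to mount" >&2
#   fi
```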
DRBD status after a successful mount: (node1)
# service drbd status
drbd driver loaded OK; device status:
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd1.corp.com, 2015-05-12 21:05:41
m:res cs ro ds p mounted fstype
0:r0 Connected Primary/Secondary UpToDate/UpToDate C /store ext4
V. HeartBeat + NFS Configuration
1. Install heartbeat: (node1,node2)
# yum install epel-release -y
# yum --enablerepo=epel install heartbeat -y
2. Set up the heartbeat configuration file
(node1)
Edit ha.cf and add the following configuration:
# vi /etc/ha.d/ha.cf
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 5
ucast eth0 192.168.0.192 # local interface to use and the IP of the peer node
auto_failback off
node drbd1.corp.com drbd2.corp.com
(node2)
Edit ha.cf and add the following configuration:
# vi /etc/ha.d/ha.cf
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 5
ucast eth0 192.168.0.191
auto_failback off
node drbd1.corp.com drbd2.corp.com
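The two ha.cf files differ only in the ucast line, so both can come from one template. A minimal sketch that reproduces the settings above; gen_ha_cf is a hypothetical helper name.

```shell
#!/bin/bash
# Print an ha.cf with the settings used in this article.
# $1: local network interface, $2: IP address of the peer node
gen_ha_cf() {
    local iface="$1" peer_ip="$2"
    cat <<EOF
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 5
ucast ${iface} ${peer_ip}
auto_failback off
node drbd1.corp.com drbd2.corp.com
EOF
}

# Usage:
#   gen_ha_cf eth0 192.168.0.192 > /etc/ha.d/ha.cf   # on node1
#   gen_ha_cf eth0 192.168.0.191 > /etc/ha.d/ha.cf   # on node2
```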
3. Edit authkeys, the file the two nodes use to authenticate each other, and add the following: (node1,node2)
# vi /etc/ha.d/authkeys
auth 1
1 crc
Set the permissions of the authentication file to 600:
# chmod 600 /etc/ha.d/authkeys
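The crc method only detects corrupted packets and does not authenticate the peer. Heartbeat also accepts md5 and sha1 with a shared key in authkeys. The following sketch writes such a file; write_authkeys is a made-up helper, and the key is generated from /dev/urandom as one possible choice.

```shell
#!/bin/bash
# Write an authkeys file that uses sha1 with a random shared key.
# $1: output path
write_authkeys() {
    local file="$1" key
    # 16 random bytes rendered as 32 hexadecimal characters
    key=$(head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n')
    # Create the file with a restrictive umask, then set mode 600 explicitly
    ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$key" > "$file" )
    chmod 600 "$file"
}

# Usage (run once, then copy the file unchanged to the other node):
#   write_authkeys /etc/ha.d/authkeys
```

Both nodes must hold the same key, so the file is generated on one node and copied, not generated twice.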
4. Edit the cluster resource file: (node1,node2)
# vi /etc/ha.d/haresources
drbd1.corp.com IPaddr::192.168.0.190/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0::/store::ext4 killnfsd
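Note: the scripts named in this file, such as IPaddr and Filesystem, live in /etc/ha.d/resource.d/. You can also place service start scripts there (for example mysql or www) and add the script name to /etc/ha.d/haresources, so that the script is started along with heartbeat.
IPaddr::192.168.0.190/24/eth0: uses the IPaddr script to configure the floating virtual IP that clients connect to
drbddisk::r0: uses the drbddisk script to promote the r0 resource to Primary on takeover and demote it to Secondary on release
Filesystem::/dev/drbd0::/store::ext4: uses the Filesystem script to mount and unmount the disk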
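In a haresources entry, "::" separates a script name from its arguments, and heartbeat calls the script with those arguments followed by start, stop or status. The sketch below only illustrates that mapping; resource_cmdline is a hypothetical helper, not part of heartbeat.

```shell
#!/bin/bash
# Turn one haresources token into the command line heartbeat runs for it, e.g.
#   Filesystem::/dev/drbd0::/store::ext4 -> Filesystem /dev/drbd0 /store ext4 start
# $1: resource token, $2: action (start|stop|status)
resource_cmdline() {
    local token="$1" action="$2"
    # Replace every "::" separator with a space and append the action
    printf '%s %s\n' "${token//::/ }" "$action"
}

# Usage:
#   for t in IPaddr::192.168.0.190/24/eth0 drbddisk::r0 \
#            Filesystem::/dev/drbd0::/store::ext4 killnfsd; do
#       resource_cmdline "$t" start
#   done
```

This is why drbddisk below treats its first argument as the resource name and its second as the command when it receives two arguments.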
5. Create the script file killnfsd, which restarts the NFS service: (node1,node2)
# vi /etc/ha.d/resource.d/killnfsd
#!/bin/bash
killall -9 nfsd; /etc/init.d/nfs restart; exit 0
Grant 755 (executable) permissions:
# chmod 755 /etc/ha.d/resource.d/killnfsd
VI. Create the DRBD script file drbddisk: (node1,node2)
Edit drbddisk and add the script content below:
# vi /etc/ha.d/resource.d/drbddisk
#!/bin/bash
#
# This script is intended to be used as resource script by heartbeat
#
# Copyright 2003-2008 LINBIT Information Technologies
# Philipp Reisner, Lars Ellenberg
#
###
DEFAULTFILE="/etc/default/drbd"
# Adjust this path if drbdadm is installed elsewhere on your system
DRBDADM="/sbin/drbdadm"
if [ -f $DEFAULTFILE ]; then
    . $DEFAULTFILE
fi
if [ "$#" -eq 2 ]; then
    RES="$1"
    CMD="$2"
else
    RES="all"
    CMD="$1"
fi
## EXIT CODES
# since this is a "legacy heartbeat R1 resource agent" script,
# exit codes actually do not matter that much as long as we conform to
# http://wiki.linux-ha.org/HeartbeatResourceAgent
# but it does not hurt to conform to lsb init-script exit codes,
# where we can.
# http://refspecs.linux-foundation.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html
####
drbd_set_role_from_proc_drbd()
{
    local out
    if ! test -e /proc/drbd; then
        ROLE="Unconfigured"
        return
    fi
    dev=$( $DRBDADM sh-dev $RES )
    minor=${dev#/dev/drbd}
    if [[ $minor = *[!0-9]* ]] ; then
        # sh-minor is only supported since drbd 8.3.1
        minor=$( $DRBDADM sh-minor $RES )
    fi
    if [[ -z $minor ]] || [[ $minor = *[!0-9]* ]] ; then
        ROLE=Unknown
        return
    fi
    if out=$(sed -ne "/^ *$minor: cs:/ { s/:/ /g; p; q; }" /proc/drbd); then
        set -- $out
        ROLE=${5%/**}
        : ${ROLE:=Unconfigured} # if it does not show up
    else
        ROLE=Unknown
    fi
}
case "$CMD" in
    start)
        # try several times, in case heartbeat deadtime
        # was smaller than drbd ping time
        try=6
        while true; do
            $DRBDADM primary $RES && break
            let "--try" || exit 1 # LSB generic error
            sleep 1
        done
        ;;
    stop)
        # heartbeat (haresources mode) will retry failed stop
        # for a number of times in addition to this internal retry.
        try=3
        while true; do
            $DRBDADM secondary $RES && break
            # We used to lie here, and pretend success for anything != 11,
            # to avoid the reboot on failed stop recovery for "simple
            # config errors" and such. But that is incorrect.
            # Don't lie to your cluster manager.
            # And don't do config errors...
            let --try || exit 1 # LSB generic error
            sleep 1
        done
        ;;
    status)
        if [ "$RES" = "all" ]; then
            echo "A resource name is required for status inquiries."
            exit 10
        fi
        ST=$( $DRBDADM role $RES )
        ROLE=${ST%/**}
        case $ROLE in
            Primary|Secondary|Unconfigured)
                # expected
                ;;
            *)
                # unexpected. whatever...
                # If we are unsure about the state of a resource, we need to
                # report it as possibly running, so heartbeat can, after failed
                # stop, do a recovery by reboot.
                # drbdsetup may fail for obscure reasons, e.g. if /var/lock/ is
                # suddenly readonly. So we retry by parsing /proc/drbd.
                drbd_set_role_from_proc_drbd
        esac
        case $ROLE in
            Primary)
                echo "running (Primary)"
                exit 0 # LSB status "service is OK"
                ;;
            Secondary|Unconfigured)
                echo "stopped ($ROLE)"
                exit 3 # LSB status "service is not running"
                ;;
            *)
                # NOTE the "running" in below message.
                # this is a "heartbeat" resource script,
                # the exit code is _ignored_.
                echo "cannot determine status, may be running ($ROLE)"
                exit 4 # LSB status "service status is unknown"
                ;;
        esac
        ;;
    *)
        echo "Usage: drbddisk [resource] {start|stop|status}"
        exit 1
        ;;
esac
exit 0
Grant 755 (executable) permissions:
# chmod 755 /etc/ha.d/resource.d/drbddisk
VII. Start the HeartBeat service
Start the HeartBeat service on both nodes, node1 first: (node1,node2)
# service heartbeat start
# chkconfig heartbeat on
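If the virtual IP 192.168.0.190 can now be pinged from another machine, the configuration succeeded.
VIII. Configure NFS: (node1,node2)
Edit the exports configuration file and add the following: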
# vi /etc/exports
/store *(rw,no_root_squash)
Restart the NFS service:
# service rpcbind restart
# service nfs restart
# chkconfig rpcbind on
# chkconfig nfs off
Note: NFS is set not to start at boot here, because the script /etc/ha.d/resource.d/killnfsd controls when NFS is started.
IX. Testing high availability
1. Normal (planned) failover
Mount the NFS share on a client:
# mount -t nfs 192.168.0.190:/store /tmp
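Simulate a failure by stopping the heartbeat service on the primary node node1; the standby node node2 then takes over the resources right away. Verify that reads and writes on the NFS share mounted on the client still work.
DRBD status on the standby node node2 at this point: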
# service drbd status
drbd driver loaded OK; device status:
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd2.corp.com, 2015-05-12 21:05:41
m:res cs ro ds p mounted fstype
0:r0 Connected Primary/Secondary UpToDate/UpToDate C /store ext4
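2. Failover after an abnormal crash
Force a shutdown by cutting the power to node1.
node2 takes over here as well, once heartbeat has declared node1 dead (deadtime is 5 seconds in the ha.cf above). Verify that reads and writes on the NFS share mounted on the client still work.
DRBD status on node2 at this point: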
# service drbd status
drbd driver loaded OK; device status:
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by root@drbd2.corp.com, 2015-05-12 21:05:41
m:res cs ro ds p mounted fstype
0:r0 WFConnection Primary/Unknown UpToDate/DUnknown C /store ext4
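To see how long the share is unavailable during either test, a client can write a timestamp to the mount at a fixed interval and look for the largest gap afterwards. With a default (hard) NFS mount, writes stall during the switchover and resume afterwards instead of failing, so the gap approximates the outage. This is a sketch; probe_writes, max_gap and the file name ha-probe.log are all made up for the example.

```shell
#!/bin/bash
# Append "<sequence> <epoch seconds>" to a probe file once per interval.
# $1: directory on the NFS mount, $2: number of writes, $3: interval in seconds
probe_writes() {
    local dir="$1" count="$2" interval="${3:-1}" i
    for ((i = 1; i <= count; i++)); do
        echo "$i $(date +%s)" >> "$dir/ha-probe.log" || echo "write $i failed" >&2
        sleep "$interval"
    done
}

# Print the largest gap, in seconds, between consecutive probe lines.
# $1: probe file
max_gap() {
    awk 'NR > 1 && $2 - prev > max { max = $2 - prev }
         { prev = $2 }
         END { print max + 0 }' "$1"
}

# Usage on the client, with the share mounted on /tmp as above:
#   probe_writes /tmp 120 1      # trigger the failover while this runs
#   max_gap /tmp/ha-probe.log
```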
X. Summary of common DRBD configuration errors
Problem 1: 'ha' ignored, since this host (node2.centos.bz) is not mentioned with an 'on' keyword.
Error message:
Running drbdadm create-md ha produces the following error:
'ha' ignored, since this host (node2.centos.bz) is not mentioned with an 'on' keyword.
Solution:
The on sections in drbd.conf were written with node1 and node2, but the names there must match each host's actual hostname (the output of uname -n). Change node1 and node2 to node1.centos.bz and node2.centos.bz respectively.
Problem 2: drbdadm create-md ha: exited with code 20
Error message:
Running drbdadm create-md ha produces the following error:
open(/dev/hdb1) failed: No such file or directory
Command 'drbdmeta 0 v08 /dev/hdb1 internal create-md' terminated with exit code 20
drbdadm create-md ha: exited with code 20
Solution:
The partition was never created with fdisk /dev/hdb. After creating a partition on /dev/hdb as shown below, the command runs normally.
# fdisk /dev/hdb //prepare to create a partition on hdb
The number of cylinders for this disk is set to 20805.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
(e.g., DOS FDISK, OS/2 FDISK)
Command (m for help): n //type n to create a new partition
Command action
e extended
p primary partition (1-4)
p //type p to create a primary partition
Partition number (1-4): 1 //type 1 as the number of this primary partition
First cylinder (1-20805, default 1): //first cylinder; just press Enter
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-20805, default 20805): //last cylinder; just press Enter
Using default value 20805
Command (m for help): w //type w to write the changes just made
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
[root@node1 yum.repos.d]# partprobe //make the partition table change take effect
Problem 3: drbdadm create-md ha: exited with code 40
Error message:
Running drbdadm create-md ha produces the following error:
Device size would be truncated, which
would corrupt data and result in
'access beyond end of device' errors.
You need to either
* use external meta data (recommended)
* shrink that filesystem first
* zero out the device (destroy the filesystem)
Operation refused.
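Command 'drbdmeta 0 v08 /dev/hdb1 internal create-md' terminated with exit code 40
drbdadm create-md ha: exited with code 40
Solution:
Use dd to overwrite the beginning of /dev/hdb1 with zeros, then run drbdadm create-md ha again. Note that this destroys the filesystem, and any data, on that partition.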
# dd if=/dev/zero of=/dev/hdb1 bs=1M count=100
Problem 4: DRBD status stays at Secondary/Unknown
Error message:
After DRBD is started on Node1 and Node2, the status stays at Secondary/Unknown:
# service drbd status
drbd driver loaded OK; device status:
version: 8.3.8 (api:88/proto:86-94)
GIT-hash: d78846e52224fd00562f7c225bcc25b2d422321d build by mockbuild@builder10.centos.org, 2010-06-04 08:04:16
m:res cs ro ds p mounted fstype
0:ha WFConnection Secondary/Unknown Inconsistent/DUnknown C
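Solution:
1. The port DRBD uses is not open on Node1 and Node2. Open the port given in the address lines of drbd.conf, or stop the iptables service first.
2. A split brain may have occurred, which typically happens during an HA switchover. To resolve it:
On one node (the one whose changes are to be discarded), run:
drbdadm secondary resource
drbdadm connect --discard-my-data resource
On the other node, run:
drbdadm connect resource
Problem 5: 1: Failure: (104) Can not open backing device
Error message:
Running drbdadm up r0 produces:
1: Failure: (104) Can not open backing device.
Command 'drbdsetup attach 1 /dev/sdb1 /dev/sdb1 internal' terminated with exit code 10
Solution:
/dev/sdb1 is probably mounted. Run umount /dev/sdb1 to fix it.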
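Whether the backing device is mounted can be checked before bringing the resource up. A minimal sketch that reads the mount table from /proc/mounts; is_mounted is a hypothetical helper name.

```shell
#!/bin/bash
# Succeed if a block device appears as a mount source in a mounts table.
# $1: device path, $2: mounts file (defaults to /proc/mounts)
is_mounted() {
    local dev="$1" mounts="${2:-/proc/mounts}"
    # The first whitespace-separated column of each line is the mounted device
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"
}

# Usage:
#   if is_mounted /dev/sdb1; then umount /dev/sdb1; fi
#   drbdadm up r0
```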