Environment
CentOS 6.6 x86_64
Ceph 0.87
ceph-deploy 1.5.19
Procedure
Adding mon nodes
The number of ceph monitors should be 2n+1 (n >= 0), and a production cluster should run at least 3. As long as at least n+1 monitors are healthy, ceph's Paxos algorithm can keep the cluster working normally. So with 3 mon nodes, only 1 may be down at any one time.
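For example, a 5-monitor cluster (n = 2) keeps quorum with 3 mons up and therefore tolerates 2 simultaneous failures, while the 3-monitor cluster built below tolerates only 1.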
The ceph cluster currently contains only 1 mon node; we will expand it to 3 mon nodes.
Check the current ceph cluster status:
[root@ceph-osd-1 ceph-cluster]# ceph -s
cluster 9d717e10-a708-482d-b91c-4bd21f4ae36c
health HEALTH_OK
monmap e5: 1 mons at {ceph-osd-1=10.10.200.163:6789/0}, election epoch 69, quorum 0 ceph-osd-1
osdmap e220: 7 osds: 7 up, 7 in
pgmap v473: 256 pgs, 1 pools, 0 bytes data, 0 objects
36109 MB used, 11870 GB / 11905 GB avail
256 active+clean
Two mon nodes, ceph-osd-2 and ceph-osd-3, will now be added to the ceph cluster.
First modify the configuration file as follows, adding public_network:
[global]
auth_service_required = cephx
filestore_xattr_use_omap = true
auth_client_required = cephx
auth_cluster_required = cephx
mon_host = 10.10.200.163, 10.10.200.164
mon_initial_members = ceph-osd-1, ceph-osd-2
fsid = 9d717e10-a708-482d-b91c-4bd21f4ae36c
public_network = 10.10.200.0/24
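Since ceph.conf has changed, one way to distribute the updated file to every node before creating the new mons is ceph-deploy config push (a sketch, assuming the same ceph-cluster working directory and hostnames as above):
[root@ceph-osd-1 ceph-cluster]# ceph-deploy --overwrite-conf config push ceph-osd-1 ceph-osd-2 ceph-osd-3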
Add the mon nodes:
[root@ceph-osd-1 ceph-cluster]# ceph-deploy mon create ceph-osd-2 ceph-osd-3
After the mon nodes have been added, check the mon quorum status:
[root@ceph-osd-1 ceph-cluster]# ceph quorum_status --format json-pretty
{ "election_epoch": 72,
"quorum": [
0,
1,
2],
"quorum_names": [
"ceph-osd-1",
"ceph-osd-2",
"ceph-osd-3"],
"quorum_leader_name": "ceph-osd-1",
"monmap": { "epoch": 7,
"fsid": "9d717e10-a708-482d-b91c-4bd21f4ae36c",
"modified": "2014-11-14 09:10:28.111133",
"created": "0.000000",
"mons": [
{ "rank": 0,
"name": "ceph-osd-1",
"addr": "10.10.200.163:6789\/0"},
{ "rank": 1,
"name": "ceph-osd-2",
"addr": "10.10.200.164:6789\/0"},
{ "rank": 2,
"name": "ceph-osd-3",
"addr": "10.10.200.165:6789\/0"}]}}
Check the ceph cluster status at this point:
[root@ceph-osd-1 ceph-cluster]# ceph -s
cluster 9d717e10-a708-482d-b91c-4bd21f4ae36c
health HEALTH_WARN clock skew detected on mon.ceph-osd-3
monmap e7: 3 mons at {ceph-osd-1=10.10.200.163:6789/0,ceph-osd-2=10.10.200.164:6789/0,ceph-osd-3=10.10.200.165:6789/0}, election epoch 72, quorum 0,1,2 ceph-osd-1,ceph-osd-2,ceph-osd-3
osdmap e220: 7 osds: 7 up, 7 in
pgmap v475: 256 pgs, 1 pools, 0 bytes data, 0 objects
36109 MB used, 11870 GB / 11905 GB avail
256 active+clean
Notice that the clock on mon.ceph-osd-3 is not synchronized with mon.ceph-osd-1; synchronize the clocks on all mon nodes.
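A quick way to do that on CentOS 6 (a sketch, assuming ntpdate and ntpd are installed and that the host can reach the public NTP pool; substitute your own NTP server if not):
[root@ceph-osd-3 ~]# ntpdate pool.ntp.org
[root@ceph-osd-3 ~]# service ntpd start && chkconfig ntpd on
Once the clocks agree, the clock skew warning clears (it may take a little while, or a mon restart) and health returns to HEALTH_OK.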
All mon nodes are now in place. Next, simulate a failure of the mon on ceph-osd-1 and see whether the ceph cluster keeps working.
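One straightforward way to simulate the failure (a sketch, assuming the sysvinit ceph service script used on CentOS 6) is to stop all ceph daemons on that host, or simply power the node off; here the whole node went down, which is why the osd on ceph-osd-1 is also reported down below:
[root@ceph-osd-1 ~]# service ceph stop
Then check the cluster state from ceph-osd-2: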
[root@ceph-osd-2 ~]# ceph -s
2014-11-14 09:27:28.582467 7f9cd8712700 0 -- :/1014338 >> 10.10.200.163:6789/0 pipe(0x7f9cd4024230 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f9cd40244c0).fault
cluster 9d717e10-a708-482d-b91c-4bd21f4ae36c
health HEALTH_WARN 256 pgs degraded; 256 pgs stuck unclean; 256 pgs undersized; 1/7 in osds are down; 1 mons down, quorum 1,2 ceph-osd-2,ceph-osd-3
monmap e7: 3 mons at {ceph-osd-1=10.10.200.163:6789/0,ceph-osd-2=10.10.200.164:6789/0,ceph-osd-3=10.10.200.165:6789/0}, election epoch 88, quorum 1,2 ceph-osd-2,ceph-osd-3
osdmap e263: 7 osds: 6 up, 7 in
pgmap v542: 256 pgs, 1 pools, 0 bytes data, 0 objects
36112 MB used, 11870 GB / 11905 GB avail
256 active+undersized+degraded
Because the ceph-osd-1 node hosts 1 mon and 1 osd, one osd in the osd cluster is also in the down state.
As mentioned at the beginning of this article, a 3-node mon cluster only tolerates 1 mon being down. What happens when 2 mons are down? Take down the ceph-osd-2 node as well.
Check the ceph cluster status again with ceph -s:
[root@ceph-osd-3 ~]# ceph -s
2014-11-14 09:30:23.483264 7f677c28b700 0 -- :/1014680 >> 10.10.200.163:6789/0 pipe(0x7f6778023290 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f6778023520).fault
2014-11-14 09:30:26.483313 7f677c18a700 0 -- :/1014680 >> 10.10.200.164:6789/0 pipe(0x7f676c000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f676c000e90).fault
2014-11-14 09:30:29.483664 7f677c28b700 0 -- :/1014680 >> 10.10.200.163:6789/0 pipe(0x7f676c0030e0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f676c003370).fault
2014-11-14 09:30:32.483904 7f677c18a700 0 -- :/1014680 >> 10.10.200.164:6789/0 pipe(0x7f676c003a00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f676c003c90).fault
2014-11-14 09:30:35.484221 7f677c28b700 0 -- :/1014680 >> 10.10.200.163:6789/0 pipe(0x7f676c0031b0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f676c002570).fault
2014-11-14 09:30:38.484476 7f677c18a700 0 -- :/1014680 >> 10.10.200.164:6789/0 pipe(0x7f676c002a60 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f676c002cf0).fault
From the output above, the ceph cluster can no longer work: with only 1 of the 3 mons left, quorum (at least 2) cannot be formed, and client requests to the monitors simply keep failing. So in a 3-node mon cluster, at most 1 mon node may be down.
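With quorum lost, cluster commands such as ceph -s just keep retrying the monitors, but the one surviving monitor can still be inspected locally through its admin socket; a sketch, assuming the default socket path on ceph-osd-3:
[root@ceph-osd-3 ~]# ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-osd-3.asok mon_status
The reported state will be probing or electing rather than leader or peon, confirming that no quorum can be formed until at least one more mon comes back up.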
Removing mon nodes
The current environment has 3 mon nodes; remove 2 of them.
[root@ceph-osd-1 ceph-cluster]# ceph-deploy mon destroy ceph-osd-2 ceph-osd-3
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.19): /usr/bin/ceph-deploy mon destroy ceph-osd-2 ceph-osd-3
[ceph_deploy.mon][DEBUG ] Removing mon from ceph-osd-2
[ceph-osd-2][DEBUG ] connected to host: ceph-osd-2
[ceph-osd-2][DEBUG ] detect platform information from remote host
[ceph-osd-2][DEBUG ] detect machine type
[ceph-osd-2][DEBUG ] get remote short hostname
[ceph-osd-2][INFO ] Running command: ceph --cluster=ceph -n mon. -k /var/lib/ceph/mon/ceph-ceph-osd-2/keyring mon remove ceph-osd-2
[ceph-osd-2][WARNIN] removed mon.ceph-osd-2 at 10.10.200.164:6789/0, there are now 1 monitors
[ceph-osd-2][INFO ] polling the daemon to verify it stopped
[ceph-osd-2][INFO ] Running command: service ceph status mon.ceph-osd-2
[ceph-osd-2][INFO ] Running command: mkdir -p /var/lib/ceph/mon-removed
[ceph-osd-2][DEBUG ] move old monitor data
[ceph_deploy.mon][DEBUG ] Removing mon from ceph-osd-3
[ceph-osd-3][DEBUG ] connected to host: ceph-osd-3
[ceph-osd-3][DEBUG ] detect platform information from remote host
[ceph-osd-3][DEBUG ] detect machine type
[ceph-osd-3][DEBUG ] get remote short hostname
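Since ceph-osd-2 and ceph-osd-3 no longer run monitors, it is also worth dropping them from ceph.conf and pushing the file back out so the configuration matches the remaining monitor; a sketch based on the file edited earlier:
mon_host = 10.10.200.163
mon_initial_members = ceph-osd-1
[root@ceph-osd-1 ceph-cluster]# ceph-deploy --overwrite-conf config push ceph-osd-1 ceph-osd-2 ceph-osd-3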
After removing the 2 mon nodes, check the ceph cluster status:
[root@ceph-osd-1 ceph-cluster]# ceph -s
cluster 9d717e10-a708-482d-b91c-4bd21f4ae36c
health HEALTH_OK
monmap e5: 1 mons at {ceph-osd-1=10.10.200.163:6789/0}, election epoch 69, quorum 0 ceph-osd-1
osdmap e220: 7 osds: 7 up, 7 in
pgmap v473: 256 pgs, 1 pools, 0 bytes data, 0 objects
36109 MB used, 11870 GB / 11905 GB avail
256 active+clean