Things to watch out for when adding mon nodes to a Ceph cluster

Date: 2021-05-09 12:45:45

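At first I followed the official documentation step by step.

But I still ran into a problem.

When I tried to add a mon node, the command failed no matter what I did.

(I was using three nodes in total: ceph-admin, ceph-node1, ceph-node2.)

For example: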

ceph-deploy mon add ceph-node2

The error output was:

[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.38): /usr/bin/ceph-deploy mon add ceph-node2
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : add
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0xa89fc8>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] mon : ['ceph-node2']
[ceph_deploy.cli][INFO ] func : <function mon at 0xa826e0>
[ceph_deploy.cli][INFO ] address : None
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.mon][INFO ] ensuring configuration of new mon host: ceph-node2
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node2
[ceph-node2][DEBUG ] connected to host: ceph-node2
[ceph-node2][DEBUG ] detect platform information from remote host
[ceph-node2][DEBUG ] detect machine type
[ceph-node2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.mon][DEBUG ] Adding mon to cluster ceph, host ceph-node2
[ceph_deploy.mon][DEBUG ] using mon address by resolving host: 192.168.1.113
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-node2 ...
[ceph-node2][DEBUG ] connected to host: ceph-node2
[ceph-node2][DEBUG ] detect platform information from remote host
[ceph-node2][DEBUG ] detect machine type
[ceph-node2][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO ] distro info: CentOS Linux 7.3.1611 Core
[ceph-node2][DEBUG ] determining if provided host has same hostname in remote
[ceph-node2][DEBUG ] get remote short hostname
[ceph-node2][DEBUG ] adding mon to ceph-node2
[ceph-node2][DEBUG ] get remote short hostname
[ceph-node2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-node2][DEBUG ] create the mon path if it does not exist
[ceph-node2][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-node2/done
[ceph-node2][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph-node2][DEBUG ] create the init path if it does not exist
[ceph-node2][INFO ] Running command: systemctl enable ceph.target
[ceph-node2][INFO ] Running command: systemctl enable ceph-mon@ceph-node2
[ceph-node2][INFO ] Running command: systemctl start ceph-mon@ceph-node2
[ceph-node2][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-node2.asok mon_status
[ceph-node2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph-node2][WARNIN] ceph-node2 is not defined in `mon initial members`
[ceph-node2][WARNIN] monitor ceph-node2 does not exist in monmap
[ceph-node2][WARNIN] neither `public_addr` nor `public_network` keys are defined for monitors
[ceph-node2][WARNIN] monitors may not be able to form quorum
[ceph-node2][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-node2.asok mon_status
[ceph-node2][ERROR ] admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
[ceph-node2][WARNIN] monitor: mon.ceph-node2, might not be running yet

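Later, after looking through several suggested fixes, I found that public_network has to be defined in ceph.conf (there are a few other fixes as well; I picked this one because it seemed the most reliable).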
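As a rough illustration of that fix, the sketch below appends a public_network line to a ceph.conf only when the key is not already set. The helper name ensure_public_network is hypothetical, and the subnet 192.168.1.0/24 in the usage line is an assumption inferred from the 192.168.1.x monitor addresses in the logs; use the actual network of your cluster. It also assumes the file contains only a [global] section, so that an appended line ends up there.

```shell
#!/bin/sh
# Hypothetical helper (not part of ceph-deploy): add public_network to a
# ceph.conf if it is not defined yet.
# Usage: ensure_public_network <conf-file> <cidr>
ensure_public_network() {
    conf="$1"
    cidr="$2"
    # Ceph accepts both "public_network" and "public network" as the key name.
    if grep -Eq '^[[:space:]]*public[_ ]network[[:space:]]*=' "$conf"; then
        echo "public_network already set in $conf"
    else
        # Assumes [global] is the last (or only) section in the file.
        printf 'public_network = %s\n' "$cidr" >> "$conf"
        echo "added public_network = $cidr to $conf"
    fi
}

# Example (run in the ceph-deploy working directory; the subnet is an assumption):
# ensure_public_network ceph.conf 192.168.1.0/24
```

The edit is made to the ceph.conf in the ceph-deploy working directory on the admin node; it still has to be pushed to the other nodes afterwards.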

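After updating ceph.conf, running the command again failed with a different error (the log below is from a run against ceph-admin):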

[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.38): /usr/bin/ceph-deploy mon add ceph-admin
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : add
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x2296fc8>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] mon : ['ceph-admin']
[ceph_deploy.cli][INFO ] func : <function mon at 0x228f6e0>
[ceph_deploy.cli][INFO ] address : None
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.mon][INFO ] ensuring configuration of new mon host: ceph-admin
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-admin
[ceph-admin][DEBUG ] connected to host: ceph-admin
[ceph-admin][DEBUG ] detect platform information from remote host
[ceph-admin][DEBUG ] detect machine type
[ceph-admin][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.admin][ERROR ] RuntimeError: config file /etc/ceph/ceph.conf exists with different content; use --overwrite-conf to overwrite
[ceph_deploy][ERROR ] GenericError: Failed to configure 1 admin hosts

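So I went looking online again. The cause appeared to be that the conf files on the nodes were out of sync with the one on the admin node. After several rounds of searching I found that everyone used --overwrite-conf differently; the command I worked out myself with the help of -h is: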

ceph-deploy --overwrite-conf config push ceph-admin ceph-node1 ceph-node2

To be safe, I also pushed the admin keys and conf once more:

ceph-deploy admin ceph-node1 ceph-node2
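One way to confirm that the push really removed the mismatch is to compare checksums of the conf files. The following is only a sketch: conf_in_sync is a hypothetical helper, and it compares local copies, so the remote files would first have to be fetched (for example with scp, as in the commented usage lines, where the /tmp paths are placeholders).

```shell
#!/bin/sh
# Hypothetical helper: succeed only if every listed file has the same
# checksum as the first one (the reference copy).
# Usage: conf_in_sync <reference-conf> <other-conf>...
conf_in_sync() {
    ref="$1"
    shift
    ref_sum=$(cksum < "$ref")
    for f in "$@"; do
        if [ "$(cksum < "$f")" != "$ref_sum" ]; then
            echo "$f differs from $ref"
            return 1
        fi
    done
    return 0
}

# Example (placeholder paths):
# scp ceph-node1:/etc/ceph/ceph.conf /tmp/ceph-node1.conf
# scp ceph-node2:/etc/ceph/ceph.conf /tmp/ceph-node2.conf
# conf_in_sync ceph.conf /tmp/ceph-node1.conf /tmp/ceph-node2.conf && echo "conf in sync"
```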

Then I ran ceph-deploy mon add ceph-node2 again, and this time it succeeded with the following output:

[root@ceph-admin my-ceph-cluster]# ceph-deploy mon add ceph-node2
[ceph_deploy.conf][DEBUG ] found configuration file at: /root/.cephdeploy.conf
[ceph_deploy.cli][INFO ] Invoked (1.5.38): /usr/bin/ceph-deploy mon add ceph-node2
[ceph_deploy.cli][INFO ] ceph-deploy options:
[ceph_deploy.cli][INFO ] username : None
[ceph_deploy.cli][INFO ] verbose : False
[ceph_deploy.cli][INFO ] overwrite_conf : False
[ceph_deploy.cli][INFO ] subcommand : add
[ceph_deploy.cli][INFO ] quiet : False
[ceph_deploy.cli][INFO ] cd_conf : <ceph_deploy.conf.cephdeploy.Conf instance at 0x10c8fc8>
[ceph_deploy.cli][INFO ] cluster : ceph
[ceph_deploy.cli][INFO ] mon : ['ceph-node2']
[ceph_deploy.cli][INFO ] func : <function mon at 0x10c16e0>
[ceph_deploy.cli][INFO ] address : None
[ceph_deploy.cli][INFO ] ceph_conf : None
[ceph_deploy.cli][INFO ] default_release : False
[ceph_deploy.mon][INFO ] ensuring configuration of new mon host: ceph-node2
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to ceph-node2
[ceph-node2][DEBUG ] connected to host: ceph-node2
[ceph-node2][DEBUG ] detect platform information from remote host
[ceph-node2][DEBUG ] detect machine type
[ceph-node2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.mon][DEBUG ] Adding mon to cluster ceph, host ceph-node2
[ceph_deploy.mon][DEBUG ] using mon address by resolving host: 192.168.1.113
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-node2 ...
[ceph-node2][DEBUG ] connected to host: ceph-node2
[ceph-node2][DEBUG ] detect platform information from remote host
[ceph-node2][DEBUG ] detect machine type
[ceph-node2][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO ] distro info: CentOS Linux 7.3.1611 Core
[ceph-node2][DEBUG ] determining if provided host has same hostname in remote
[ceph-node2][DEBUG ] get remote short hostname
[ceph-node2][DEBUG ] adding mon to ceph-node2
[ceph-node2][DEBUG ] get remote short hostname
[ceph-node2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-node2][DEBUG ] create the mon path if it does not exist
[ceph-node2][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-node2/done
[ceph-node2][DEBUG ] create a done file to avoid re-doing the mon deployment
[ceph-node2][DEBUG ] create the init path if it does not exist
[ceph-node2][INFO ] Running command: systemctl enable ceph.target
[ceph-node2][INFO ] Running command: systemctl enable ceph-mon@ceph-node2
[ceph-node2][INFO ] Running command: systemctl start ceph-mon@ceph-node2
[ceph-node2][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-node2.asok mon_status
[ceph-node2][WARNIN] ceph-node2 is not defined in `mon initial members`
[ceph-node2][WARNIN] monitor ceph-node2 does not exist in monmap
[ceph-node2][INFO ] Running command: ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-node2.asok mon_status
[ceph-node2][DEBUG ] ********************************************************************************
[ceph-node2][DEBUG ] status for monitor: mon.ceph-node2
[ceph-node2][DEBUG ] {
[ceph-node2][DEBUG ] "election_epoch": 0,
[ceph-node2][DEBUG ] "extra_probe_peers": [],
[ceph-node2][DEBUG ] "monmap": {
[ceph-node2][DEBUG ] "created": "2017-08-12 08:25:17.590053",
[ceph-node2][DEBUG ] "epoch": 2,
[ceph-node2][DEBUG ] "fsid": "798ed076-8094-429e-9e27-0ffccd60b56e",
[ceph-node2][DEBUG ] "modified": "2017-08-12 20:40:03.171628",
[ceph-node2][DEBUG ] "mons": [
[ceph-node2][DEBUG ] {
[ceph-node2][DEBUG ] "addr": "192.168.1.111:6789/0",
[ceph-node2][DEBUG ] "name": "ceph-admin",
[ceph-node2][DEBUG ] "rank": 0
[ceph-node2][DEBUG ] },
[ceph-node2][DEBUG ] {
[ceph-node2][DEBUG ] "addr": "192.168.1.112:6789/0",
[ceph-node2][DEBUG ] "name": "ceph-node1",
[ceph-node2][DEBUG ] "rank": 1
[ceph-node2][DEBUG ] }
[ceph-node2][DEBUG ] ]
[ceph-node2][DEBUG ] },
[ceph-node2][DEBUG ] "name": "ceph-node2",
[ceph-node2][DEBUG ] "outside_quorum": [],
[ceph-node2][DEBUG ] "quorum": [],
[ceph-node2][DEBUG ] "rank": -1,
[ceph-node2][DEBUG ] "state": "probing",
[ceph-node2][DEBUG ] "sync_provider": []
[ceph-node2][DEBUG ] }
[ceph-node2][DEBUG ] ********************************************************************************
[ceph-node2][INFO ] monitor: mon.ceph-node2 is currently at the state of probing

Finally, check the quorum status with the following command:

ceph quorum_status --format json-pretty
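Since the output above shows the new monitor still in the "probing" state, it is worth confirming that it actually joins the quorum. The sketch below checks whether a given monitor name appears in the quorum_names array of the quorum_status JSON. The helper mon_in_quorum is hypothetical, and the text-based matching is a simplification that avoids depending on a JSON tool; a proper JSON parser would be more robust.

```shell
#!/bin/sh
# Hypothetical helper: read quorum_status JSON on stdin and succeed if the
# given monitor name is listed in "quorum_names".
# Usage: ceph quorum_status --format json | mon_in_quorum <mon-name>
mon_in_quorum() {
    name="$1"
    # Strip whitespace so compact and pretty-printed JSON look the same,
    # cut out the quorum_names array, then look for the quoted name in it.
    tr -d ' \t\n' \
        | grep -o '"quorum_names":\[[^]]*\]' \
        | grep -q "\"$name\""
}

# Example:
# ceph quorum_status --format json-pretty | mon_in_quorum ceph-node2 \
#     && echo "ceph-node2 is in quorum"
```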
