Ceph Jewel 10.2.3 Environment Deployment

Date: 2021-03-09 16:51:00

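Ceph Test Environment Deployment

Document overview

  • Deployment plan for the test Ceph cluster
  • Deployment procedure for the test Ceph cluster and the workflow for using block devices
  • How to scale out mon nodes and osd nodes
  • Common problems and their solutions

Object storage is not needed for now, so no object storage gateway has been configured yet.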

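==Q&A: Why is Ceph used in the Docker environment?==

Each machine in the environment has a 1 TB data disk mounted. To make full use of the data disk space across the whole cluster, Ceph is used to build a distributed storage environment that pools the data disks together so they can be treated as a single disk. This relies mainly on Ceph's block storage feature.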

Cluster deployment plan

Host role plan

Hostname                OS               Kernel version  IP address     Roles / deployed services
docker-rancher-server   CentOS 7.1.1503  3.10.0-229      10.142.246.2   mon, osd
docker-rancher-client1  CentOS 7.1.1503  3.10.0-229      10.142.246.3   mon, osd
docker-rancher-client2  CentOS 7.1.1503  3.10.0-229      10.142.246.4   osd
hub.chinatelecom.cn     CentOS 7.1.1503  3.10.0-229      10.142.246.5   osd

Deployment architecture diagram

[Figure: Ceph Jewel 10.2.3 environment deployment architecture]

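Cluster base environment preparation

The base environment must be set up on every node. The steps below use docker-rancher-server as the example; the other three nodes are handled the same way.

0. Check the system version

All four machines are identical virtual machines. The version information of one of them is: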

[op@docker-rancher-server ~]$ cat /etc/redhat-release
CentOS Linux release 7.1.1503 (Core)
[op@docker-rancher-server ~]$ uname -r
3.10.0-229.el7.x86_64

1. Set up host name resolution

[op@docker-rancher-server ~]$ cat /etc/hosts
10.142.246.2  docker-rancher-server
10.142.246.3  docker-rancher-client1
10.142.246.4  docker-rancher-client2
10.142.246.5  hub.chinatelecom.cn    hub
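
Every node needs the same four entries, so a quick check can catch a missing one before ceph-deploy fails on name resolution. The following is a minimal sketch, assuming a POSIX-style shell with standard sed and awk; the function name missing_hosts is hypothetical and not part of any Ceph tooling.

```shell
# Print every given host name that has no entry in a hosts-format file.
# Usage: missing_hosts FILE NAME...
missing_hosts() {
    local file="$1"; shift
    local name
    for name in "$@"; do
        # Strip comments, then look for the name in any column after the IP address.
        if ! sed 's/#.*//' "$file" | awk -v n="$name" '
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit found ? 0 : 1 }'; then
            echo "$name"
        fi
    done
}
```

Running missing_hosts /etc/hosts docker-rancher-server docker-rancher-client1 docker-rancher-client2 hub.chinatelecom.cn on each node should print nothing when all entries are in place.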

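2. Firewall policy

Default ports used by Ceph

By default, Ceph Monitors communicate with each other on port ==6789==, and OSDs communicate on ports in the range ==6800:7300==. CentOS 7 uses firewalld as its default firewall, but we have already switched to iptables, so the ports are opened directly in iptables.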

# Command format
sudo iptables -A INPUT -i {iface} -p tcp -s {ip-address}/{netmask} --dport 6789 -j ACCEPT
# In practice
[op@docker-rancher-server ~]$ sudo iptables -A INPUT -i eth0 -p tcp -s 10.142.0.0/16 --dport 6789 -j ACCEPT
[op@docker-rancher-server ~]$ sudo iptables -A INPUT -i eth0 -p tcp -s 10.142.0.0/16 --dport 6800:7300 -j ACCEPT
# Verify
[op@docker-rancher-server ~]$ sudo iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:zabbix-agent
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:8514
ACCEPT     udp  --  anywhere             anywhere             udp dpt:8514
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:shell
ACCEPT     udp  --  anywhere             anywhere             udp dpt:ipsec-nat-t
ACCEPT     udp  --  anywhere             anywhere             udp dpt:isakmp
ACCEPT     tcp  --  10.142.0.0/16        anywhere             tcp dpt:smc-https
ACCEPT     tcp  --  10.142.0.0/16        anywhere             tcp dpts:6800:7300
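# Save the current rules
[op@docker-rancher-server ~]$ sudo service iptables save
iptables: Saving firewall rules to /etc/sysconfig/iptables:[  OK  ]

The output shows two rules opening the corresponding ports; iptables -L lists port 6789 under its service name, smc-https.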
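
The same two rules have to be added on all four nodes. A small generator keeps the interface and subnet in one place and lets you review the commands before running them. This is only a sketch: the function name ceph_iptables_rules is hypothetical, and the port list simply repeats the defaults quoted above.

```shell
# Print (without executing) the iptables commands that open the default Ceph ports.
# Usage: ceph_iptables_rules IFACE SUBNET
ceph_iptables_rules() {
    local iface="$1" subnet="$2"
    local ports
    # 6789 is the monitor port, 6800:7300 the OSD port range.
    for ports in 6789 6800:7300; do
        echo "sudo iptables -A INPUT -i $iface -p tcp -s $subnet --dport $ports -j ACCEPT"
    done
}
```

After reviewing the output of ceph_iptables_rules eth0 10.142.0.0/16, pipe it to sh to apply the rules, then save them with sudo service iptables save as shown above.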

3. NTP time synchronization

docker-rancher-server serves as the reference NTP server; the other three nodes synchronize their clocks to it.

# Install the NTP service; required on all nodes
[op@docker-rancher-server ~]$ sudo yum install ntp -y
[op@docker-rancher-server ~]$ sudo vim /etc/ntp.conf
# Allow other machines on the internal network to synchronize their time
restrict 10.142.0.0 mask 255.255.0.0 nomodify notrap

server 10.142.246.2

# When no external time server is reachable, fall back to the local clock as the time source
server  127.127.1.0     # local clock
fudge   127.127.1.0 stratum 10

# Also comment out all the other server entries

# Start the service
[op@docker-rancher-server ~]$ sudo systemctl restart  ntpd.service

# Wait a few minutes, then check the status
[op@docker-rancher-server ~]$ ntpstat
synchronised to local net at stratum 11
   time correct to within 7948 ms
   polling server every 64 s
[op@docker-rancher-server ~]$ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 docker-rancher- .INIT.          16 u    -   64    0    0.000    0.000   0.000
*LOCAL(0)        .LOCL.          10 l   19   64    3    0.000    0.000   0.000

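# Distribute the config file to the other nodes
# Start the service
[op@docker-rancher-server ~]$ sudo systemctl restart  ntpd.service
# Synchronization takes quite a while here; carry on with the later steps and check back afterwards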
[op@docker-rancher-client1 ~]$ ntpstat
synchronised to NTP server (10.142.246.2) at stratum 12
   time correct to within 29 ms
   polling server every 1024 s
[op@docker-rancher-client1 ~]$ ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*docker-rancher- LOCAL(0)        11 u   51  512  377    1.700    1.735   1.302
 LOCAL(0)        .LOCL.          10 l 220m   64    0    0.000    0.000   0.000
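
In ntpq -p output, the peer currently selected for synchronization is marked with a leading "*", and the offset column is in milliseconds. Ceph monitors report clock skew once the drift exceeds mon clock drift allowed (0.05 s by default), so it is worth checking the offset on each node. A minimal sketch, assuming standard awk; the function name ntp_selected_peer is hypothetical.

```shell
# Read `ntpq -p` output on stdin and print "<peer> <offset-ms>" for the selected peer.
# Returns non-zero when no peer is marked with "*", i.e. the node is not synchronized yet.
ntp_selected_peer() {
    awk '/^\*/ { print substr($1, 2), $9; found = 1 }
         END { exit found ? 0 : 1 }'
}
```

On a node, ntpq -p | ntp_selected_peer prints the sync source and its offset, or fails while ntpd is still selecting a peer.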

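4. Add the EPEL and Ceph repositories

For how the EPEL and Ceph repository mirrors were built, see the article on syncing various repositories.

I had already added the EPEL repository earlier, so this step just checks its configuration: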

[op@docker-rancher-server yum.repos.d]$ sudo vim /etc/yum.repos.d/epel.repo
[epel]
name=Extra Packages for Enterprise Linux 7 - x86_64
baseurl=http://10.142.78.40/epel/7/x86_64
failovermethod=priority
enabled=1
gpgcheck=1
gpgkey=http://10.142.78.40/epel/RPM-GPG-KEY-EPEL-7

[epel-debuginfo]
name=Extra Packages for Enterprise Linux 7 - x86_64 - Debug
baseurl=http://10.142.78.40/epel/7/x86_64/debug
failovermethod=priority
enabled=0
gpgkey=http://10.142.78.40/epel/RPM-GPG-KEY-EPEL-7
gpgcheck=1
priority=2

# Verify
[op@docker-rancher-server ~]$ yum repolist
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
repo id   repo name                                         status
base      RHEL-7 - Base - http                               8,652
epel      Extra Packages for Enterprise Linux 7 - x86_64    10,846
updates   CentOS-7 - Updates                                 3,723
repolist: 23,221

The Ceph repository is also mirrored inside the company; add it as follows:

[op@docker-rancher-server ~]$ sudo vim /etc/yum.repos.d/ceph.repo
[ceph]
name=Ceph packages for x86_64
baseurl=http://10.142.78.40/ceph/rpm-jewel/el7/x86_64
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=http://10.142.78.40/ceph/keys/release.asc
priority=1

[ceph-noarch]
name=Ceph noarch packages
baseurl=http://10.142.78.40/ceph/rpm-jewel/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=http://10.142.78.40/ceph/keys/release.asc
priority=1

[ceph-source]
name=Ceph source packages
baseurl=http://10.142.78.40/ceph/rpm-jewel/el7/SRPMS
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=http://10.142.78.40/ceph/keys/release.asc
priority=1

# Verify
[op@docker-rancher-server ~]$ yum repolist
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
repo id     repo name                                       status
base        RHEL-7 - Base - http                             8,652
ceph        Ceph packages for x86_64                           231
ceph-noarch Ceph noarch packages                                12
ceph-source Ceph source packages                                 0
epel        Extra Packages for Enterprise Linux 7 - x86_64  10,846
updates     CentOS-7 - Updates                               3,723
repolist: 23,464

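5. Create a deployment user other than ceph

The ceph user name is reserved for the Ceph daemons, so ceph-deploy needs a different user. The company's servers already have an op user by default, so no new user has to be created.

Also, ==be sure to grant this user sudo privileges== (ceph-deploy needs passwordless sudo on every node).

6. Passwordless SSH access between nodes

This step is straightforward, so it is not covered in detail here.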
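
For readers who have not done this before, the usual approach is to generate a key pair on the admin node and copy the public key to every node with ssh-copy-id. The sketch below is one way to script it, not the author's exact procedure; the helper names run and setup_ssh_keys and the DRY_RUN switch are hypothetical. Setting DRY_RUN=1 prints the commands without running them.

```shell
# Echo the command when DRY_RUN=1; otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: setup_ssh_keys USER NODE...
setup_ssh_keys() {
    local user="$1"; shift
    local node
    # Create a passphrase-less RSA key pair once, if none exists yet.
    if [ ! -f "$HOME/.ssh/id_rsa" ]; then
        run ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
    fi
    # Install the public key on every node; ssh-copy-id asks for the password once per node.
    for node in "$@"; do
        run ssh-copy-id "$user@$node"
    done
}
```

On the admin node this would be called as setup_ssh_keys op docker-rancher-server docker-rancher-client1 docker-rancher-client2 hub.chinatelecom.cn, after which ssh docker-rancher-client1 should log in without a password prompt.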

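In addition, the official documentation recommends configuring ~/.ssh/config so that ceph-deploy can log in to the Ceph nodes as the user you created, without passing --username {username} on every ceph-deploy run. This also simplifies ssh and scp usage.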

[op@docker-rancher-server ~]$ vim ~/.ssh/config
# Host is an alias, Hostname the real host name, User the login user.
# Comments sit on their own lines because older OpenSSH releases may reject trailing comments.
Host ceph-node1
   Hostname docker-rancher-server
   User op
Host ceph-node2
   Hostname docker-rancher-client1
   User op
Host ceph-node3
   Hostname docker-rancher-client2
   User op
Host ceph-node4
   Hostname hub.chinatelecom.cn
   User op

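# Tighten the permissions; this is mandatory, otherwise ssh refuses to use the files
[op@docker-rancher-server ~]$ chmod 600 .ssh/*

# The point of this config: for example, when logged in as root, the config file makes connections go out as op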

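7. Handle requiretty

Running ceph-deploy on CentOS and RHEL may fail with an error. If requiretty is set by default on your Ceph nodes, disable it: run sudo visudo, find the Defaults requiretty option, and either change it to Defaults:ceph !requiretty (ceph is the deployment user name in the official docs; here it would be op) or simply comment it out. ceph-deploy can then connect as the user created earlier (the user for deploying Ceph).

# Run on all nodes; simply comment the line out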
[op@docker-rancher-server ~]$ sudo vim /etc/sudoers
# Defaults    requiretty

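8. Disable SELinux

sudo vim /etc/selinux/config
SELINUX=disabled
# Apply immediately: setenforce 0 switches to permissive mode; disabled takes full effect after a reboot
[op@ceph-node1 ~]$ sudo setenforce 0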
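
Because this change is needed on every node, editing the file non-interactively can be more convenient than opening vim four times. A minimal sketch, assuming GNU sed (as shipped with CentOS) for the -i option; the function name disable_selinux_in_config is hypothetical.

```shell
# Set SELINUX=disabled in an SELinux config file (default: /etc/selinux/config).
# Only the SELINUX= line is touched; SELINUXTYPE= and comments are left alone.
disable_selinux_in_config() {
    local cfg="${1:-/etc/selinux/config}"
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"
}
```

The real file is owned by root, so on a node the same sed expression would be run through sudo, followed by sudo setenforce 0 as above.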

9. Install ceph-deploy

sudo yum install ceph-deploy -y

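Cluster deployment

Reference: the official Ceph website

The following steps are run on the admin node (docker-rancher-server in this setup).

==Important: if you are logged in as a different regular user, do not run ceph-deploy with sudo or as root, because it will not issue the sudo commands required on the remote hosts.==

1. Create the cluster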

# Create a cluster directory; the config files will be generated in it
[op@docker-rancher-server ~]$ mkdir ceph && cd ceph
# Create the monitors (at least one)
[op@ceph-node1 ceph]$ ceph-deploy new docker-rancher-server docker-rancher-client1

# Verify that the config files were generated
[op@docker-rancher-server ceph]$ ls
ceph.conf  ceph-deploy-ceph.log  ceph.mon.keyring

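# Edit the default config file
vim ceph.conf
osd pool default min size = 2
osd pool default size = 3

# With multiple NICs, data traffic can be placed on a 10GbE NIC; the test environment does not have one yet
# mon_clock_drift_allowed=5  # unit is seconds (the default is 0.05)
# osd_pool_default_crush_rule=0
# osd_crush_chooseleaf_type=1
# public network=10.10.0.0/24  # public (client-facing) network
# cluster network=192.168.0.0/24 # cluster (internal replication) network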
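
With osd pool default size = 3, every object is stored three times, and osd pool default min size = 2 lets a pool keep serving I/O as long as two replicas are available. A rough consequence for capacity planning is that usable space is about the raw space divided by the replica count. The arithmetic can be sketched as below; the function name usable_gb is hypothetical, 1 TB is taken as 1024 GB, and file system overhead and Ceph's full-ratio headroom are ignored.

```shell
# Rough usable capacity (in GB) of a replicated pool: raw capacity divided by the replica count.
# Usage: usable_gb OSD_COUNT OSD_SIZE_GB REPLICA_COUNT
usable_gb() {
    echo $(( $1 * $2 / $3 ))
}
```

For this cluster, usable_gb 4 1024 3 gives 1365, so the four 1 TB data disks provide roughly 1.3 TB of usable space at three replicas.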

2. Install Ceph

[op@docker-rancher-server ceph]$ ceph-deploy install docker-rancher-server docker-rancher-client1 docker-rancher-client2 hub.chinatelecom.cn
[ceph_deploy.conf][DEBUG ] found configuration file at: /usr/op/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.36): /usr/bin/ceph-deploy install docker-rancher-server docker-rancher-client1 docker-rancher-client2 hub.chinatelecom.cn
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  testing                       : None
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f87fd8a7d40>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  dev_commit                    : None
[ceph_deploy.cli][INFO  ]  install_mds                   : False
[ceph_deploy.cli][INFO  ]  stable                        : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  adjust_repos                  : True
[ceph_deploy.cli][INFO  ]  func                          : <function install at 0x7f87fe79a578>
[ceph_deploy.cli][INFO  ]  install_all                   : False
[ceph_deploy.cli][INFO  ]  repo                          : False
[ceph_deploy.cli][INFO  ]  host                          : ['docker-rancher-server', 'docker-rancher-client1', 'docker-rancher-client2', 'hub.chinatelecom.cn']
[ceph_deploy.cli][INFO  ]  install_rgw                   : False
[ceph_deploy.cli][INFO  ]  install_tests                 : False
[ceph_deploy.cli][INFO  ]  repo_url                      : None
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  install_osd                   : False
[ceph_deploy.cli][INFO  ]  version_kind                  : stable
[ceph_deploy.cli][INFO  ]  install_common                : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  dev                           : master
[ceph_deploy.cli][INFO  ]  nogpgcheck                    : False
[ceph_deploy.cli][INFO  ]  local_mirror                  : None
[ceph_deploy.cli][INFO  ]  release                       : None
[ceph_deploy.cli][INFO  ]  install_mon                   : False
[ceph_deploy.cli][INFO  ]  gpg_url                       : None
[ceph_deploy.install][DEBUG ] Installing stable version jewel on cluster ceph hosts docker-rancher-server docker-rancher-client1 docker-rancher-client2 hub.chinatelecom.cn
[ceph_deploy.install][DEBUG ] Detecting platform for host docker-rancher-server ...
[docker-rancher-server][DEBUG ] connection detected need for sudo
[docker-rancher-server][DEBUG ] connected to host: docker-rancher-server
[docker-rancher-server][DEBUG ] detect platform information from remote host
[docker-rancher-server][DEBUG ] detect machine type
[ceph_deploy.install][INFO  ] Distro info: CentOS Linux 7.1.1503 Core
[docker-rancher-server][INFO  ] installing Ceph on docker-rancher-server
******
[hub.chinatelecom.cn][INFO  ] Running command: sudo ceph --version
[hub.chinatelecom.cn][DEBUG ] ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b)

Initialize the cluster

[op@docker-rancher-server ceph]$ ceph-deploy mon create-initial
[ceph_deploy.conf][DEBUG ] found configuration file at: /usr/op/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.36): /usr/bin/ceph-deploy mon create-initial
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : create-initial
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0xd5c710>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : <function mon at 0xd541b8>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  keyrings                      : None
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts docker-rancher-server docker-rancher-client1
[ceph_deploy.mon][DEBUG ] detecting platform for host docker-rancher-server ...
[docker-rancher-server][DEBUG ] connection detected need for sudo
[docker-rancher-server][DEBUG ] connected to host: docker-rancher-server
[docker-rancher-server][DEBUG ] detect platform information from remote host
[docker-rancher-server][DEBUG ] detect machine type
[docker-rancher-server][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO  ] distro info: CentOS Linux 7.1.1503 Core
[docker-rancher-server][DEBUG ] determining if provided host has same hostname in remote
[docker-rancher-server][DEBUG ] get remote short hostname
[docker-rancher-server][DEBUG ] deploying mon to docker-rancher-server
[docker-rancher-server][DEBUG ] get remote short hostname
[docker-rancher-server][DEBUG ] remote hostname: docker-rancher-server
[docker-rancher-server][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[docker-rancher-server][DEBUG ] create the mon path if it does not exist
[docker-rancher-server][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-docker-rancher-server/done
[docker-rancher-server][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-docker-rancher-server/done
[docker-rancher-server][INFO  ] creating keyring file: /var/lib/ceph/tmp/ceph-docker-rancher-server.mon.keyring
[docker-rancher-server][DEBUG ] create the monitor keyring file
[docker-rancher-server][INFO  ] Running command: sudo ceph-mon --cluster ceph --mkfs -i docker-rancher-server --keyring /var/lib/ceph/tmp/ceph-docker-rancher-server.mon.keyring --setuser 167 --setgroup 167
[docker-rancher-server][DEBUG ] ceph-mon: mon.noname-a 10.142.246.2:6789/0 is local, renaming to mon.docker-rancher-server
[docker-rancher-server][DEBUG ] ceph-mon: set fsid to ef81681c-ee15-412e-a752-2c3e87b9e369
[docker-rancher-server][DEBUG ] ceph-mon: created monfs at /var/lib/ceph/mon/ceph-docker-rancher-server for mon.docker-rancher-server
[docker-rancher-server][INFO  ] unlinking keyring file /var/lib/ceph/tmp/ceph-docker-rancher-server.mon.keyring
[docker-rancher-server][DEBUG ] create a done file to avoid re-doing the mon deployment
[docker-rancher-server][DEBUG ] create the init path if it does not exist
[docker-rancher-server][INFO  ] Running command: sudo systemctl enable ceph.target
[docker-rancher-server][INFO  ] Running command: sudo systemctl enable ceph-mon@docker-rancher-server
[docker-rancher-server][WARNIN] Created symlink from /etc/systemd/system/ceph-mon.target.wants/ceph-mon@docker-rancher-server.service to /usr/lib/systemd/system/ceph-mon@.service.
[docker-rancher-server][INFO  ] Running command: sudo systemctl start ceph-mon@docker-rancher-server
[docker-rancher-server][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.docker-rancher-server.asok mon_status
[docker-rancher-server][DEBUG ] ********************************************************************************
[docker-rancher-server][DEBUG ] status for monitor: mon.docker-rancher-server
[docker-rancher-server][DEBUG ] {
[docker-rancher-server][DEBUG ]   "election_epoch": 0,
[docker-rancher-server][DEBUG ]   "extra_probe_peers": [
[docker-rancher-server][DEBUG ]     "10.142.246.3:6789/0"
[docker-rancher-server][DEBUG ]   ],
[docker-rancher-server][DEBUG ]   "monmap": {
[docker-rancher-server][DEBUG ]     "created": "2016-11-28 12:38:30.861132",
[docker-rancher-server][DEBUG ]     "epoch": 0,
[docker-rancher-server][DEBUG ]     "fsid": "ef81681c-ee15-412e-a752-2c3e87b9e369",
[docker-rancher-server][DEBUG ]     "modified": "2016-11-28 12:38:30.861132",
[docker-rancher-server][DEBUG ]     "mons": [
[docker-rancher-server][DEBUG ]       {
[docker-rancher-server][DEBUG ]         "addr": "10.142.246.2:6789/0",
[docker-rancher-server][DEBUG ]         "name": "docker-rancher-server",
[docker-rancher-server][DEBUG ]         "rank": 0
[docker-rancher-server][DEBUG ]       },
[docker-rancher-server][DEBUG ]       {
[docker-rancher-server][DEBUG ]         "addr": "0.0.0.0:0/1",
[docker-rancher-server][DEBUG ]         "name": "docker-rancher-client1",
[docker-rancher-server][DEBUG ]         "rank": 1
[docker-rancher-server][DEBUG ]       }
[docker-rancher-server][DEBUG ]     ]
[docker-rancher-server][DEBUG ]   },
[docker-rancher-server][DEBUG ]   "name": "docker-rancher-server",
[docker-rancher-server][DEBUG ]   "outside_quorum": [
[docker-rancher-server][DEBUG ]     "docker-rancher-server"
[docker-rancher-server][DEBUG ]   ],
[docker-rancher-server][DEBUG ]   "quorum": [],
[docker-rancher-server][DEBUG ]   "rank": 0,
[docker-rancher-server][DEBUG ]   "state": "probing",
[docker-rancher-server][DEBUG ]   "sync_provider": []
[docker-rancher-server][DEBUG ] }
[docker-rancher-server][DEBUG ] ********************************************************************************
[docker-rancher-server][INFO  ] monitor: mon.docker-rancher-server is running
[docker-rancher-server][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.docker-rancher-server.asok mon_status
[ceph_deploy.mon][DEBUG ] detecting platform for host docker-rancher-client1 ...
[docker-rancher-client1][DEBUG ] connection detected need for sudo
[docker-rancher-client1][DEBUG ] connected to host: docker-rancher-client1
[docker-rancher-client1][DEBUG ] detect platform information from remote host
[docker-rancher-client1][DEBUG ] detect machine type
[docker-rancher-client1][DEBUG ] find the location of an executable
[ceph_deploy.mon][INFO  ] distro info: CentOS Linux 7.1.1503 Core
[docker-rancher-client1][DEBUG ] determining if provided host has same hostname in remote
[docker-rancher-client1][DEBUG ] get remote short hostname
[docker-rancher-client1][DEBUG ] deploying mon to docker-rancher-client1
[docker-rancher-client1][DEBUG ] get remote short hostname
[docker-rancher-client1][DEBUG ] remote hostname: docker-rancher-client1
[docker-rancher-client1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[docker-rancher-client1][DEBUG ] create the mon path if it does not exist
[docker-rancher-client1][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-docker-rancher-client1/done
[docker-rancher-client1][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-docker-rancher-client1/done
[docker-rancher-client1][INFO  ] creating keyring file: /var/lib/ceph/tmp/ceph-docker-rancher-client1.mon.keyring
[docker-rancher-client1][DEBUG ] create the monitor keyring file
[docker-rancher-client1][INFO  ] Running command: sudo ceph-mon --cluster ceph --mkfs -i docker-rancher-client1 --keyring /var/lib/ceph/tmp/ceph-docker-rancher-client1.mon.keyring --setuser 167 --setgroup 167
[docker-rancher-client1][DEBUG ] ceph-mon: mon.noname-b 10.142.246.3:6789/0 is local, renaming to mon.docker-rancher-client1
[docker-rancher-client1][DEBUG ] ceph-mon: set fsid to ef81681c-ee15-412e-a752-2c3e87b9e369
[docker-rancher-client1][DEBUG ] ceph-mon: created monfs at /var/lib/ceph/mon/ceph-docker-rancher-client1 for mon.docker-rancher-client1
[docker-rancher-client1][INFO  ] unlinking keyring file /var/lib/ceph/tmp/ceph-docker-rancher-client1.mon.keyring
[docker-rancher-client1][DEBUG ] create a done file to avoid re-doing the mon deployment
[docker-rancher-client1][DEBUG ] create the init path if it does not exist
[docker-rancher-client1][INFO  ] Running command: sudo systemctl enable ceph.target
[docker-rancher-client1][INFO  ] Running command: sudo systemctl enable ceph-mon@docker-rancher-client1
[docker-rancher-client1][WARNIN] Created symlink from /etc/systemd/system/ceph-mon.target.wants/ceph-mon@docker-rancher-client1.service to /usr/lib/systemd/system/ceph-mon@.service.
[docker-rancher-client1][INFO  ] Running command: sudo systemctl start ceph-mon@docker-rancher-client1
[docker-rancher-client1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.docker-rancher-client1.asok mon_status
[docker-rancher-client1][DEBUG ] ********************************************************************************
[docker-rancher-client1][DEBUG ] status for monitor: mon.docker-rancher-client1
[docker-rancher-client1][DEBUG ] {
[docker-rancher-client1][DEBUG ]   "election_epoch": 2,
[docker-rancher-client1][DEBUG ]   "extra_probe_peers": [
[docker-rancher-client1][DEBUG ]     "10.142.246.2:6789/0"
[docker-rancher-client1][DEBUG ]   ],
[docker-rancher-client1][DEBUG ]   "monmap": {
[docker-rancher-client1][DEBUG ]     "created": "2016-11-28 12:38:30.861132",
[docker-rancher-client1][DEBUG ]     "epoch": 1,
[docker-rancher-client1][DEBUG ]     "fsid": "ef81681c-ee15-412e-a752-2c3e87b9e369",
[docker-rancher-client1][DEBUG ]     "modified": "2016-11-28 12:38:30.861132",
[docker-rancher-client1][DEBUG ]     "mons": [
[docker-rancher-client1][DEBUG ]       {
[docker-rancher-client1][DEBUG ]         "addr": "10.142.246.2:6789/0",
[docker-rancher-client1][DEBUG ]         "name": "docker-rancher-server",
[docker-rancher-client1][DEBUG ]         "rank": 0
[docker-rancher-client1][DEBUG ]       },
[docker-rancher-client1][DEBUG ]       {
[docker-rancher-client1][DEBUG ]         "addr": "10.142.246.3:6789/0",
[docker-rancher-client1][DEBUG ]         "name": "docker-rancher-client1",
[docker-rancher-client1][DEBUG ]         "rank": 1
[docker-rancher-client1][DEBUG ]       }
[docker-rancher-client1][DEBUG ]     ]
[docker-rancher-client1][DEBUG ]   },
[docker-rancher-client1][DEBUG ]   "name": "docker-rancher-client1",
[docker-rancher-client1][DEBUG ]   "outside_quorum": [
[docker-rancher-client1][DEBUG ]     "docker-rancher-client1"
[docker-rancher-client1][DEBUG ]   ],
[docker-rancher-client1][DEBUG ]   "quorum": [],
[docker-rancher-client1][DEBUG ]   "rank": 1,
[docker-rancher-client1][DEBUG ]   "state": "probing",
[docker-rancher-client1][DEBUG ]   "sync_provider": []
[docker-rancher-client1][DEBUG ] }
[docker-rancher-client1][DEBUG ] ********************************************************************************
[docker-rancher-client1][INFO  ] monitor: mon.docker-rancher-client1 is running
[docker-rancher-client1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.docker-rancher-client1.asok mon_status
[ceph_deploy.mon][INFO  ] processing monitor mon.docker-rancher-server
[docker-rancher-server][DEBUG ] connection detected need for sudo
[docker-rancher-server][DEBUG ] connected to host: docker-rancher-server
[docker-rancher-server][DEBUG ] detect platform information from remote host
[docker-rancher-server][DEBUG ] detect machine type
[docker-rancher-server][DEBUG ] find the location of an executable
[docker-rancher-server][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.docker-rancher-server.asok mon_status
[ceph_deploy.mon][INFO  ] mon.docker-rancher-server monitor has reached quorum!
[ceph_deploy.mon][INFO  ] processing monitor mon.docker-rancher-client1
[docker-rancher-client1][DEBUG ] connection detected need for sudo
[docker-rancher-client1][DEBUG ] connected to host: docker-rancher-client1
[docker-rancher-client1][DEBUG ] detect platform information from remote host
[docker-rancher-client1][DEBUG ] detect machine type
[docker-rancher-client1][DEBUG ] find the location of an executable
[docker-rancher-client1][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.docker-rancher-client1.asok mon_status
[ceph_deploy.mon][INFO  ] mon.docker-rancher-client1 monitor has reached quorum!
[ceph_deploy.mon][INFO  ] all initial monitors are running and have formed quorum
[ceph_deploy.mon][INFO  ] Running gatherkeys...
[ceph_deploy.gatherkeys][INFO  ] Storing keys in temp directory /tmp/tmpCsnUv3
[docker-rancher-server][DEBUG ] connection detected need for sudo
[docker-rancher-server][DEBUG ] connected to host: docker-rancher-server
[docker-rancher-server][DEBUG ] detect platform information from remote host
[docker-rancher-server][DEBUG ] detect machine type
[docker-rancher-server][DEBUG ] get remote short hostname
[docker-rancher-server][DEBUG ] fetch remote file
[docker-rancher-server][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.docker-rancher-server.asok mon_status
[docker-rancher-server][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-docker-rancher-server/keyring auth get client.admin
[docker-rancher-server][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-docker-rancher-server/keyring auth get client.bootstrap-mds
[docker-rancher-server][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-docker-rancher-server/keyring auth get client.bootstrap-osd
[docker-rancher-server][INFO  ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-docker-rancher-server/keyring auth get client.bootstrap-rgw
[ceph_deploy.gatherkeys][INFO  ] Storing ceph.client.admin.keyring
[ceph_deploy.gatherkeys][INFO  ] Storing ceph.bootstrap-mds.keyring
[ceph_deploy.gatherkeys][INFO  ] keyring 'ceph.mon.keyring' already exists
[ceph_deploy.gatherkeys][INFO  ] Storing ceph.bootstrap-osd.keyring
[ceph_deploy.gatherkeys][INFO  ] Storing ceph.bootstrap-rgw.keyring
[ceph_deploy.gatherkeys][INFO  ] Destroy temp directory /tmp/tmpCsnUv3

Verify: the following files should have been generated

[op@docker-rancher-server ceph]$ ll
total 472
-rw------- 1 op op    113 Nov 28 12:38 ceph.bootstrap-mds.keyring
-rw------- 1 op op    113 Nov 28 12:38 ceph.bootstrap-osd.keyring
-rw------- 1 op op    113 Nov 28 12:38 ceph.bootstrap-rgw.keyring
-rw------- 1 op op    129 Nov 28 12:38 ceph.client.admin.keyring
-rw-rw-r-- 1 op op    302 Nov 25 15:32 ceph.conf
-rw-rw-r-- 1 op op 422039 Nov 28 12:38 ceph-deploy-ceph.log
-rw------- 1 op op     73 Nov 25 15:31 ceph.mon.keyring
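
Instead of comparing the listing by eye, the check can be scripted. The sketch below looks for the config file and the keyrings shown in the listing above; the function name missing_deploy_files is hypothetical.

```shell
# Print the name of every expected ceph-deploy output file that is missing from DIR.
# Usage: missing_deploy_files [DIR]   (defaults to the current directory)
missing_deploy_files() {
    local dir="${1:-.}" f
    for f in ceph.conf ceph.mon.keyring ceph.client.admin.keyring \
             ceph.bootstrap-osd.keyring ceph.bootstrap-mds.keyring \
             ceph.bootstrap-rgw.keyring; do
        [ -f "$dir/$f" ] || echo "$f"
    done
}
```

Run inside the cluster directory, missing_deploy_files prints nothing when mon create-initial has gathered all keys; any name it prints points at a key that was not collected.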

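3. Add OSDs

Because of the constraints of the test environment, this installation temporarily uses a directory (on the mounted data disk) as the OSD directory; ==a future production environment should use whole disks instead.==

Preparation

# Create the directory on all nodes
[op@docker-rancher-server data]$  sudo mkdir -p /data/ceph
# Change the ownership, otherwise an error is reported
[op@docker-rancher-server data]$  sudo chown -R ceph:ceph /data/ceph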
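
Since the directory has to exist with the right owner on all four nodes, the two commands can be generated per node and run over SSH from the admin node. This is only a sketch: it assumes the passwordless SSH and sudo setup from the earlier steps, and the function name osd_dir_commands is hypothetical.

```shell
# Print one ssh command per node that creates the OSD directory and hands it to the ceph user.
# Usage: osd_dir_commands DIR NODE...
osd_dir_commands() {
    local dir="$1"; shift
    local node
    for node in "$@"; do
        echo "ssh $node 'sudo mkdir -p $dir && sudo chown -R ceph:ceph $dir'"
    done
}
```

Review the output of osd_dir_commands /data/ceph docker-rancher-server docker-rancher-client1 docker-rancher-client2 hub.chinatelecom.cn, then pipe it to sh to run it.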

Add the OSDs

[op@docker-rancher-server ceph]$ ceph-deploy osd prepare  docker-rancher-server:/data/ceph docker-rancher-client1:/data/ceph docker-rancher-client2:/data/ceph hub.chinatelecom.cn:/data/ceph
[ceph_deploy.conf][DEBUG ] found configuration file at: /usr/op/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.36): /usr/bin/ceph-deploy osd prepare docker-rancher-server:/data/ceph docker-rancher-client1:/data/ceph docker-rancher-client2:/data/ceph hub.chinatelecom.cn:/data/ceph
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  disk                          : [('docker-rancher-server', '/data/ceph', None), ('docker-rancher-client1', '/data/ceph', None), ('docker-rancher-client2', '/data/ceph', None), ('hub.chinatelecom.cn', '/data/ceph', None)]
[ceph_deploy.cli][INFO  ]  dmcrypt                       : False
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  bluestore                     : None
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : prepare
[ceph_deploy.cli][INFO  ]  dmcrypt_key_dir               : /etc/ceph/dmcrypt-keys
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x15e97a0>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  fs_type                       : xfs
[ceph_deploy.cli][INFO  ]  func                          : <function osd at 0x15dba28>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  zap_disk                      : False
[ceph_deploy.osd][DEBUG ] Preparing cluster ceph disks docker-rancher-server:/data/ceph: docker-rancher-client1:/data/ceph: docker-rancher-client2:/data/ceph: hub.chinatelecom.cn:/data/ceph:
[docker-rancher-server][DEBUG ] connection detected need for sudo
[docker-rancher-server][DEBUG ] connected to host: docker-rancher-server
[docker-rancher-server][DEBUG ] detect platform information from remote host
[docker-rancher-server][DEBUG ] detect machine type
[docker-rancher-server][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.1.1503 Core
[ceph_deploy.osd][DEBUG ] Deploying osd to docker-rancher-server
[docker-rancher-server][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.osd][DEBUG ] Preparing host docker-rancher-server disk /data/ceph journal None activate False
[docker-rancher-server][DEBUG ] find the location of an executable
[docker-rancher-server][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v prepare --cluster ceph --fs-type xfs -- /data/ceph
[docker-rancher-server][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
[docker-rancher-server][WARNIN] command: Running command: /usr/bin/ceph-osd --check-allows-journal -i 0 --cluster ceph
[docker-rancher-server][WARNIN] command: Running command: /usr/bin/ceph-osd --check-wants-journal -i 0 --cluster ceph
[docker-rancher-server][WARNIN] command: Running command: /usr/bin/ceph-osd --check-needs-journal -i 0 --cluster ceph
[docker-rancher-server][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=osd_journal_size
[docker-rancher-server][WARNIN] populate_data_path: Preparing osd data dir /data/ceph
[docker-rancher-server][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/ceph_fsid.68575.tmp
[docker-rancher-server][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/ceph_fsid.68575.tmp
[docker-rancher-server][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/fsid.68575.tmp
[docker-rancher-server][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/fsid.68575.tmp
[docker-rancher-server][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/magic.68575.tmp
[docker-rancher-server][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/magic.68575.tmp
[docker-rancher-server][INFO  ] checking OSD status...
[docker-rancher-server][DEBUG ] find the location of an executable
[docker-rancher-server][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
[ceph_deploy.osd][DEBUG ] Host docker-rancher-server is now ready for osd use.
[docker-rancher-client1][DEBUG ] connection detected need for sudo
[docker-rancher-client1][DEBUG ] connected to host: docker-rancher-client1
[docker-rancher-client1][DEBUG ] detect platform information from remote host
[docker-rancher-client1][DEBUG ] detect machine type
[docker-rancher-client1][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.1.1503 Core
[ceph_deploy.osd][DEBUG ] Deploying osd to docker-rancher-client1
[docker-rancher-client1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.osd][DEBUG ] Preparing host docker-rancher-client1 disk /data/ceph journal None activate False
[docker-rancher-client1][DEBUG ] find the location of an executable
[docker-rancher-client1][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v prepare --cluster ceph --fs-type xfs -- /data/ceph
[docker-rancher-client1][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
[docker-rancher-client1][WARNIN] command: Running command: /usr/bin/ceph-osd --check-allows-journal -i 0 --cluster ceph
[docker-rancher-client1][WARNIN] command: Running command: /usr/bin/ceph-osd --check-wants-journal -i 0 --cluster ceph
[docker-rancher-client1][WARNIN] command: Running command: /usr/bin/ceph-osd --check-needs-journal -i 0 --cluster ceph
[docker-rancher-client1][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=osd_journal_size
[docker-rancher-client1][WARNIN] populate_data_path: Preparing osd data dir /data/ceph
[docker-rancher-client1][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/ceph_fsid.31263.tmp
[docker-rancher-client1][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/ceph_fsid.31263.tmp
[docker-rancher-client1][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/fsid.31263.tmp
[docker-rancher-client1][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/fsid.31263.tmp
[docker-rancher-client1][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/magic.31263.tmp
[docker-rancher-client1][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/magic.31263.tmp
[docker-rancher-client1][INFO  ] checking OSD status...
[docker-rancher-client1][DEBUG ] find the location of an executable
[docker-rancher-client1][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
[ceph_deploy.osd][DEBUG ] Host docker-rancher-client1 is now ready for osd use.
[docker-rancher-client2][DEBUG ] connection detected need for sudo
[docker-rancher-client2][DEBUG ] connected to host: docker-rancher-client2
[docker-rancher-client2][DEBUG ] detect platform information from remote host
[docker-rancher-client2][DEBUG ] detect machine type
[docker-rancher-client2][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.1.1503 Core
[ceph_deploy.osd][DEBUG ] Deploying osd to docker-rancher-client2
[docker-rancher-client2][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[docker-rancher-client2][WARNIN] osd keyring does not exist yet, creating one
[docker-rancher-client2][DEBUG ] create a keyring file
[ceph_deploy.osd][DEBUG ] Preparing host docker-rancher-client2 disk /data/ceph journal None activate False
[docker-rancher-client2][DEBUG ] find the location of an executable
[docker-rancher-client2][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v prepare --cluster ceph --fs-type xfs -- /data/ceph
[docker-rancher-client2][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
[docker-rancher-client2][WARNIN] command: Running command: /usr/bin/ceph-osd --check-allows-journal -i 0 --cluster ceph
[docker-rancher-client2][WARNIN] command: Running command: /usr/bin/ceph-osd --check-wants-journal -i 0 --cluster ceph
[docker-rancher-client2][WARNIN] command: Running command: /usr/bin/ceph-osd --check-needs-journal -i 0 --cluster ceph
[docker-rancher-client2][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=osd_journal_size
[docker-rancher-client2][WARNIN] populate_data_path: Preparing osd data dir /data/ceph
[docker-rancher-client2][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/ceph_fsid.101240.tmp
[docker-rancher-client2][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/ceph_fsid.101240.tmp
[docker-rancher-client2][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/fsid.101240.tmp
[docker-rancher-client2][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/fsid.101240.tmp
[docker-rancher-client2][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/magic.101240.tmp
[docker-rancher-client2][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/magic.101240.tmp
[docker-rancher-client2][INFO  ] checking OSD status...
[docker-rancher-client2][DEBUG ] find the location of an executable
[docker-rancher-client2][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
[ceph_deploy.osd][DEBUG ] Host docker-rancher-client2 is now ready for osd use.
[hub.chinatelecom.cn][DEBUG ] connection detected need for sudo
[hub.chinatelecom.cn][DEBUG ] connected to host: hub.chinatelecom.cn
[hub.chinatelecom.cn][DEBUG ] detect platform information from remote host
[hub.chinatelecom.cn][DEBUG ] detect machine type
[hub.chinatelecom.cn][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.1.1503 Core
[ceph_deploy.osd][DEBUG ] Deploying osd to hub.chinatelecom.cn
[hub.chinatelecom.cn][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[hub.chinatelecom.cn][WARNIN] osd keyring does not exist yet, creating one
[hub.chinatelecom.cn][DEBUG ] create a keyring file
[ceph_deploy.osd][DEBUG ] Preparing host hub.chinatelecom.cn disk /data/ceph journal None activate False
[hub.chinatelecom.cn][DEBUG ] find the location of an executable
[hub.chinatelecom.cn][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v prepare --cluster ceph --fs-type xfs -- /data/ceph
[hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
[hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/ceph-osd --check-allows-journal -i 0 --cluster ceph
[hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/ceph-osd --check-wants-journal -i 0 --cluster ceph
[hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/ceph-osd --check-needs-journal -i 0 --cluster ceph
[hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=osd_journal_size
[hub.chinatelecom.cn][WARNIN] populate_data_path: Preparing osd data dir /data/ceph
[hub.chinatelecom.cn][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/ceph_fsid.31875.tmp
[hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/ceph_fsid.31875.tmp
[hub.chinatelecom.cn][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/fsid.31875.tmp
[hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/fsid.31875.tmp
[hub.chinatelecom.cn][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/magic.31875.tmp
[hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/magic.31875.tmp
[hub.chinatelecom.cn][INFO  ] checking OSD status...
[hub.chinatelecom.cn][DEBUG ] find the location of an executable
[hub.chinatelecom.cn][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
[ceph_deploy.osd][DEBUG ] Host hub.chinatelecom.cn is now ready for osd use.

Activate the OSDs

[op@docker-rancher-server ceph]$  ceph-deploy osd activate  docker-rancher-server:/data/ceph docker-rancher-client1:/data/ceph docker-rancher-client2:/data/ceph hub.chinatelecom.cn:/data/ceph
[ceph_deploy.conf][DEBUG ] found configuration file at: /usr/op/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.36): /usr/bin/ceph-deploy osd activate docker-rancher-server:/data/ceph docker-rancher-client1:/data/ceph docker-rancher-client2:/data/ceph hub.chinatelecom.cn:/data/ceph
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  subcommand                    : activate
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x28e87a0>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  func                          : <function osd at 0x28daa28>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  disk                          : [('docker-rancher-server', '/data/ceph', None), ('docker-rancher-client1', '/data/ceph', None), ('docker-rancher-client2', '/data/ceph', None), ('hub.chinatelecom.cn', '/data/ceph', None)]
[ceph_deploy.osd][DEBUG ] Activating cluster ceph disks docker-rancher-server:/data/ceph: docker-rancher-client1:/data/ceph: docker-rancher-client2:/data/ceph: hub.chinatelecom.cn:/data/ceph:
[docker-rancher-server][DEBUG ] connection detected need for sudo
[docker-rancher-server][DEBUG ] connected to host: docker-rancher-server
[docker-rancher-server][DEBUG ] detect platform information from remote host
[docker-rancher-server][DEBUG ] detect machine type
[docker-rancher-server][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.1.1503 Core
[ceph_deploy.osd][DEBUG ] activating host docker-rancher-server disk /data/ceph
[ceph_deploy.osd][DEBUG ] will use init type: systemd
[docker-rancher-server][DEBUG ] find the location of an executable
[docker-rancher-server][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /data/ceph
[docker-rancher-server][WARNIN] main_activate: path = /data/ceph
[docker-rancher-server][WARNIN] activate: Cluster uuid is ef81681c-ee15-412e-a752-2c3e87b9e369
[docker-rancher-server][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
[docker-rancher-server][WARNIN] activate: Cluster name is ceph
[docker-rancher-server][WARNIN] activate: OSD uuid is ad4397de-63cf-4d7d-84ce-947450b4780d
[docker-rancher-server][WARNIN] allocate_osd_id: Allocating OSD id...
[docker-rancher-server][WARNIN] command: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd create --concise ad4397de-63cf-4d7d-84ce-947450b4780d
[docker-rancher-server][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/whoami.69785.tmp
[docker-rancher-server][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/whoami.69785.tmp
[docker-rancher-server][WARNIN] activate: OSD id is 0
[docker-rancher-server][WARNIN] activate: Initializing OSD...
[docker-rancher-server][WARNIN] command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /data/ceph/activate.monmap
[docker-rancher-server][WARNIN] got monmap epoch 1
[docker-rancher-server][WARNIN] command: Running command: /usr/bin/timeout 300 ceph-osd --cluster ceph --mkfs --mkkey -i 0 --monmap /data/ceph/activate.monmap --osd-data /data/ceph --osd-journal /data/ceph/journal --osd-uuid ad4397de-63cf-4d7d-84ce-947450b4780d --keyring /data/ceph/keyring --setuser ceph --setgroup ceph
[docker-rancher-server][WARNIN] activate: Marking with init system systemd
[docker-rancher-server][WARNIN] activate: Authorizing OSD key...
[docker-rancher-server][WARNIN] command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring auth add osd.0 -i /data/ceph/keyring osd allow * mon allow profile osd
[docker-rancher-server][WARNIN] added key for osd.0
[docker-rancher-server][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/active.69785.tmp
[docker-rancher-server][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/active.69785.tmp
[docker-rancher-server][WARNIN] activate: ceph osd.0 data dir is ready at /data/ceph
[docker-rancher-server][WARNIN] activate_dir: Creating symlink /var/lib/ceph/osd/ceph-0 -> /data/ceph
[docker-rancher-server][WARNIN] start_daemon: Starting ceph osd.0...
[docker-rancher-server][WARNIN] command_check_call: Running command: /usr/bin/systemctl enable ceph-osd@0
[docker-rancher-server][WARNIN] Created symlink from /etc/systemd/system/ceph-osd.target.wants/ceph-osd@0.service to /usr/lib/systemd/system/ceph-osd@.service.
[docker-rancher-server][WARNIN] command_check_call: Running command: /usr/bin/systemctl start ceph-osd@0
[docker-rancher-server][INFO  ] checking OSD status...
[docker-rancher-server][DEBUG ] find the location of an executable
[docker-rancher-server][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
[docker-rancher-server][WARNIN] there is 1 OSD down
[docker-rancher-server][WARNIN] there is 1 OSD out
[docker-rancher-server][INFO  ] Running command: sudo systemctl enable ceph.target
[docker-rancher-client1][DEBUG ] connection detected need for sudo
[docker-rancher-client1][DEBUG ] connected to host: docker-rancher-client1
[docker-rancher-client1][DEBUG ] detect platform information from remote host
[docker-rancher-client1][DEBUG ] detect machine type
[docker-rancher-client1][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.1.1503 Core
[ceph_deploy.osd][DEBUG ] activating host docker-rancher-client1 disk /data/ceph
[ceph_deploy.osd][DEBUG ] will use init type: systemd
[docker-rancher-client1][DEBUG ] find the location of an executable
[docker-rancher-client1][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /data/ceph
[docker-rancher-client1][WARNIN] main_activate: path = /data/ceph
[docker-rancher-client1][WARNIN] activate: Cluster uuid is ef81681c-ee15-412e-a752-2c3e87b9e369
[docker-rancher-client1][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
[docker-rancher-client1][WARNIN] activate: Cluster name is ceph
[docker-rancher-client1][WARNIN] activate: OSD uuid is d7160d58-ff8d-4779-a1b6-cb3a1f645c96
[docker-rancher-client1][WARNIN] allocate_osd_id: Allocating OSD id...
[docker-rancher-client1][WARNIN] command: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd create --concise d7160d58-ff8d-4779-a1b6-cb3a1f645c96
[docker-rancher-client1][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/whoami.32435.tmp
[docker-rancher-client1][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/whoami.32435.tmp
[docker-rancher-client1][WARNIN] activate: OSD id is 1
[docker-rancher-client1][WARNIN] activate: Initializing OSD...
[docker-rancher-client1][WARNIN] command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /data/ceph/activate.monmap
[docker-rancher-client1][WARNIN] got monmap epoch 1
[docker-rancher-client1][WARNIN] command: Running command: /usr/bin/timeout 300 ceph-osd --cluster ceph --mkfs --mkkey -i 1 --monmap /data/ceph/activate.monmap --osd-data /data/ceph --osd-journal /data/ceph/journal --osd-uuid d7160d58-ff8d-4779-a1b6-cb3a1f645c96 --keyring /data/ceph/keyring --setuser ceph --setgroup ceph
[docker-rancher-client1][WARNIN] activate: Marking with init system systemd
[docker-rancher-client1][WARNIN] activate: Authorizing OSD key...
[docker-rancher-client1][WARNIN] command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring auth add osd.1 -i /data/ceph/keyring osd allow * mon allow profile osd
[docker-rancher-client1][WARNIN] added key for osd.1
[docker-rancher-client1][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/active.32435.tmp
[docker-rancher-client1][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/active.32435.tmp
[docker-rancher-client1][WARNIN] activate: ceph osd.1 data dir is ready at /data/ceph
[docker-rancher-client1][WARNIN] activate_dir: Creating symlink /var/lib/ceph/osd/ceph-1 -> /data/ceph
[docker-rancher-client1][WARNIN] start_daemon: Starting ceph osd.1...
[docker-rancher-client1][WARNIN] command_check_call: Running command: /usr/bin/systemctl enable ceph-osd@1
[docker-rancher-client1][WARNIN] Created symlink from /etc/systemd/system/ceph-osd.target.wants/ceph-osd@1.service to /usr/lib/systemd/system/ceph-osd@.service.
[docker-rancher-client1][WARNIN] command_check_call: Running command: /usr/bin/systemctl start ceph-osd@1
[docker-rancher-client1][INFO  ] checking OSD status...
[docker-rancher-client1][DEBUG ] find the location of an executable
[docker-rancher-client1][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
[docker-rancher-client1][WARNIN] there are 2 OSDs down
[docker-rancher-client1][WARNIN] there are 2 OSDs out
[docker-rancher-client1][INFO  ] Running command: sudo systemctl enable ceph.target
[docker-rancher-client2][DEBUG ] connection detected need for sudo
[docker-rancher-client2][DEBUG ] connected to host: docker-rancher-client2
[docker-rancher-client2][DEBUG ] detect platform information from remote host
[docker-rancher-client2][DEBUG ] detect machine type
[docker-rancher-client2][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.1.1503 Core
[ceph_deploy.osd][DEBUG ] activating host docker-rancher-client2 disk /data/ceph
[ceph_deploy.osd][DEBUG ] will use init type: systemd
[docker-rancher-client2][DEBUG ] find the location of an executable
[docker-rancher-client2][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /data/ceph
[docker-rancher-client2][WARNIN] main_activate: path = /data/ceph
[docker-rancher-client2][WARNIN] activate: Cluster uuid is ef81681c-ee15-412e-a752-2c3e87b9e369
[docker-rancher-client2][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
[docker-rancher-client2][WARNIN] activate: Cluster name is ceph
[docker-rancher-client2][WARNIN] activate: OSD uuid is 68cf33ef-d805-41df-8683-cd6ff94c8f18
[docker-rancher-client2][WARNIN] allocate_osd_id: Allocating OSD id...
[docker-rancher-client2][WARNIN] command: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd create --concise 68cf33ef-d805-41df-8683-cd6ff94c8f18
[docker-rancher-client2][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/whoami.102520.tmp
[docker-rancher-client2][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/whoami.102520.tmp
[docker-rancher-client2][WARNIN] activate: OSD id is 2
[docker-rancher-client2][WARNIN] activate: Initializing OSD...
[docker-rancher-client2][WARNIN] command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /data/ceph/activate.monmap
[docker-rancher-client2][WARNIN] got monmap epoch 1
[docker-rancher-client2][WARNIN] command: Running command: /usr/bin/timeout 300 ceph-osd --cluster ceph --mkfs --mkkey -i 2 --monmap /data/ceph/activate.monmap --osd-data /data/ceph --osd-journal /data/ceph/journal --osd-uuid 68cf33ef-d805-41df-8683-cd6ff94c8f18 --keyring /data/ceph/keyring --setuser ceph --setgroup ceph
[docker-rancher-client2][WARNIN] activate: Marking with init system systemd
[docker-rancher-client2][WARNIN] activate: Authorizing OSD key...
[docker-rancher-client2][WARNIN] command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring auth add osd.2 -i /data/ceph/keyring osd allow * mon allow profile osd
[docker-rancher-client2][WARNIN] added key for osd.2
[docker-rancher-client2][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/active.102520.tmp
[docker-rancher-client2][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/active.102520.tmp
[docker-rancher-client2][WARNIN] activate: ceph osd.2 data dir is ready at /data/ceph
[docker-rancher-client2][WARNIN] activate_dir: Creating symlink /var/lib/ceph/osd/ceph-2 -> /data/ceph
[docker-rancher-client2][WARNIN] start_daemon: Starting ceph osd.2...
[docker-rancher-client2][WARNIN] command_check_call: Running command: /usr/bin/systemctl enable ceph-osd@2
[docker-rancher-client2][WARNIN] Created symlink from /etc/systemd/system/ceph-osd.target.wants/ceph-osd@2.service to /usr/lib/systemd/system/ceph-osd@.service.
[docker-rancher-client2][WARNIN] command_check_call: Running command: /usr/bin/systemctl start ceph-osd@2
[docker-rancher-client2][INFO  ] checking OSD status...
[docker-rancher-client2][DEBUG ] find the location of an executable
[docker-rancher-client2][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
[docker-rancher-client2][INFO  ] Running command: sudo systemctl enable ceph.target
[hub.chinatelecom.cn][DEBUG ] connection detected need for sudo
[hub.chinatelecom.cn][DEBUG ] connected to host: hub.chinatelecom.cn
[hub.chinatelecom.cn][DEBUG ] detect platform information from remote host
[hub.chinatelecom.cn][DEBUG ] detect machine type
[hub.chinatelecom.cn][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.1.1503 Core
[ceph_deploy.osd][DEBUG ] activating host hub.chinatelecom.cn disk /data/ceph
[ceph_deploy.osd][DEBUG ] will use init type: systemd
[hub.chinatelecom.cn][DEBUG ] find the location of an executable
[hub.chinatelecom.cn][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /data/ceph
[hub.chinatelecom.cn][WARNIN] main_activate: path = /data/ceph
[hub.chinatelecom.cn][WARNIN] activate: Cluster uuid is ef81681c-ee15-412e-a752-2c3e87b9e369
[hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
[hub.chinatelecom.cn][WARNIN] activate: Cluster name is ceph
[hub.chinatelecom.cn][WARNIN] activate: OSD uuid is 5cba26ef-de7d-4ddc-8fcd-f800a84b8255
[hub.chinatelecom.cn][WARNIN] allocate_osd_id: Allocating OSD id...
[hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd create --concise 5cba26ef-de7d-4ddc-8fcd-f800a84b8255
[hub.chinatelecom.cn][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/whoami.33404.tmp
[hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/whoami.33404.tmp
[hub.chinatelecom.cn][WARNIN] activate: OSD id is 3
[hub.chinatelecom.cn][WARNIN] activate: Initializing OSD...
[hub.chinatelecom.cn][WARNIN] command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /data/ceph/activate.monmap
[hub.chinatelecom.cn][WARNIN] got monmap epoch 1
[hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/timeout 300 ceph-osd --cluster ceph --mkfs --mkkey -i 3 --monmap /data/ceph/activate.monmap --osd-data /data/ceph --osd-journal /data/ceph/journal --osd-uuid 5cba26ef-de7d-4ddc-8fcd-f800a84b8255 --keyring /data/ceph/keyring --setuser ceph --setgroup ceph
[hub.chinatelecom.cn][WARNIN] activate: Marking with init system systemd
[hub.chinatelecom.cn][WARNIN] activate: Authorizing OSD key...
[hub.chinatelecom.cn][WARNIN] command_check_call: Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring auth add osd.3 -i /data/ceph/keyring osd allow * mon allow profile osd
[hub.chinatelecom.cn][WARNIN] added key for osd.3
[hub.chinatelecom.cn][WARNIN] command: Running command: /sbin/restorecon -R /data/ceph/active.33404.tmp
[hub.chinatelecom.cn][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph /data/ceph/active.33404.tmp
[hub.chinatelecom.cn][WARNIN] activate: ceph osd.3 data dir is ready at /data/ceph
[hub.chinatelecom.cn][WARNIN] activate_dir: Creating symlink /var/lib/ceph/osd/ceph-3 -> /data/ceph
[hub.chinatelecom.cn][WARNIN] start_daemon: Starting ceph osd.3...
[hub.chinatelecom.cn][WARNIN] command_check_call: Running command: /usr/bin/systemctl enable ceph-osd@3
[hub.chinatelecom.cn][WARNIN] Created symlink from /etc/systemd/system/ceph-osd.target.wants/ceph-osd@3.service to /usr/lib/systemd/system/ceph-osd@.service.
[hub.chinatelecom.cn][WARNIN] command_check_call: Running command: /usr/bin/systemctl start ceph-osd@3
[hub.chinatelecom.cn][INFO  ] checking OSD status...
[hub.chinatelecom.cn][DEBUG ] find the location of an executable
[hub.chinatelecom.cn][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat --format=json
[hub.chinatelecom.cn][INFO  ] Running command: sudo systemctl enable ceph.target
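
In the activation log above, ceph-deploy warns about OSDs being down or out on the first two hosts, while the later hosts report no such warning. A minimal sketch for confirming the final state after activation, assuming the plain-text status line has the `N osds: N up, N in` form that `ceph -s` prints in this document. The helper name `osds_all_up_in` is hypothetical and not part of Ceph.

```shell
# Hypothetical helper (not part of Ceph): reads an osdmap status line on
# stdin, e.g. "osdmap e18: 4 osds: 4 up, 4 in", and succeeds only if every
# OSD is both up and in.
osds_all_up_in() {
    awk '
        / osds: / {
            for (i = 1; i <= NF; i++) {
                if ($i == "osds:")                in_total = $(i - 1)
                if ($i == "up,")                  up_count = $(i - 1)
                if ($i == "in" || $i == "in;")    in_count = $(i - 1)
            }
            found = 1
        }
        END { exit !(found && in_total == up_count && in_total == in_count) }
    '
}

# Possible usage on a node that holds the admin keyring:
#   ceph osd stat | osds_all_up_in && echo "all OSDs are up and in"
```

The function only inspects text, so it can be tried against saved output before being pointed at a live cluster.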

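Copy the keys

# Use ceph-deploy to copy the configuration file and the admin keyring to the admin node and the Ceph nodes, so that you do not have to specify the monitor address and ceph.client.admin.keyring every time you run a Ceph command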
[op@docker-rancher-server ceph]$ ceph-deploy admin docker-rancher-server docker-rancher-client1
[ceph_deploy.conf][DEBUG ] found configuration file at: /usr/op/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.36): /usr/bin/ceph-deploy admin docker-rancher-server docker-rancher-client1
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x26724d0>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  client                        : ['docker-rancher-server', 'docker-rancher-client1']
[ceph_deploy.cli][INFO  ]  func                          : <function admin at 0x7faecb068050>
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to docker-rancher-server
[docker-rancher-server][DEBUG ] connection detected need for sudo
[docker-rancher-server][DEBUG ] connected to host: docker-rancher-server
[docker-rancher-server][DEBUG ] detect platform information from remote host
[docker-rancher-server][DEBUG ] detect machine type
[docker-rancher-server][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph_deploy.admin][DEBUG ] Pushing admin keys and conf to docker-rancher-client1
[docker-rancher-client1][DEBUG ] connection detected need for sudo
[docker-rancher-client1][DEBUG ] connected to host: docker-rancher-client1
[docker-rancher-client1][DEBUG ] detect platform information from remote host
[docker-rancher-client1][DEBUG ] detect machine type
[docker-rancher-client1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf

Make sure you have the correct permissions on ceph.client.admin.keyring

# Only needs to be run on the admin host
sudo chmod +r /etc/ceph/ceph.client.admin.keyring

Check the cluster status

[op@docker-rancher-server ceph]$ ceph health
HEALTH_ERR 64 pgs are stuck inactive for more than 300 seconds; 64 pgs stuck inactive
# If this error appears, see solution 3 in the common errors section

A healthy cluster looks like this:

[op@docker-rancher-server ceph]$ ceph health
HEALTH_OK
[op@docker-rancher-server ceph]$ ceph -s
    cluster ef81681c-ee15-412e-a752-2c3e87b9e369
     health HEALTH_OK
     monmap e1: 2 mons at {docker-rancher-client1=10.142.246.3:6789/0,docker-rancher-server=10.142.246.2:6789/0}
            election epoch 8, quorum 0,1 docker-rancher-server,docker-rancher-client1
     osdmap e18: 4 osds: 4 up, 4 in
            flags sortbitwise
      pgmap v182: 64 pgs, 1 pools, 0 bytes data, 0 objects
            281 GB used, 3455 GB / 3936 GB avail
                  64 active+clean

[op@docker-rancher-server ceph]$ ceph osd tree
ID WEIGHT  TYPE NAME                       UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 3.84436 root default
-2 0.96109     host docker-rancher-server
 0 0.96109         osd.0                        up  1.00000          1.00000
-3 0.96109     host docker-rancher-client1
 1 0.96109         osd.1                        up  1.00000          1.00000
-4 0.96109     host docker-rancher-client2
 2 0.96109         osd.2                        up  1.00000          1.00000
-5 0.96109     host hub
 3 0.96109         osd.3                        up  1.00000          1.00000 
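
Right after activation the cluster may report HEALTH_ERR before it settles, as seen above. The following is a sketch of a polling loop that waits for `HEALTH_OK`. The function name, the `HEALTH_CMD` variable, and the default retry count and interval are illustrative assumptions, not part of Ceph.

```shell
# Hypothetical helper: poll a health command until it prints HEALTH_OK.
# HEALTH_CMD can be overridden, for example to test the loop without a cluster.
HEALTH_CMD=${HEALTH_CMD:-"ceph health"}

wait_for_health_ok() {
    tries=${1:-30}      # maximum number of polls (assumed default)
    interval=${2:-10}   # seconds between polls (assumed default)
    n=0
    status=""
    while [ "$n" -lt "$tries" ]; do
        # Word splitting of HEALTH_CMD is intentional: it holds a command line.
        status=$($HEALTH_CMD 2>/dev/null) || true
        case $status in
            HEALTH_OK*) echo "$status"; return 0 ;;
        esac
        n=$((n + 1))
        sleep "$interval"
    done
    echo "cluster not healthy after $tries checks: $status" >&2
    return 1
}

# Possible usage: wait_for_health_ok 30 10 && ceph osd tree
```

If the state never clears, the loop gives up and prints the last status, which is the point at which the common errors section becomes relevant.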

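4. Add a block device

A default installation creates a storage pool named rbd. This Ceph environment currently serves only as backend storage for Docker volumes, so we use the default rbd pool directly. In a later production environment shared by several systems, create a dedicated pool for volumes.

1. View the storage pools

Run the following on the admin node.

# List the existing storage pools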
[op@docker-rancher-server ~]$ rados lspools
rbd
[op@docker-rancher-server ~]$ ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    3936G     3455G         281G          7.14
POOLS:
    NAME     ID     USED     %USED     MAX AVAIL     OBJECTS
    rbd      0         0         0         1072G           0
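# By default the rbd pool has 1072G available
# If disk space runs short later, you can adjust the pool's replica count (size); here it is lowered to 2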
[op@docker-rancher-server ~]$ ceph osd pool set rbd size 2
set pool 0 size to 2
[op@docker-rancher-server ~]$  ceph df
GLOBAL:
    SIZE      AVAIL     RAW USED     %RAW USED
    3936G     3455G         281G          7.14
POOLS:
    NAME     ID     USED     %USED     MAX AVAIL     OBJECTS
    rbd      0         0         0         1608G           0 
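
The two `ceph df` readings above are consistent with MAX AVAIL being roughly the usable raw capacity divided by the replica count: 1072G × 3 = 1608G × 2 = 3216G. This assumes the pool started at a replica size of 3, which the original output does not state. A small arithmetic sketch (the function name is hypothetical):

```shell
# Rough approximation only: Ceph's real MAX AVAIL calculation also depends on
# how full the individual OSDs are and on CRUSH weights.
# 3216 (GB) is inferred from the ceph df output above (1072G * 3).
pool_max_avail_gb() {
    usable_raw_gb=$1
    replicas=$2
    echo $(( usable_raw_gb / replicas ))
}

pool_max_avail_gb 3216 3   # prints 1072
pool_max_avail_gb 3216 2   # prints 1608
```

Lowering the size to 2 gains capacity at the cost of redundancy, since each object is then stored on only two OSDs.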

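Create the image. Format 2 is recommended for this block device so that clones and snapshots are possible later. The problem is that the kernel here is 3.10, which does not support some of the newer format 2 features (see common error 5), so the legacy format 1 is used here instead.

2. Create and map the block device

# Create a 1T image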
[op@docker-rancher-server ceph]$ rbd create docker-volume --size 1T --pool rbd  --image-format 1
rbd: image format 1 is deprecated
# The warning says that format 1 is deprecated
# For reference, the command help:
[op@docker-rancher-server ceph]$ rbd help create
usage: rbd create [--pool <pool>] [--image <image>]
                  [--image-format <image-format>] [--new-format]
                  [--order <order>] [--object-size <object-size>]
                  [--image-feature <image-feature>] [--image-shared]
                  [--stripe-unit <stripe-unit>]
                  [--stripe-count <stripe-count>]
                  [--journal-splay-width <journal-splay-width>]
                  [--journal-object-size <journal-object-size>]
                  [--journal-pool <journal-pool>] --size <size>
                  <image-spec> 

Create an empty image.

Positional arguments
  <image-spec>              image specification
                            (example: [<pool-name>/]<image-name>)

Optional arguments
  -p [ --pool ] arg         pool name
  --image arg               image name
  --image-format arg        image format [1 (deprecated) or 2]
  --new-format              use image format 2
                            (deprecated)
  --order arg               object order [12 <= order <= 25]
  --object-size arg         object size in B/K/M [4K <= object size <= 32M]
  --image-feature arg       image features
                            [layering(+), striping, exclusive-lock(+*),
                            object-map(+*), fast-diff(+*), deep-flatten(+-),
                            journaling(*)]
  --image-shared            shared image
  --stripe-unit arg         stripe unit
  --stripe-count arg        stripe count
  --journal-splay-width arg number of active journal objects
  --journal-object-size arg size of journal objects
  --journal-pool arg        pool for journal objects
  -s [ --size ] arg         image size (in M/G/T)

Image Features:
  (*) supports enabling/disabling on existing images
  (-) supports disabling-only on existing images
  (+) enabled by default for new images if features not specified

# Verify
[op@docker-rancher-server ceph]$ rbd ls
docker-volume
# Show details
[op@docker-rancher-server ceph]$ rbd info docker-volume
rbd image 'docker-volume':
    size 1024 GB in 262144 objects
    order 22 (4096 kB objects)
    block_name_prefix: rb.0.acb6.2ae8944a
    format: 1
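
The numbers in the `rbd info` output follow from the image size and the object order: an order of 22 means 2^22-byte (4096 kB) objects, and 1 TiB divided by that gives 262144 objects. A minimal arithmetic sketch, with a hypothetical function name:

```shell
# size_bytes / 2^order = number of objects backing the image.
rbd_object_count() {
    size_bytes=$1
    order=$2
    echo $(( size_bytes >> order ))
}

one_tib=$(( 1 << 40 ))           # 1 TiB, the size passed to "rbd create"
rbd_object_count "$one_tib" 22   # prints 262144
echo $(( (1 << 22) / 1024 ))     # object size in kB: prints 4096
```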

# 此步骤备选;后期如果磁盘不够,可以用以扩展。resize可大,也可小。根据需求定。
rbd resizs docker-volume --size 更改的值  

# Map the block device. Use sudo here; without it the map fails because the sysfs write is not permitted.
[op@docker-rancher-server ceph]$ sudo rbd map docker-volume --pool rbd --id admin
/dev/rbd0

# Verify the mapping
[op@docker-rancher-server ceph]$ rbd showmapped
id pool image         snap device
0  rbd  docker-volume -    /dev/rbd0 

# Note: to unmap the device:
rbd unmap /dev/rbd/{pool-name}/{image-name}
rbd unmap /dev/rbd/rbd/docker-volume
# or
rbd unmap /dev/rbd0

3. Using the block device

# Format the block device first
[op@docker-rancher-server ceph]$ sudo mkfs.ext4 -q /dev/rbd0

# Create the mount point
[root@docker-rancher-server ~]# sudo mkdir /ceph-rbd

# Mount (-t takes the filesystem type)
[op@docker-rancher-server ceph]$ sudo mount -t ext4 /dev/rbd0 /ceph-rbd

# Configure automatic mounting at boot
1. Edit the ceph rbdmap file so the image is mapped at boot
[op@docker-rancher-server ~]$ sudo vim /etc/ceph/rbdmap
# RbdDevice             Parameters
#poolname/imagename     id=client,keyring=/etc/ceph/ceph.client.keyring
rbd/docker-volume       id=admin,keyring=/etc/ceph/ceph.client.admin.keyring

2. Add a mount entry to /etc/fstab so the filesystem is mounted at boot
vim /etc/fstab
/dev/rbd0                   /ceph-rbd   ext4  defaults        1 2
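
An fstab entry with `defaults` makes the boot depend on /dev/rbd0 already existing, which is the boot failure warned about in the removal section. A hedged alternative, following the pattern in the rbdmap man page, is a `noauto` entry on the stable `/dev/rbd/{pool}/{image}` path, leaving the mount to the rbdmap service (enabled with `systemctl enable rbdmap.service`). This sketch assumes the installed rbdmap version mounts such entries after mapping; `fstab_line` is a hypothetical helper that only prints the line.

```shell
#!/usr/bin/env bash
# fstab_line POOL IMAGE MOUNTPOINT FSTYPE
# Print an fstab entry that is skipped by the normal boot-time mount pass
# (noauto) and refers to the image by its udev-created path.
fstab_line() {
    printf '/dev/rbd/%s/%s %s %s noauto 0 0\n' "$1" "$2" "$3" "$4"
}

fstab_line rbd docker-volume /ceph-rbd ext4
# prints: /dev/rbd/rbd/docker-volume /ceph-rbd ext4 noauto 0 0
```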

# Verify with df -h
Filesystem                  Size  Used Avail Use% Mounted on
/dev/mapper/centos-root      45G  2.9G   42G   7% /
devtmpfs                     16G     0   16G   0% /dev
tmpfs                        16G     0   16G   0% /dev/shm
tmpfs                        16G  1.5G   15G  10% /run
tmpfs                        16G     0   16G   0% /sys/fs/cgroup
/dev/mapper/datavg-lv_data  985G   62G  873G   7% /data
/dev/xvda1                  497M  135M  362M  28% /boot
10.142.246.2:/data/nfs      985G   62G  873G   7% /var/lib/rancher/convoy/convoy-nfs-85120bf6-2d8d-44e1-b868-bde8284a3b4c/mnt
tmpfs                       3.1G     0  3.1G   0% /run/user/0
/dev/rbd0                  1008G   77M  957G   1% /ceph-rbd

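# You can now create files or anything else under /ceph-rbd.

4. Removing the block device later

# 1. Unmount
umount /ceph-rbd
# 2. Remove the auto-mount entries added to /etc/fstab and /etc/ceph/rbdmap first; otherwise the next boot will fail.
# Note: as long as those entries exist, ceph must be enabled to start at boot, or the machine will likewise fail to boot.

# 3. Unmap
rbd unmap /dev/rbd0
# Verify
rbd showmapped
# 4. Delete the image
rbd rm docker-volume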
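
The removal steps can be sketched as one function. This is a minimal sketch under stated assumptions: `run`, `teardown_rbd` and the `FSTAB` override are hypothetical names, commands are only printed unless `DRY_RUN=0` is set, and the function refuses to proceed while fstab still references the mount point.

```shell
#!/usr/bin/env bash
# run: print a command; execute it only when DRY_RUN=0 (hypothetical helper).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# teardown_rbd IMAGE DEVICE MOUNTPOINT
teardown_rbd() {
    local image=$1 dev=$2 mnt=$3 fstab=${FSTAB:-/etc/fstab}
    # Step 2 of the list above: a stale fstab entry would break the next boot.
    if grep -q "[[:space:]]$mnt[[:space:]]" "$fstab" 2>/dev/null; then
        echo "remove the $mnt entry from $fstab (and /etc/ceph/rbdmap) first" >&2
        return 1
    fi
    run sudo umount "$mnt"
    run sudo rbd unmap "$dev"
    run rbd showmapped          # the device should no longer be listed
    run rbd rm "$image"
}
```

Usage: `teardown_rbd docker-volume /dev/rbd0 /ceph-rbd` prints the four commands; prefix it with `DRY_RUN=0` to execute them.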

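Appendix

1. Adding a mon node

The deployment above configured only two mons. A production cluster should run three or more mon nodes, preferably an odd number: the mons need a strict majority to form a quorum, so with two mons the loss of either one stops the cluster.

# Edit the configuration file and add the new mon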
[op@docker-rancher-server ceph]$ vim ceph.conf
mon_initial_members = docker-rancher-server, docker-rancher-client1, docker-rancher-client2
mon_host = 10.142.246.2,10.142.246.3,10.142.246.4

# Add the mon
[op@docker-rancher-server ceph]$ ceph-deploy --overwrite-conf   mon  create docker-rancher-client2

# Push the configuration to the other nodes
[op@docker-rancher-server ceph]$ ceph-deploy  --overwrite-conf config push docker-rancher-server
[op@docker-rancher-server ceph]$ ceph-deploy  --overwrite-conf config push docker-rancher-client1
# Check the result
[op@docker-rancher-server ceph]$ ceph -s
    cluster ef81681c-ee15-412e-a752-2c3e87b9e369
     health HEALTH_OK
     monmap e2: 3 mons at {docker-rancher-client1=10.142.246.3:6789/0,docker-rancher-client2=10.142.246.4:6789/0,docker-rancher-server=10.142.246.2:6789/0}
            election epoch 10, quorum 0,1,2 docker-rancher-server,docker-rancher-client1,docker-rancher-client2
     osdmap e28: 4 osds: 4 up, 4 in
            flags sortbitwise
      pgmap v10836: 64 pgs, 1 pools, 0 bytes data, 0 objects
            281 GB used, 3454 GB / 3936 GB avail
                  64 active+clean

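The output shows that the cluster now has 3 mons.

2. Removing a mon node

# First edit ceph.conf and remove the mon in question

# Then push the configuration to the other mon nodes

# Run the removal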
ceph-deploy mon destroy  $HOSTNAME

# Check
ceph -s

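3. Adding and removing OSDs

1. Adding an OSD

# The procedure is the same as in the initial deployment.
# A directory-backed OSD is used as the example here.
# 1. Create the directory and change its owner to ceph
# 2. ceph-deploy osd prepare $hostname:directory
# 3. ceph-deploy osd activate $hostname:directory
# 4. Check the result with ceph -s and ceph osd tree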
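
The four steps above can be sketched as follows. This is an illustration, not the exact commands used in the deployment: `run` and `add_dir_osd` are hypothetical names, the directory path in the example is made up, and commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# run: print a command; execute it only when DRY_RUN=0 (hypothetical helper).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# add_dir_osd HOST DIRECTORY
add_dir_osd() {
    local host=$1 dir=$2
    run ssh "$host" sudo mkdir -p "$dir"               # 1. create the directory
    run ssh "$host" sudo chown -R ceph:ceph "$dir"     #    and hand it to ceph
    run ceph-deploy osd prepare "$host:$dir"           # 2. prepare
    run ceph-deploy osd activate "$host:$dir"          # 3. activate
    run ceph osd tree                                  # 4. check
}

# Example with a hypothetical data directory:
add_dir_osd docker-rancher-client2 /data/osd-new
```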

2. Removing an OSD

  1. Stop the OSD process
# Check the current OSDs
ceph osd tree
  2. For the remaining removal steps, see the official documentation
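
For orientation, the manual removal sequence usually documented for this generation of ceph looks roughly like the sketch below. It was not run in this environment; `run` and `remove_osd` are hypothetical names, and commands are only printed unless `DRY_RUN=0` is set. Check the official documentation for your release before executing it.

```shell
#!/usr/bin/env bash
# run: print a command; execute it only when DRY_RUN=0 (hypothetical helper).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# remove_osd OSD_ID HOST
remove_osd() {
    local id=$1 host=$2
    run ceph osd out "$id"                              # stop placing data on it
    # Wait here until `ceph -s` shows the cluster active+clean again.
    run ssh "$host" sudo systemctl stop "ceph-osd@$id"  # stop the daemon
    run ceph osd crush remove "osd.$id"                 # drop it from the CRUSH map
    run ceph auth del "osd.$id"                         # delete its key
    run ceph osd rm "$id"                               # remove the OSD entry
}

remove_osd 3 hub.chinatelecom.cn
```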

Block read/write speed tests

1. Read/write speed of the local Linux system disk

# Write speed
[root@docker-rancher-server ~]# time dd if=/dev/zero of=/test.dbf bs=8k count=300000
300000+0 records in
300000+0 records out
2457600000 bytes (2.5 GB) copied, 2.60019 s, 945 MB/s

real    0m2.602s
user    0m0.054s
sys 0m2.542s
# Read speed
[root@docker-rancher-server ~]# time dd if=/test.dbf of=/dev/null bs=8k count=300000
300000+0 records in
300000+0 records out
2457600000 bytes (2.5 GB) copied, 0.804974 s, 3.1 GB/s

real    0m0.806s
user    0m0.028s
sys 0m0.778s

2. Read/write speed of the data disk mounted at /data

# Write speed
[root@docker-rancher-server ceph-rbd]# time dd if=/dev/zero of=/data/test.dbf bs=8k count=300000
300000+0 records in
300000+0 records out
2457600000 bytes (2.5 GB) copied, 3.38757 s, 725 MB/s

real    0m3.407s
user    0m0.053s
sys 0m3.248s
# Read speed
[root@docker-rancher-server ~]# time dd if=/data/test.dbf of=/dev/null bs=8k count=300000
300000+0 records in
300000+0 records out
2457600000 bytes (2.5 GB) copied, 0.899513 s, 2.7 GB/s

real    0m0.901s
user    0m0.029s
sys 0m0.872s

3. Read/write speed of the ceph block device

# Write speed
[root@docker-rancher-server ceph-rbd]# time dd if=/dev/zero of=/ceph-rbd/test.dbf bs=8k count=300000
300000+0 records in
300000+0 records out
2457600000 bytes (2.5 GB) copied, 3.31538 s, 741 MB/s

real    0m3.335s
user    0m0.060s
sys 0m3.253s
# Read speed
[root@docker-rancher-server ceph-rbd]# time dd if=/ceph-rbd/test.dbf of=/dev/null bs=8k count=300000
300000+0 records in
300000+0 records out
2457600000 bytes (2.5 GB) copied, 0.963309 s, 2.6 GB/s

real    0m0.965s
user    0m0.024s
sys 0m0.938s

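Because of the test environment and other factors, these numbers do not fully reflect the read/write speed of ceph block storage. Roughly, they are on par with the data disk attached from the Huawei cloud platform, probably because this ceph cluster is itself built on those data disks. Note also that dd without conv=fdatasync or oflag=direct mostly measures the page cache, which likely explains why all three results are so close. The observation that the network had little effect on throughput in this experiment is therefore tentative, and the test should be repeated in the actual production environment.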
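
To take the page cache out of the picture, dd can be told to flush before it exits (writes) or to bypass the cache (reads). The following is a minimal sketch using GNU dd flags; `mb_per_s`, `write_test` and `read_test` are hypothetical helper names, and the figures they produce were not measured for this document.

```shell
#!/usr/bin/env bash
# mb_per_s BYTES SECONDS: throughput in decimal MB/s, the unit dd reports.
mb_per_s() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.0f\n", b / s / 1000000 }'
}

# write_test FILE COUNT: write COUNT 8k blocks; conv=fdatasync makes dd flush
# the data to the device before it reports the elapsed time.
write_test() {
    dd if=/dev/zero of="$1" bs=8k count="$2" conv=fdatasync
}

# read_test FILE: read the file back with O_DIRECT, bypassing the page cache.
read_test() {
    dd if="$1" of=/dev/null bs=8k iflag=direct
}

# Sanity check against the first write result above (2457600000 bytes in 2.60019 s):
mb_per_s 2457600000 2.60019    # prints 945
```

Usage on the ceph device would be `write_test /ceph-rbd/test.dbf 300000` followed by `read_test /ceph-rbd/test.dbf`.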

Common errors

1. ceph-deploy new fails when creating the monitors

# Error output
[op@docker-rancher-server ceph]$ ceph-deploy new docker-rancher-server docker-rancher-client1
Traceback (most recent call last):
  File "/usr/bin/ceph-deploy", line 18, in <module>
    from ceph_deploy.cli import main
ImportError: No module named ceph_deploy.cli

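Cause: the default Python version on CentOS 7 had been upgraded earlier, so /usr/bin/env python no longer resolves to the interpreter that has the ceph_deploy module. The fix is to point ceph-deploy at the system default Python.

Fix:

[op@docker-rancher-server ceph]$ sudo vim /usr/bin/ceph-deploy
Change  #!/usr/bin/env python
to      #!/usr/bin/python2.7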
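
The same edit can be scripted, which helps when several nodes need it. This is a minimal sketch assuming GNU sed; `pin_shebang` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# pin_shebang FILE INTERPRETER
# Replace the first line of FILE with "#!INTERPRETER" if it is a shebang line;
# all other lines are left untouched.
pin_shebang() {
    sed -i '1s|^#!.*|#!'"$2"'|' "$1"
}
```

Usage (as root, since the file lives in /usr/bin): `pin_shebang /usr/bin/ceph-deploy /usr/bin/python2.7`.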

2. ceph-deploy fails because packages cannot be downloaded from the official ceph site

[op@docker-rancher-server ceph]$ ceph-deploy install docker-rancher-server docker-rancher-client1 docker-rancher-client2 hub.chinatelecom.cn
......
[docker-rancher-server][DEBUG ] 完毕!
[docker-rancher-server][DEBUG ] Configure Yum priorities to include obsoletes
[docker-rancher-server][WARNIN] check_obsoletes has been enabled for Yum priorities plugin
[docker-rancher-server][INFO  ] Running command: sudo rpm --import https://download.ceph.com/keys/release.asc
[docker-rancher-server][WARNIN] curl: (6) Could not resolve host: download.ceph.com; 未知的名称或服务
[docker-rancher-server][WARNIN] 错误:https://download.ceph.com/keys/release.asc: import read failed(2).
[docker-rancher-server][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: rpm --import https://download.ceph.com/keys/release.asc

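Cause: the internal network has no Internet access, so packages cannot be fetched from the official site.

Fix: change the download addresses in ceph-deploy to point at your own ceph repository.

Three files need to be modified.

# 1. First file: comment out the step that installs the ceph-release RPM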
[op@docker-rancher-server ~]$ cd /usr/lib/python2.7/site-packages/ceph_deploy/hosts/centos/
[op@docker-rancher-server centos]$ sudo vim install.py
#            remoto.process.run(
#                distro.conn,
#                [
#                    'rpm',
#                    '-Uvh',
#                    '--replacepkgs',
#                    '{url}noarch/ceph-release-1-0.{dist}.noarch.rpm'.format(url=url, dist=dist),
#                ],
#           )

# 2. Second file
[op@docker-rancher-server centos]$ cd /usr/lib/python2.7/site-packages/ceph_deploy/util/
[op@docker-rancher-server util]$ sudo vim constants.py
# Point this at your own key server
gpg_key_base_url = "10.142.78.40/ceph/keys/"

# 3. Third file
[op@docker-rancher-server util]$ cd /usr/lib/python2.7/site-packages/ceph_deploy/util/paths/
[op@docker-rancher-server paths]$ sudo vim gpg.py
# Change the default protocol from https to http
def url(key_type, protocol="http"):
    return "{protocol}://{url}{key_type}.asc".format(
        protocol=protocol,
        url=constants.gpg_key_base_url,
        key_type=key_type
    )

# Sync the three files to all the other nodes
#! /bin/bash
set -ex
hosts="docker-rancher-server docker-rancher-client1 docker-rancher-client2 hub.chinatelecom.cn"

file="/usr/lib/python2.7/site-packages/ceph_deploy/hosts/centos/install.py /usr/lib/python2.7/site-packages/ceph_deploy/util/constants.py /usr/lib/python2.7/site-packages/ceph_deploy/util/paths/gpg.py"

destinationDirectory="~"

for i in $hosts
do
    scp $file $i:$destinationDirectory
    ssh $i sudo mv $destinationDirectory/install.py /usr/lib/python2.7/site-packages/ceph_deploy/hosts/centos/
    ssh $i sudo mv $destinationDirectory/constants.py  /usr/lib/python2.7/site-packages/ceph_deploy/util/
    ssh $i sudo mv $destinationDirectory/gpg.py  /usr/lib/python2.7/site-packages/ceph_deploy/util/paths/
done

Reference for this change:

本地源安装ceph (Installing ceph from a local repository)

[op@docker-rancher-server ceph]$ ceph-deploy install docker-rancher-server docker-rancher-client1 docker-rancher-client2 hub.chinatelecom.cn
......
[docker-rancher-server][DEBUG ] ---> 软件包 spax.x86_64.0.1.5.2-13.el7 将被 安装
[docker-rancher-server][DEBUG ] ---> 软件包 time.x86_64.0.1.7-45.el7 将被 安装
[docker-rancher-server][DEBUG ] --> 解决依赖关系完成
[docker-rancher-server][DEBUG ]  您可以尝试添加 --skip-broken 选项来解决该问题
[docker-rancher-server][WARNIN] 错误:软件包:1:ceph-selinux-10.2.3-0.el7.x86_64 (ceph)
[docker-rancher-server][WARNIN]           需要:selinux-policy-base >= 3.13.1-60.el7_2.7
[docker-rancher-server][WARNIN]           已安装: selinux-policy-targeted-3.13.1-23.el7.noarch (@anaconda)
[docker-rancher-server][WARNIN]               selinux-policy-base = 3.13.1-23.el7
[docker-rancher-server][WARNIN]           可用: selinux-policy-minimum-3.13.1-23.el7.noarch (base)
[docker-rancher-server][WARNIN]               selinux-policy-base = 3.13.1-23.el7
[docker-rancher-server][WARNIN]           可用: selinux-policy-minimum-3.13.1-60.el7.noarch (updates)
[docker-rancher-server][WARNIN]               selinux-policy-base = 3.13.1-60.el7
[docker-rancher-server][WARNIN]           可用: selinux-policy-mls-3.13.1-23.el7.noarch (base)
[docker-rancher-server][WARNIN]               selinux-policy-base = 3.13.1-23.el7
[docker-rancher-server][WARNIN]           可用: selinux-policy-mls-3.13.1-60.el7.noarch (updates)
[docker-rancher-server][WARNIN]               selinux-policy-base = 3.13.1-60.el7
[docker-rancher-server][WARNIN]           可用: selinux-policy-targeted-3.13.1-60.el7.noarch (updates)
[docker-rancher-server][WARNIN]               selinux-policy-base = 3.13.1-60.el7
[docker-rancher-server][WARNIN] 错误:软件包:1:python-flask-0.10.1-3.el7.noarch (epel)
[docker-rancher-server][WARNIN]           需要:python-itsdangerous
[docker-rancher-server][DEBUG ]  您可以尝试执行:rpm -Va --nofiles --nodigest
[docker-rancher-server][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: yum -y install ceph ceph-radosgw

This happens because the selinux-policy-targeted version in the yum repository is too old. Download the required packages and add them:

selinux-policy-3.13.1-60.el7_2.7.noarch.rpm

selinux-policy-targeted-3.13.1-60.el7_2.7.noarch.rpm

# Run on every node
[op@docker-rancher-server ~]$ sudo yum localinstall selinux-policy-3.13.1-60.el7_2.7.noarch.rpm
[op@docker-rancher-server ~]$ sudo yum localinstall selinux-policy-targeted-3.13.1-60.el7_2.7.noarch.rpm

3. ceph health check shows no OSD up

[op@docker-rancher-server ceph]$ ceph health
HEALTH_ERR 64 pgs are stuck inactive for more than 300 seconds; 64 pgs stuck inactive
[op@docker-rancher-server ceph]$ ceph -s
    cluster ef81681c-ee15-412e-a752-2c3e87b9e369
     health HEALTH_ERR
            64 pgs are stuck inactive for more than 300 seconds
            64 pgs stuck inactive
     monmap e1: 2 mons at {docker-rancher-client1=10.142.246.3:6789/0,docker-rancher-server=10.142.246.2:6789/0}
            election epoch 8, quorum 0,1 docker-rancher-server,docker-rancher-client1
     osdmap e9: 4 osds: 0 up, 0 in
            flags sortbitwise
      pgmap v10: 64 pgs, 1 pools, 0 bytes data, 0 objects
            0 kB used, 0 kB / 0 kB avail
                  64 creating

The OSD error log shows:

[root@docker-rancher-client2 data]# tail -f /var/log/ceph/ceph-osd.2.log
2016-11-28 14:25:03.603389 7f0ccf069800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-2) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2016-11-28 14:25:03.607758 7f0ccf069800  0 filestore(/var/lib/ceph/osd/ceph-2) limited size xattrs
2016-11-28 14:25:03.608339 7f0ccf069800  1 leveldb: Recovering log #16
2016-11-28 14:25:03.613865 7f0ccf069800  1 leveldb: Delete type=0 #16

2016-11-28 14:25:03.613927 7f0ccf069800  1 leveldb: Delete type=3 #15

2016-11-28 14:25:03.614165 7f0ccf069800  0 filestore(/var/lib/ceph/osd/ceph-2) mount: enabling WRITEAHEAD journal mode: checkpoint is not enabled
2016-11-28 14:25:03.614331 7f0ccf069800 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
2016-11-28 14:25:03.614341 7f0ccf069800  1 journal _open /var/lib/ceph/osd/ceph-2/journal fd 18: 5368709120 bytes, block size 4096 bytes, directio = 1, aio = 0
2016-11-28 14:25:03.614665 7f0ccf069800  1 journal _open /var/lib/ceph/osd/ceph-2/journal fd 18: 5368709120 bytes, block size 4096 bytes, directio = 1, aio = 0
2016-11-28 14:25:03.614947 7f0ccf069800  1 filestore(/var/lib/ceph/osd/ceph-2) upgrade
2016-11-28 14:25:03.615114 7f0ccf069800 -1 osd.2 0 backend (filestore) is unable to support max object name[space] len
2016-11-28 14:25:03.615140 7f0ccf069800 -1 osd.2 0    osd max object name len = 2048
2016-11-28 14:25:03.615142 7f0ccf069800 -1 osd.2 0    osd max object namespace len = 256
2016-11-28 14:25:03.615144 7f0ccf069800 -1 osd.2 0 (36) File name too long
2016-11-28 14:25:03.615498 7f0ccf069800  1 journal close /var/lib/ceph/osd/ceph-2/journal
2016-11-28 14:25:03.616473 7f0ccf069800 -1  ** ERROR: osd init failed: (36) File name too long

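The log says the file name is too long. After some searching it turned out that the OSD data directory sits on ext4, which stores only limited-size xattrs (see the "limited size xattrs" line above), whereas XFS is the recommended filesystem. Since the disk could not be reformatted, the workaround was to add parameters to the ceph configuration file that limit the object name length.

# Note: do this on all four nodes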
[op@docker-rancher-client2 ~]$ sudo vim /etc/ceph/ceph.conf
osd max object name len = 256
osd max object namespace len = 64

# Then restart the OSD service
[op@docker-rancher-client2 ~]$ sudo systemctl restart  ceph-osd.target 

References:

The ceph OSD deamon is not activated with ext4 file system

http://tracker.ceph.com/issues/16187

4. rbd create fails

After the cluster was configured, rbd create kept failing as follows:

[op@docker-rancher-server ceph]$ rbd create docker-volume --size 1024
2016-11-29 12:22:57.485826 7faa2e05c700  0 -- 10.142.246.2:0/3639593179 >> 10.142.246.5:6800/109587 pipe(0x7faa57bb46c0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7faa57bb5980).fault

Yet the cluster as a whole reports a healthy state:

[op@docker-rancher-server ceph]$ ceph osd tree
ID WEIGHT  TYPE NAME                       UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 3.84436 root default
-2 0.96109     host docker-rancher-server
 0 0.96109         osd.0                        up  1.00000          1.00000
-3 0.96109     host docker-rancher-client1
 1 0.96109         osd.1                        up  1.00000          1.00000
-4 0.96109     host docker-rancher-client2
 2 0.96109         osd.2                        up  1.00000          1.00000
-5 0.96109     host hub
 3 0.96109         osd.3                        up  1.00000          1.00000
[op@docker-rancher-server ceph]$ ceph -s
    cluster ef81681c-ee15-412e-a752-2c3e87b9e369
     health HEALTH_OK
     monmap e2: 3 mons at {docker-rancher-client1=10.142.246.3:6789/0,docker-rancher-client2=10.142.246.4:6789/0,docker-rancher-server=10.142.246.2:6789/0}
            election epoch 18, quorum 0,1,2 docker-rancher-server,docker-rancher-client1,docker-rancher-client2
     osdmap e82: 4 osds: 4 up, 4 in
            flags sortbitwise
      pgmap v42631: 64 pgs, 1 pools, 0 bytes data, 0 objects
            283 GB used, 3452 GB / 3936 GB avail
                  64 active+clean

Switching to another node, docker-rancher-client1, and checking its OSD log:

[root@docker-rancher-client1 ceph]# tail -f ceph-osd.1.log
2016-11-29 12:26:32.770143 7ffbc37c5700 -1 osd.1 82 heartbeat_check: no reply from osd.3 ever on either front or back, first ping sent 2016-11-29 12:21:21.113662 (cutoff 2016-11-29 12:26:12.770139)
2016-11-29 12:26:33.558821 7ffbaadfe700 -1 osd.1 82 heartbeat_check: no reply from osd.3 ever on either front or back, first ping sent 2016-11-29 12:21:21.113662 (cutoff 2016-11-29 12:26:13.558819)
2016-11-29 12:26:33.770524 7ffbc37c5700 -1 osd.1 82 heartbeat_check: no reply from osd.3 ever on either front or back, first ping sent 2016-11-29 12:21:21.113662 (cutoff 2016-11-29 12:26:13.770520)
2016-11-29 12:26:34.659291 7ffbaadfe700 -1 osd.1 82 heartbeat_check: no reply from osd.3 ever on either front or back, first ping sent 2016-11-29 12:21:21.113662 (cutoff 2016-11-29 12:26:14.659289)
2016-11-29 12:26:34.770786 7ffbc37c5700 -1 osd.1 82 heartbeat_check: no reply from osd.3 ever on either front or back, first ping sent 2016-11-29 12:21:21.113662 (cutoff 2016-11-29 12:26:14.770781)

The heartbeat check against osd.3 was failing. The cause turned out to be the firewall on the osd.3 host (hub):

[op@hub hue-metadata]$ sudo iptables -I INPUT -p tcp --dport 6789 -j ACCEPT
[op@hub hue-metadata]$ sudo iptables -I INPUT -p tcp -m multiport --dports 6800:7100 -j ACCEPT
# Check the iptables filter table again
[op@hub hue-metadata]$ sudo iptables -L -n
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 6800:7100
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:6789
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:8080
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:8888
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:8001
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:443
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            tcp dpt:10050
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            state RELATED,ESTABLISHED
ACCEPT     icmp --  0.0.0.0/0            0.0.0.0/0
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            state NEW tcp dpt:22
REJECT     all  --  0.0.0.0/0            0.0.0.0/0            reject-with icmp-host-prohibited
ACCEPT     tcp  --  10.142.0.0/16        0.0.0.0/0            tcp dpt:6789
ACCEPT     tcp  --  10.142.0.0/16        0.0.0.0/0            tcp dpts:6800:7300
ACCEPT     all  --  10.142.0.0/16        0.0.0.0/0
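# These ports had in fact been opened earlier. The original guess was that the source-address restriction was to blame,
# but the listing points to rule order: the earlier rules were appended with -A and ended up below the catch-all REJECT
# rule (the last three ACCEPT lines above), so they most likely never matched. The rules inserted with -I sit at the top
# of the chain and accept any source address, which resolved the problem.
# Note that the inserted range is 6800:7100, narrower than the 6800:7300 range ceph uses by default.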
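
Rules that sit below a catch-all REJECT or DROP are easy to miss in a long listing. The following is a minimal sketch of a filter that reads `iptables -L -n` output and prints every ACCEPT rule that can no longer match; `shadowed_rules` is a hypothetical helper name, and the check only recognises catch-all rules for protocol `all` with source and destination `0.0.0.0/0`.

```shell
#!/usr/bin/env bash
# shadowed_rules: read `iptables -L -n` output on stdin and print each ACCEPT
# rule that appears after a catch-all REJECT/DROP in the same chain.
shadowed_rules() {
    awk '
        $1 == "Chain" { blocked = 0; next }            # new chain: reset state
        ($1 == "REJECT" || $1 == "DROP") && $2 == "all" &&
            $4 == "0.0.0.0/0" && $5 == "0.0.0.0/0" { blocked = 1; next }
        blocked && $1 == "ACCEPT" { print }            # unreachable rule
    '
}
```

Usage: `sudo iptables -L INPUT -n | shadowed_rules`. On the listing above it would print the three ACCEPT lines for 10.142.0.0/16.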

5. rbd map fails

After creating the rbd image, mapping it failed:

[op@docker-rancher-server ceph]$ rbd map docker-volume --pool rbd --id admin
modprobe: ERROR: could not insert 'rbd': Operation not permitted
rbd: failed to load rbd kernel module (1)
rbd: sysfs write failed
In some cases useful info is found in syslog - try "dmesg | tail" or so.
rbd: map failed: (2) No such file or directory
[op@docker-rancher-server ceph]$ sudo rbd map docker-volume --pool rbd --id admin
rbd: sysfs write failed
RBD image feature set mismatch. You can disable features unsupported by the kernel with "rbd feature disable".
In some cases useful info is found in syslog - try "dmesg | tail" or so.
rbd: map failed: (6) No such device or address

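Troubleshooting:

ceph supports two rbd image formats, 1 and 2.

format 1 - the original format for new rbd images. It is compatible with all versions of librbd and the kernel module, but does not support newer features such as cloning.

format 2 - the second rbd format, supported by librbd and by kernel modules from version 3.11 on (except for split-out modules). It adds cloning support, makes extension easier, and allows new features to be added later.

To use the newer rbd features the image was created with format 2, and the map error above appeared.

    The official documentation gives the following information:

We installed the jewel release and created the image with format 2. A format 2 image supports the features below; those marked (+) in the rbd create help output are enabled by default for new images:

layering: layering support

striping: striping v2 support

exclusive-lock: exclusive locking support

object-map: object map support (requires exclusive-lock)

fast-diff: fast diff calculation (requires object-map)

deep-flatten: snapshot flatten support

journaling: journaling of IO operations (requires exclusive-lock)

The system here is CentOS 7.1 with kernel 3.10.0-229.el7.x86_64. The error message indicates that this kernel does not support some of the format 2 features. The --image-feature option selects which features to enable, so not all of them need to be on. We only need features such as snapshots, so enabling layering is enough.

==Testing showed that kernel 3.10 supports only this feature (layering); the others need a newer kernel, or a recompiled kernel with the feature modules loaded.==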
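
Two ways to end up with an image the 3.10 kernel can map are sketched below: create the image with layering only, or strip the unsupported features from an existing image as the error message suggests. This is a minimal sketch; `run`, `create_layering_only` and `strip_unsupported_features` are hypothetical names, and commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# run: print a command; execute it only when DRY_RUN=0 (hypothetical helper).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

# create_layering_only IMAGE SIZE: new format 2 image with just layering.
create_layering_only() {
    run rbd create "$1" --size "$2" --image-format 2 --image-feature layering
}

# strip_unsupported_features IMAGE: disable the default features that the
# old kernel client cannot handle, leaving layering in place.
strip_unsupported_features() {
    run rbd feature disable "$1" exclusive-lock object-map fast-diff deep-flatten
}

create_layering_only docker-volume 1T
strip_unsupported_features docker-volume
```

Setting `rbd default features = 1` in ceph.conf is another commonly used option, since 1 is the bit value of layering; treat it as an assumption to verify against the documentation for your release.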

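References

  1. ceph集群jewel版本 rbd 块map 报错-故障排查 (Troubleshooting rbd map errors on a ceph jewel cluster)

  2. RBD – MANAGE RADOS BLOCK DEVICE (RBD) IMAGES

General references

  1. Ceph official documentation (Chinese)
  2. Ceph official documentation (English)
  3. 本地源安装ceph (Installing ceph from a local repository)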