Deploying a Ceph Cluster with cephadm

Date: 2023-02-15 17:21:48

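I. Requirements for using cephadm

cephadm creates a new Ceph cluster by bootstrapping on a single host, expands the cluster to include any additional hosts, and then deploys the required services.

The operating system on this host must meet the following requirements:

1. Python 3 is installed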

yum -y install python3

2. systemd is in use (services can be managed with systemctl)

3. Podman or Docker is available to run containers

# Install docker-ce from the Alibaba Cloud (Aliyun) mirror
yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
sed -i 's+download.docker.com+mirrors.aliyun.com/docker-ce+' /etc/yum.repos.d/docker-ce.repo
yum -y install docker-ce
systemctl enable docker --now
# Configure a registry mirror (image accelerator)
mkdir -p /etc/docker
tee /etc/docker/daemon.json <<-'EOF'
{
"registry-mirrors": ["https://bp1bh1ga.mirror.aliyuncs.com"]
}
EOF
systemctl daemon-reload
systemctl restart docker

4. Time synchronization is configured (for example chrony or NTP)

II. Preparation before deploying the Ceph cluster

2.1 Node preparation

Node    OS                        IP address    Ceph roles                     Disks
node1   Rocky Linux release 8.6   172.24.1.6    mon, mgr, server, admin node   /dev/vdb, /dev/vdc, /dev/vdd
node2   Rocky Linux release 8.6   172.24.1.7    mon, mgr                       /dev/vdb, /dev/vdc, /dev/vdd
node3   Rocky Linux release 8.6   172.24.1.8    mon, mgr                       /dev/vdb, /dev/vdc, /dev/vdd
node4   Rocky Linux release 8.6   172.24.1.9    client, admin node             -


2.2 Edit /etc/hosts on every node

172.24.1.6 node1
172.24.1.7 node2
172.24.1.8 node3
172.24.1.9 node4
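
Because the same four entries have to be added on every node, it can help to make the edit repeatable. The sketch below is one possible approach, not the author's method: add_host_entry is a hypothetical helper that appends an "IP NAME" line only when NAME is not already present in the file.

```shell
#!/bin/sh
# Hypothetical helper: append "IP NAME" to a hosts file unless NAME is
# already listed as a hostname on a non-comment line.
add_host_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    if awk -v n="$name" '
            !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }
        ' "$hosts_file" 2>/dev/null; then
        return 0    # entry already present, nothing to do
    fi
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}
```

For example, running add_host_entry /etc/hosts 172.24.1.6 node1 as root (and likewise for node2 to node4) leaves /etc/hosts unchanged on a second run.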

2.3 Set up passwordless SSH login from node1

[root@node1 ~]# ssh-keygen
[root@node1 ~]# ssh-copy-id root@node2
[root@node1 ~]# ssh-copy-id root@node3
[root@node1 ~]# ssh-copy-id root@node4
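
The commands above prompt interactively. A non-interactive variant might look like the sketch below. It rests on assumptions that are not in the original: the helper names run and distribute_key are hypothetical, an ed25519 key with an empty passphrase is used instead of the ssh-keygen defaults, and commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print each command; execute it only when DRY_RUN=0
# (assumed convention for this sketch, the default is a dry run).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = 0 ]; then
        "$@"
    fi
}

# Hypothetical helper: distribute_key KEYFILE HOST...
distribute_key() {
    key=$1
    shift
    # Generate the key pair without prompts only if it does not exist yet.
    [ -f "$key" ] || run ssh-keygen -q -t ed25519 -N "" -f "$key"
    for host in "$@"; do
        run ssh-copy-id -i "$key.pub" "root@$host"
    done
}
```

With DRY_RUN=0 exported, distribute_key /root/.ssh/id_ed25519 node2 node3 node4 would still ask once for each node's root password, since ssh-copy-id has to log in to install the key.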

III. Install cephadm on node1

1. Install the EPEL repository
[root@node1 ~]# yum -y install epel-release
2. Install the Ceph repository
[root@node1 ~]# yum search release-ceph
Last metadata expiration check: 0:57:14 ago on Tue 14 Feb 2023 14:22:00.
================= Name Matched: release-ceph ============================================
centos-release-ceph-nautilus.noarch : Ceph Nautilus packages from the CentOS Storage SIG repository
centos-release-ceph-octopus.noarch : Ceph Octopus packages from the CentOS Storage SIG repository
centos-release-ceph-pacific.noarch : Ceph Pacific packages from the CentOS Storage SIG repository
centos-release-ceph-quincy.noarch : Ceph Quincy packages from the CentOS Storage SIG repository
[root@node1 ~]# yum -y install centos-release-ceph-pacific.noarch
3. Install cephadm
[root@node1 ~]# yum -y install cephadm
4. Install ceph-common
[root@node1 ~]# yum -y install ceph-common
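
The release package name follows the pattern shown in the search output above. As a small illustration, the hypothetical helper below (ceph_release_pkg is not a real tool) maps a release name to its package and rejects names that did not appear in that search result, which guards against typos when the same step is repeated on other nodes.

```shell
#!/bin/sh
# Hypothetical helper: print the CentOS Storage SIG release package for a
# Ceph release name. Only the four releases listed by the yum search above
# are accepted.
ceph_release_pkg() {
    case $1 in
        nautilus|octopus|pacific|quincy)
            echo "centos-release-ceph-$1"
            ;;
        *)
            echo "unknown Ceph release: $1" >&2
            return 1
            ;;
    esac
}
```

It could then be used as: yum -y install "$(ceph_release_pkg pacific)"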

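IV. Install docker-ce and python3 on the other nodes

See section one above (cephadm requirements) for the detailed steps.

V. Deploy the Ceph cluster

5.1 Bootstrap the Ceph cluster and install the dashboard (graphical management interface) at the same time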

[root@node1 ~]# cephadm bootstrap --mon-ip 172.24.1.6 --allow-fqdn-hostname --initial-dashboard-user admin --initial-dashboard-password redhat --dashboard-password-noupdate
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
docker (/usr/bin/docker) is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: 0b565668-ace4-11ed-960c-5254000de7a0
Verifying IP 172.24.1.6 port 3300 ...
Verifying IP 172.24.1.6 port 6789 ...
Mon IP `172.24.1.6` is in CIDR network `172.24.1.0/24`
- internal network (--cluster-network) has not been provided, OSD replication will default to the public_network
Pulling container image quay.io/ceph/ceph:v16...
Ceph version: ceph version 16.2.11 (3cf40e2dca667f68c6ce3ff5cd94f01e711af894) pacific (stable)
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
mon is available
Assimilating anything we can from ceph.conf...
Generating new minimal ceph.conf...
Restarting the monitor...
Setting mon public_network to 172.24.1.0/24
Wrote config to /etc/ceph/ceph.conf
Wrote keyring to /etc/ceph/ceph.client.admin.keyring
Creating mgr...
Verifying port 9283 ...
Waiting for mgr to start...
Waiting for mgr...
mgr not available, waiting (1/15)...
mgr not available, waiting (2/15)...
mgr not available, waiting (3/15)...
mgr is available
Enabling cephadm module...
Waiting for the mgr to restart...
Waiting for mgr epoch 5...
mgr epoch 5 is available
Setting orchestrator backend to cephadm...
Generating ssh key...
Wrote public SSH key to /etc/ceph/ceph.pub
Adding key to root@localhost authorized_keys...
Adding host node1...
Deploying mon service with default placement...
Deploying mgr service with default placement...
Deploying crash service with default placement...
Deploying prometheus service with default placement...
Deploying grafana service with default placement...
Deploying node-exporter service with default placement...
Deploying alertmanager service with default placement...
Enabling the dashboard module...
Waiting for the mgr to restart...
Waiting for mgr epoch 9...
mgr epoch 9 is available
Generating a dashboard self-signed certificate...
Creating initial admin user...
Fetching dashboard port number...
Ceph Dashboard is now available at:

URL: https://node1.domain1.example.com:8443/
User: admin
Password: redhat

Enabling client.admin keyring and conf on hosts with "admin" label
You can access the Ceph CLI with:

sudo /usr/sbin/cephadm shell --fsid 0b565668-ace4-11ed-960c-5254000de7a0 -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring

Please consider enabling telemetry to help improve Ceph:

ceph telemetry on

For more information see:

https://docs.ceph.com/docs/pacific/mgr/telemetry/

Bootstrap complete.
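
The bootstrap output contains two values that are needed again later: the cluster fsid and the dashboard URL. If the output is saved to a file (for example by piping the bootstrap command through tee), they can be pulled out afterwards. This is only a sketch: bootstrap_summary is a hypothetical helper, and it assumes the log lines look exactly like the ones shown above.

```shell
#!/bin/sh
# Hypothetical helper: read saved bootstrap output on stdin and print the
# cluster fsid and the dashboard URL as key=value lines.
bootstrap_summary() {
    awk '
        /^Cluster fsid:/    { print "fsid=" $3 }
        /^[[:space:]]*URL:/ { print "url=" $2 }
    '
}
```

For example, with the output saved in a file named bootstrap.log (an assumed file name): bootstrap_summary < bootstrap.log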

5.2 Copy the cluster's public SSH key to each node

[root@node1 ~]# ssh-copy-id -f -i /etc/ceph/ceph.pub root@node2
[root@node1 ~]# ssh-copy-id -f -i /etc/ceph/ceph.pub root@node3
[root@node1 ~]# ssh-copy-id -f -i /etc/ceph/ceph.pub root@node4

5.3 Add node2, node3, and node4 (docker-ce and python3 must be installed on each node first)

[root@node1 ~]# ceph orch host add node2 172.24.1.7
Added host 'node2' with addr '172.24.1.7'
[root@node1 ~]# ceph orch host add node3 172.24.1.8
Added host 'node3' with addr '172.24.1.8'
[root@node1 ~]# ceph orch host add node4 172.24.1.9
Added host 'node4' with addr '172.24.1.9'
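
With more hosts, typing one ceph orch host add command per node gets tedious. One option is to generate the commands from a list of "IP NAME" lines in the same format as the /etc/hosts entries from step 2.2. The helper name host_add_cmds is hypothetical, and the sketch assumes the input contains only cluster nodes (no localhost lines).

```shell
#!/bin/sh
# Hypothetical helper: read "IP NAME" lines on stdin and print one
# `ceph orch host add NAME IP` command per host. The host named in $1 is
# skipped, since the bootstrap host was already added during bootstrap.
host_add_cmds() {
    awk -v skip="$1" '
        NF >= 2 && $1 !~ /^#/ && $2 != skip {
            print "ceph orch host add " $2 " " $1
        }
    '
}
```

The printed commands can be reviewed first and then piped to sh on node1 once they look right.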

5.4 Add the _admin label to node1 and node4, and copy the Ceph configuration file and keyring to node4

[root@node1 ~]# ceph orch host label add node1 _admin
Added label _admin to host node1
[root@node1 ~]# ceph orch host label add node4 _admin
Added label _admin to host node4
[root@node1 ~]# scp /etc/ceph/{*.conf,*.keyring} root@node4:/etc/ceph
[root@node1 ~]# ceph orch host ls
HOST   ADDR        LABELS  STATUS
node1  172.24.1.6  _admin
node2  172.24.1.7
node3  172.24.1.8
node4  172.24.1.9  _admin
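
To confirm which hosts carry the _admin label without reading the table by eye, the ceph orch host ls output can be filtered. This is a sketch that assumes the plain-text column layout shown above (host name first, labels from the third column on); admin_hosts is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `ceph orch host ls` output on stdin and print
# the name of every host that has the _admin label.
admin_hosts() {
    awk 'NR > 1 { for (i = 3; i <= NF; i++) if ($i == "_admin") print $1 }'
}
```

Usage on node1: ceph orch host ls | admin_hosts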

5.5 Add mon daemons

[root@node1 ~]# ceph orch apply mon "node1,node2,node3"
Scheduled mon update...

5.6 Add mgr daemons

[root@node1 ~]# ceph orch apply mgr --placement="node1,node2,node3"
Scheduled mgr update...
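
Both commands take the same comma-separated host list as their placement. When the list is kept in a script, it can be built once and reused. A minimal sketch (join_hosts is a hypothetical helper name):

```shell
#!/bin/sh
# Hypothetical helper: join the arguments with commas, producing the
# placement string expected by `ceph orch apply`.
# The body runs in a subshell so the IFS change does not leak out.
join_hosts() (
    IFS=,
    printf '%s\n' "$*"
)
```

For example: ceph orch apply mgr --placement="$(join_hosts node1 node2 node3)"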

5.7 Add OSDs

[root@node1 ~]# ceph orch daemon add osd node1:/dev/vdb
[root@node1 ~]# ceph orch daemon add osd node1:/dev/vdc
[root@node1 ~]# ceph orch daemon add osd node1:/dev/vdd
[root@node1 ~]# ceph orch daemon add osd node2:/dev/vdb
[root@node1 ~]# ceph orch daemon add osd node2:/dev/vdc
[root@node1 ~]# ceph orch daemon add osd node2:/dev/vdd
[root@node1 ~]# ceph orch daemon add osd node3:/dev/vdb
[root@node1 ~]# ceph orch daemon add osd node3:/dev/vdc
[root@node1 ~]# ceph orch daemon add osd node3:/dev/vdd
Or, as a single loop:
[root@node1 ~]# for i in node1 node2 node3; do for j in vdb vdc vdd; do ceph orch daemon add osd $i:/dev/$j; done; done
Created osd(s) 0 on host 'node1'
Created osd(s) 1 on host 'node1'
Created osd(s) 2 on host 'node1'
Created osd(s) 3 on host 'node2'
Created osd(s) 4 on host 'node2'
Created osd(s) 5 on host 'node2'
Created osd(s) 6 on host 'node3'
Created osd(s) 7 on host 'node3'
Created osd(s) 8 on host 'node3'

[root@node1 ~]# ceph orch device ls
HOST   PATH      TYPE  DEVICE ID  SIZE   AVAILABLE  REFRESHED  REJECT REASONS
node1  /dev/vdb  hdd              10.7G             4m ago     Insufficient space (<10 extents) on vgs, LVM detected, locked
node1  /dev/vdc  hdd              10.7G             4m ago     Insufficient space (<10 extents) on vgs, LVM detected, locked
node1  /dev/vdd  hdd              10.7G             4m ago     Insufficient space (<10 extents) on vgs, LVM detected, locked
node2  /dev/vdb  hdd              10.7G             3m ago     Insufficient space (<10 extents) on vgs, LVM detected, locked
node2  /dev/vdc  hdd              10.7G             3m ago     Insufficient space (<10 extents) on vgs, LVM detected, locked
node2  /dev/vdd  hdd              10.7G             3m ago     Insufficient space (<10 extents) on vgs, LVM detected, locked
node3  /dev/vdb  hdd              10.7G             90s ago    Insufficient space (<10 extents) on vgs, LVM detected, locked
node3  /dev/vdc  hdd              10.7G             90s ago    Insufficient space (<10 extents) on vgs, LVM detected, locked
node3  /dev/vdd  hdd              10.7G             90s ago    Insufficient space (<10 extents) on vgs, LVM detected, locked
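
The reject reasons listed here appear after the OSDs were created, so they most likely just reflect that each disk is now in use by an OSD rather than indicating a problem. To check quickly that every host reports the expected number of disks, the listing can be summarized per host. This is a sketch under the assumption that the host name is the first column and the device path the second; devices_per_host is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `ceph orch device ls` output on stdin and print
# "HOST COUNT" for each host, counting rows whose second column is a /dev path.
devices_per_host() {
    awk '
        NR > 1 && $2 ~ /^\/dev\// { count[$1]++ }
        END { for (h in count) print h, count[h] }
    ' | sort
}
```

Usage on node1: ceph orch device ls | devices_per_host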

5.8 The Ceph cluster is now deployed

[root@node1 ~]# ceph -s
  cluster:
    id:     0b565668-ace4-11ed-960c-5254000de7a0
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum node1,node2,node3 (age 7m)
    mgr: node1.cxtokn(active, since 14m), standbys: node2.heebcb, node3.fsrlxu
    osd: 9 osds: 9 up (since 59s), 9 in (since 81s)

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   53 MiB used, 90 GiB / 90 GiB avail
    pgs:     1 active+clean
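
The two things to look for in this output are HEALTH_OK and the number of OSDs that are up. A script can make the same check. The following is a sketch that assumes the text layout shown above; cluster_ready is a hypothetical helper name, and the expected OSD count is passed as its argument.

```shell
#!/bin/sh
# Hypothetical helper: read `ceph -s` output on stdin and succeed only if
# health is HEALTH_OK and the number of OSDs that are up equals $1.
cluster_ready() {
    awk -v want="$1" '
        $1 == "health:" { health = $2 }
        # Line format: "osd: 9 osds: 9 up (since 59s), 9 in (since 81s)"
        $1 == "osd:"    { up = $4 }
        END { exit !(health == "HEALTH_OK" && up == want) }
    '
}
```

For the cluster in this article: ceph -s | cluster_ready 9 && echo "cluster is ready"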

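5.9 Manage Ceph from node4

# The Ceph configuration file and keyring were already copied to node4 in step 5.4
[root@node4 ~]# ceph -s
-bash: ceph: command not found
# The ceph command is missing, so ceph-common has to be installed
# Install the Ceph repository
[root@node4 ~]# yum -y install centos-release-ceph-pacific.noarch
# Install ceph-common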
[root@node4 ~]# yum -y install ceph-common
[root@node4 ~]# ceph -s
  cluster:
    id:     0b565668-ace4-11ed-960c-5254000de7a0
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum node1,node2,node3 (age 7m)
    mgr: node1.cxtokn(active, since 14m), standbys: node2.heebcb, node3.fsrlxu
    osd: 9 osds: 9 up (since 59s), 9 in (since 81s)

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   53 MiB used, 90 GiB / 90 GiB avail
    pgs:     1 active+clean
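
Besides the ceph command itself, node4 needs the two files that the bootstrap step wrote on node1: ceph.conf and ceph.client.admin.keyring. If ceph -s fails on an admin node even though ceph-common is installed, checking for these files is a reasonable first step. A minimal sketch (check_admin_files is a hypothetical helper name; the directory defaults to /etc/ceph):

```shell
#!/bin/sh
# Hypothetical helper: report whether the config file and admin keyring are
# readable in the given directory (default /etc/ceph). Returns non-zero if
# either file is missing.
check_admin_files() {
    dir=${1:-/etc/ceph}
    missing=0
    for f in ceph.conf ceph.client.admin.keyring; do
        if [ -r "$dir/$f" ]; then
            echo "found: $dir/$f"
        else
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    return "$missing"
}
```

Usage on node4: check_admin_files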