Quickly Deploying a Ceph Reef (18.2.x) Cluster with cephadm on Ubuntu 22.04 LTS
1. Basic configuration
1. Prerequisites for a cephadm-based deployment. Of the requirements listed upstream, Ubuntu 22.04 LTS ships everything except the container runtime:
- Python 3
- Systemd
- Podman or Docker for running containers
- Time synchronization (such as Chrony or the legacy ntpd)
- LVM2 for provisioning storage devices
Reference: https://docs.ceph.com/en/latest/cephadm/install/#requirements
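A quick way to confirm these prerequisites on each node is to probe for the binaries and services directly. This is an informal check of my own, not an official procedure:
python3 --version                  # Python 3
systemctl --version | head -n1     # systemd
timedatectl status                 # look for "System clock synchronized: yes"
lvm version                        # LVM2
docker --version                   # container runtime (installed in step 3 below)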
2. Set the time zone
timedatectl set-timezone Asia/Shanghai
ll /etc/localtime
3. Install the Docker environment (see your Docker installation and configuration notes; a minimal sketch follows)
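If you do not have separate Docker notes handy, one common way to install Docker on Ubuntu 22.04 is the upstream convenience script (assumes internet access; the distro's docker.io package also works):
curl -fsSL https://get.docker.com | sh    # install Docker Engine
systemctl enable --now docker             # start Docker and enable it at boot
docker --version                          # verify the installation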
4. Add hosts file entries
[root@ceph141 ~]# cat >> /etc/hosts <<EOF
192.168.8.141 ceph141
192.168.8.142 ceph142
192.168.8.143 ceph143
EOF
5. Cluster time synchronization [optional] Reference: https://developer.aliyun.com/article/1604582#3配置时间同步
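If you prefer to set up time synchronization yourself rather than follow the linked article, a minimal chrony sketch (run on every node):
apt -y install chrony              # chrony provides NTP time synchronization
systemctl enable --now chrony      # start it and enable it at boot
chronyc sources -v                 # confirm the node is syncing against a time source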
6. Cluster environment
ceph141:
CPU: 1c
Memory: 2G
/dev/sdb: 300GB
/dev/sdc: 500GB
ceph142:
CPU: 1c
Memory: 2G
/dev/sdb: 300GB
/dev/sdc: 500GB
/dev/sdd: 1TB
ceph143:
CPU: 1c
Memory: 2G
/dev/sdb: 300GB
/dev/sdc: 500GB
2. Bootstrap a new Ceph cluster
1. Download the cephadm that matches the Ceph release you want to install
[root@ceph141 ~]# CEPH_RELEASE=18.2.4
[root@ceph141 ~]# curl --silent --remote-name --location https://download.ceph.com/rpm-${CEPH_RELEASE}/el9/noarch/cephadm
2. Add cephadm to the PATH environment variable
[root@ceph141 ~]# mv cephadm /usr/local/bin/
[root@ceph141 ~]# chmod +x /usr/local/bin/cephadm
[root@ceph141 ~]# ls -l /usr/local/bin/cephadm
-rwxr-xr-x 1 root root 215316 Aug 20 22:19 /usr/local/bin/cephadm
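Optionally, confirm the binary works before bootstrapping. Output varies by release, so treat this as a sanity check rather than canonical output:
cephadm version    # should report the cephadm/Ceph release, e.g. 18.2.x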
3. Create the new cluster
[root@ceph141 ~]# cephadm bootstrap --mon-ip 192.168.8.141 --cluster-network 192.168.8.0/24 --allow-fqdn-hostname
...
Pulling container image quay.io/ceph/ceph:v18...
Ceph version: ceph version 18.2.4 (e7ad5345525c7aa95470c26863873b581076945d) reef (stable)
...
URL: https://ceph141:8443/
User: admin
Password: ii8p7dzqtt
...
sudo /usr/local/bin/cephadm shell --fsid c044ff3c-5f05-11ef-9d8b-51db832765d6 -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring
Or, if you are only running a single cluster on this host:
sudo /usr/local/bin/cephadm shell
Please consider enabling telemetry to help improve Ceph:
ceph telemetry on
For more information see:
https://docs.ceph.com/en/latest/mgr/telemetry/
Bootstrap complete.
Tips:
- 1. This step pulls the container image from the official registry; you can also import the image manually beforehand (see the sketch after this list).
https://docs.ceph.com/en/latest/install/containers/#containers
- 2. Watch the output carefully and record the dashboard login credentials.
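One way to pre-stage the image (for example on hosts without direct internet access) is Docker's standard save/load workflow; the tag below matches the release used in this walkthrough and may need adjusting:
docker pull quay.io/ceph/ceph:v18.2.4                # on a machine with internet access
docker save quay.io/ceph/ceph:v18.2.4 -o ceph-v18.2.4.tar
# copy the tarball to each cluster node, then:
docker load -i ceph-v18.2.4.tar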
4. Initialize the dashboard administrator password
On first login, the dashboard forces you to change the password; set one appropriate for your environment.
Once the password has been changed, you can log in to the dashboard.
Tips:
- 1. Configure the Ceph admin node first.
Reference:
https://developer.aliyun.com/article/1604942
- 2. Besides changing the password in the WebUI, you can also change it on the command line, although it may take a while (roughly 30s-1min) to take effect. [This command is now deprecated upstream.]
[root@ceph141 ~]# echo root123 | ceph dashboard set-login-credentials admin -i -
******************************************************************
*** WARNING: this command is deprecated. ***
*** Please use the ac-user-* related commands to manage users. ***
******************************************************************
Username and password updated
[root@ceph141 ~]#
5. Add or remove hosts in the Ceph cluster
1. Install the ceph CLI
apt install ceph-common
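Note that the ceph-common package in Ubuntu 22.04's own repositories is an older release; if you want the client to match the cluster version, you can let cephadm set up the upstream repository instead (a sketch):
cephadm add-repo --release reef    # adds the download.ceph.com apt repository
cephadm install ceph-common        # installs ceph-common from that repository
ceph -v                            # verify the client version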
2. List the hosts currently in the cluster
[root@ceph141 ~]# ceph orch host ls
HOST ADDR LABELS STATUS
ceph141 192.168.8.141 _admin
1 hosts in cluster
[root@ceph141 ~]#
3. Copy the cluster's public SSH key to the other servers
[root@ceph141 ~]# ssh-copy-id -f -i /etc/ceph/ceph.pub ceph142
[root@ceph141 ~]# ssh-copy-id -f -i /etc/ceph/ceph.pub ceph143
4. Add the new nodes to the cluster
[root@ceph141 ~]# ceph orch host add ceph142 192.168.8.142
Added host 'ceph142' with addr '192.168.8.142'
[root@ceph141 ~]#
[root@ceph141 ~]# ceph orch host add ceph143 192.168.8.143
Added host 'ceph143' with addr '192.168.8.143'
[root@ceph141 ~]#
Tip:
After a host joins the cluster, the data directory "/var/lib/ceph/<Ceph_Cluster_ID>" is created on it automatically.
5. Check the cluster host list again
[root@ceph141 ~]# ceph orch host ls
HOST ADDR LABELS STATUS
ceph141 192.168.8.141 _admin
ceph142 192.168.8.142
ceph143 192.168.8.143
3 hosts in cluster
[root@ceph141 ~]#
Tip:
You can also see how many hosts are in the cluster from the WebUI:
https://ceph141:8443/#/hosts
6. Remove a host [optional; only do this if you actually need to. For a host that is already running daemons, see the drain sketch at the end of this step.]
[root@ceph141 ~]# ceph orch host rm ceph143
Removed host 'ceph143'
[root@ceph141 ~]#
[root@ceph141 ~]# ceph orch host ls
HOST ADDR LABELS STATUS
ceph141 192.168.8.141 _admin
ceph142 192.168.8.142
2 hosts in cluster
[root@ceph141 ~]#
[root@ceph141 ~]#
[root@ceph141 ~]# ceph orch host add ceph143 192.168.8.143 # for the sake of the exercise, add ceph143 back
Added host 'ceph143' with addr '192.168.8.143'
[root@ceph141 ~]#
[root@ceph141 ~]# ceph orch host ls
HOST ADDR LABELS STATUS
ceph141 192.168.8.141 _admin
ceph142 192.168.8.142
ceph143 192.168.8.143
3 hosts in cluster
[root@ceph141 ~]#
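For a host that is actually running daemons (OSDs, MONs, and so on), the safer workflow is to drain it first and remove it only once its daemons are gone. A hedged sketch:
ceph orch host drain ceph143    # schedule removal of all daemons on the host
ceph orch ps ceph143            # repeat until no daemons remain on the host
ceph orch host rm ceph143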
6. Add OSD devices to the Ceph cluster
1. Inspect the environment before adding OSDs
1.1 List the devices available to the cluster [a device must be at least 5GB to be added]
[root@ceph141 ~]# ceph orch device ls
HOST PATH TYPE DEVICE ID SIZE AVAILABLE REFRESHED REJECT REASONS
ceph141 /dev/sdb hdd 300G Yes 3m ago
ceph141 /dev/sdc hdd 500G Yes 3m ago
ceph141 /dev/sr0 hdd VMware_Virtual_SATA_CDRW_Drive_01000000000000000001 1023M No 3m ago Failed to determine if device is BlueStore, Insufficient space (<5GB)
ceph142 /dev/sdb hdd 300G Yes 3m ago
ceph142 /dev/sdc hdd 500G Yes 3m ago
ceph142 /dev/sdd hdd 1000G Yes 3m ago
ceph142 /dev/sr0 hdd VMware_Virtual_SATA_CDRW_Drive_01000000000000000001 1023M No 3m ago Failed to determine if device is BlueStore, Insufficient space (<5GB)
ceph143 /dev/sdb hdd 300G Yes 17s ago
ceph143 /dev/sdc hdd 500G Yes 17s ago
ceph143 /dev/sr0 hdd VMware_Virtual_SATA_CDRW_Drive_01000000000000000001 1023M No 17s ago Failed to determine if device is BlueStore, Insufficient space (<5GB)
[root@ceph141 ~]#
Tips:
For a device to be added to the Ceph cluster, the following conditions must be met:
- 1. The device is not in use (a previously used disk can be wiped first; see the sketch after this list);
- 2. The device is larger than 5GB;
- 3. It can take a while for devices to show up: typically 30 seconds to 3 minutes, and on low-powered lab laptops it has taken as long as 25 or even 40 minutes.
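If a disk was used before (old partitions, LVM signatures, or a previous OSD), it will not be listed as available; you can wipe it with the orchestrator's zap command. A hedged example using this lab's host and device names; all data on the device is destroyed:
ceph orch device zap ceph141 /dev/sdb --force    # WARNING: destroys all data on the device
ceph orch device ls --refresh                    # re-scan; the device should now show as available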
1.2 Check the free devices on each node
[root@ceph141 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
...
sdb 8:16 0 300G 0 disk
sdc 8:32 0 500G 0 disk
...
[root@ceph142 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
...
sdb 8:16 0 300G 0 disk
sdc 8:32 0 500G 0 disk
sdd 8:48 0 1000G 0 disk
...
[root@ceph143 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
...
sdb 8:16 0 300G 0 disk
sdc 8:32 0 500G 0 disk
...
[root@ceph143 ~]#
1.3 List the OSDs
[root@ceph141 ~]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 0 root default
2. Add the OSD devices to the cluster
2.1 Add each node's devices to the Ceph cluster
[root@ceph141 ~]# ceph orch daemon add osd ceph141:/dev/sdb
Created osd(s) 0 on host 'ceph141'
[root@ceph141 ~]# ceph orch daemon add osd ceph141:/dev/sdc
Created osd(s) 1 on host 'ceph141'
[root@ceph141 ~]# ceph orch daemon add osd ceph142:/dev/sdb
Created osd(s) 2 on host 'ceph142'
[root@ceph141 ~]# ceph orch daemon add osd ceph142:/dev/sdc
Created osd(s) 3 on host 'ceph142'
[root@ceph141 ~]# ceph orch daemon add osd ceph143:/dev/sdb
Created osd(s) 4 on host 'ceph143'
[root@ceph141 ~]# ceph orch daemon add osd ceph143:/dev/sdc
Created osd(s) 5 on host 'ceph143'
[root@ceph141 ~]# ceph orch daemon add osd ceph142:/dev/sdd
Created osd(s) 6 on host 'ceph142'
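Instead of adding each disk by hand as above, the orchestrator can also consume every eligible device automatically. A hedged alternative; note that it keeps creating OSDs on any new blank disk until you mark the service unmanaged:
ceph orch apply osd --all-available-devices                     # create OSDs on all available devices
ceph orch apply osd --all-available-devices --unmanaged=true    # optional: stop consuming new disks automatically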
Tips:
- 1. This step records, in "/var/lib/ceph/<Ceph_Cluster_ID>/osd.<OSD_ID>/fsid", which local disk device identifier each Ceph OSD ID maps to.
- 2. For example, the disk-to-OSD mapping on node ceph142 looks like this:
[root@ceph142 ~]# ll -d /var/lib/ceph/3cb12fba-5f6e-11ef-b412-9d303a22b70f/osd.*
drwx------ 2 167 167 4096 Aug 21 15:18 /var/lib/ceph/3cb12fba-5f6e-11ef-b412-9d303a22b70f/osd.2/
drwx------ 2 167 167 4096 Aug 21 15:19 /var/lib/ceph/3cb12fba-5f6e-11ef-b412-9d303a22b70f/osd.3/
drwx------ 2 167 167 4096 Aug 21 15:22 /var/lib/ceph/3cb12fba-5f6e-11ef-b412-9d303a22b70f/osd.6/
[root@ceph142 ~]# cat /var/lib/ceph/3cb12fba-5f6e-11ef-b412-9d303a22b70f/osd.*/fsid
68ff55fb-358a-4014-ba0e-075adb18c6d9
b9096186-53af-4ca0-b233-01fd913bdaba
d4ccefb2-5812-4ca2-97ca-9642ff4539f2
[root@ceph142 ~]# lsblk # note that Ceph manages the disks through LVM underneath
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
...
sdb 8:16 0 300G 0 disk
└─ceph--bb7e7dd0--d4e2--4da2--9cfb--da1dcd70222d-osd--block--68ff55fb--358a--4014--ba0e--075adb18c6d9
253:1 0 300G 0 lvm
sdc 8:32 0 500G 0 disk
└─ceph--5b511438--e561--456f--a33e--82bfc9c4abfd-osd--block--b9096186--53af--4ca0--b233--01fd913bdaba
253:2 0 500G 0 lvm
sdd 8:48 0 1000G 0 disk
└─ceph--0d9e77e1--051d--4ba6--8274--cfd85e213ab9-osd--block--d4ccefb2--5812--4ca2--97ca--9642ff4539f2
253:3 0 1000G 0 lvm
...
[root@ceph142 ~]#
2.2 Check the cluster's total OSD capacity
[root@ceph141 ~]# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
-1 3.32048 root default
-3 0.78130 host ceph141
0 hdd 0.29300 osd.0 up 1.00000 1.00000
1 hdd 0.48830 osd.1 up 1.00000 1.00000
-5 1.75789 host ceph142
2 hdd 0.29300 osd.2 up 1.00000 1.00000
3 hdd 0.48830 osd.3 up 1.00000 1.00000
6 hdd 0.97659 osd.6 up 1.00000 1.00000
-7 0.78130 host ceph143
4 hdd 0.29300 osd.4 up 1.00000 1.00000
5 hdd 0.48830 osd.5 up 1.00000 1.00000
[root@ceph141 ~]#
2.3 Check the cluster size
[root@ceph141 ~]# ceph -s
cluster:
id: 3cb12fba-5f6e-11ef-b412-9d303a22b70f
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph141,ceph142,ceph143 (age 19m)
mgr: ceph141.cwgrgj(active, since 3h), standbys: ceph142.ymuzfe
osd: 7 osds: 7 up (since 2m), 7 in (since 3m)
data:
pools: 1 pools, 1 pgs
objects: 2 objects, 449 KiB
usage: 188 MiB used, 3.3 TiB / 3.3 TiB avail # note: the cluster's total capacity is 3.3 TiB
pgs: 1 active+clean
3. You can also view the OSD information in the dashboard
https://ceph141:8443/#/osd
3. Test cluster availability
1. Test the cluster
1. Create a storage pool
[root@ceph141 ~]# ceph osd pool create p1
pool 'p1' created
2. Upload a file to the pool
[root@ceph141 ~]# rados -p p1 put sys.txt /etc/os-release
3. List the objects in the pool
[root@ceph141 ~]# rados -p p1 ls
sys.txt
4. Check the object's status information
[root@ceph141 ~]# rados -p p1 stat sys.txt
p1/sys.txt mtime 2024-08-25T09:49:58.000000+0800, size 386
[root@ceph141 ~]#
5. See which OSDs hold the PG's replicas
[root@ceph141 ~]# ceph osd map p1 sys.txt # the three replicas of this object's PG are on OSDs 2, 1, and 5
osdmap e64 pool 'p1' (2) object 'sys.txt' -> pg 2.486f5322 (2.2) -> up ([2,1,5], p2) acting ([2,1,5], p2)
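The three OSDs come from the pool's replication factor, which you can confirm via the pool's size attribute. A small hedged check:
ceph osd pool get p1 size        # expected: size: 3 (default replicated pool)
ceph osd pool get p1 min_size    # minimum replicas required to serve I/O, typically 2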
6. Download the object from the pool to a local file
[root@ceph141 ~]# rados -p p1 get sys.txt a.txt
[root@ceph141 ~]# ls
a.txt
[root@ceph141 ~]# cat a.txt
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
7. Delete the object
[root@ceph141 ~]# rados -p p1 rm sys.txt
[root@ceph141 ~]# rados -p p1 ls
[root@ceph141 ~]# ceph osd map p1 sys.txt # the mapping information is still there after the object is deleted; how to remove it will be covered later
osdmap e64 pool 'p1' (2) object 'sys.txt' -> pg 2.486f5322 (2.2) -> up ([2,1,5], p2) acting ([2,1,5], p2)
2. Snapshot the Ceph cluster VMs
Tip:
Shut the VMs down and take snapshots!
3. Recommended reading
Ceph OSD management basics and scaling OSD nodes in and out:
https://developer.aliyun.com/article/1604940
How cephadm accesses a Ceph cluster and an admin node configuration example:
https://developer.aliyun.com/article/1604942