Troubleshooting a rook-ceph OSD down issue
The problem initially showed up as follows:
```
[root@rook-ceph-tools-78cdfd976c-dhrlx /]# ceph osd tree
ID  CLASS WEIGHT   TYPE NAME        STATUS REWEIGHT PRI-AFF
-1        15.00000 root default
-11        3.00000     host master1
  4   hdd  1.00000         osd.4        up  1.00000 1.00000
  9   hdd  1.00000         osd.9      down        0 1.00000
 14   hdd  1.00000         osd.14       up  1.00000 1.00000
```
Checking the Ceph cluster status, we also find: 37 daemons have recently crashed.
```
[root@rook-ceph-tools-78cdfd976c-dhrlx osd]# ceph -s
  cluster:
    id:     f65c0ebc-0ace-4181-8061-abc2d1d581e9
    health: HEALTH_WARN
            37 daemons have recently crashed

  services:
    mon: 3 daemons, quorum a,c,g (age 9m)
    mgr: a(active, since 13d)
    mds: 1/1 daemons up, 1 hot standby
    osd: 15 osds: 14 up (since 10m), 14 in (since 2h)

  data:
    volumes: 1/1 healthy
    pools:   4 pools, 97 pgs
    objects: 20.64k objects, 72 GiB
    usage:   216 GiB used, 14 TiB / 14 TiB avail
    pgs:     97 active+clean

  io:
    client: 8.8 KiB/s rd, 1.2 MiB/s wr, 2 op/s rd, 49 op/s wr
```
This warning looks like historical failure information, so list the recent crashes:
```
ceph crash ls-new
2022-05-13T01:46:58.600474Z_11da8241-7462-49b5-8ab6-83e96d0dd1d9
```
View the crash details:
```
ceph crash info 2022-05-13T01:46:58.600474Z_11da8241-7462-49b5-8ab6-83e96d0dd1d9
```
The relevant excerpt from the crashed OSD's log:
```
 -2393> 2020-05-13 10:24:55.180 7f5d5677aa80 -1 Falling back to public interface
 -1754> 2020-05-13 10:25:07.419 7f5d5677aa80 -1 osd.2 875 log_to_monitors {default=true}
 -1425> 2020-05-13 10:25:07.803 7f5d48d7c700 -1 osd.2 875 set_numa_affinity unable to identify public interface 'eth0' numa node: (2) No such file or directory
    -2> 2020-05-13 10:25:23.731 7f5d4436d700 -1 rocksdb: submit_common error: Corruption: block checksum mismatch: expected 717694145, got 2263389519 in db/001499.sst offset 43727772 size 3899 code = 2 Rocksdb transaction:
    -1> 2020-05-13 10:25:23.735 7f5d4436d700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.9/rpm/el7/BUILD/ceph-14.2.9/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::_kv_sync_thread()' thread 7f5d4436d700 time 2020-05-13 10:25:23.733456
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.9/rpm/el7/BUILD/ceph-14.2.9/src/os/bluestore/BlueStore.cc: 11016: FAILED ceph_assert(r == 0)

 ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14a) [0x56297aa20f7d]
 2: (()+0x4cb145) [0x56297aa21145]
 3: (BlueStore::_kv_sync_thread()+0x11c3) [0x56297af95233]
 4: (BlueStore::KVSyncThread::entry()+0xd) [0x56297afba3fd]
 5: (()+0x7e65) [0x7f5d537bfe65]
 6: (clone()+0x6d) [0x7f5d5268388d]

     0> 2020-05-13 10:25:23.735 7f5d4436d700 -1 *** Caught signal (Aborted) **
 in thread 7f5d4436d700 thread_name:bstore_kv_sync

 ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) nautilus (stable)
 1: (()+0xf5f0) [0x7f5d537c75f0]
 2: (gsignal()+0x37) [0x7f5d525bb337]
 3: (abort()+0x148) [0x7f5d525bca28]
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x199) [0x56297aa20fcc]
 5: (()+0x4cb145) [0x56297aa21145]
 6: (BlueStore::_kv_sync_thread()+0x11c3) [0x56297af95233]
 7: (BlueStore::KVSyncThread::entry()+0xd) [0x56297afba3fd]
 8: (()+0x7e65) [0x7f5d537bfe65]
 9: (clone()+0x6d) [0x7f5d5268388d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
```
The root cause is the RocksDB error "rocksdb: submit_common error: Corruption: block checksum mismatch: expected 717694145, got 2263389519 in db/001499.sst offset 43727772 size 3899 code = 2 Rocksdb transaction". The resulting ceph_assert failure means the OSD process can never start. So how do we fix this block checksum mismatch?
Problem analysis
The key word in the error above is rocksdb. What is that? Ceph's storage backend used to default to FileStore; to improve performance it has since been replaced by BlueStore, and BlueStore keeps its metadata in RocksDB. In other words, the metadata of this OSD's BlueStore backend is corrupted!
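Before tearing the OSD down, it may be worth attempting an offline check/repair of the BlueStore metadata. This is only a hedged sketch: checksum corruption like the one above usually cannot be repaired, the OSD daemon must be stopped first, and the data path /var/lib/ceph/osd/ceph-9 is an assumption based on the standard layout inside the Rook OSD pod.

```bash
# Offline consistency check of the OSD's BlueStore/RocksDB metadata,
# followed by a best-effort repair (run with the OSD daemon stopped).
ceph-bluestore-tool fsck   --path /var/lib/ceph/osd/ceph-9
ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-9
```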
Resolution steps
Since the metadata cannot be recovered in place, the approach is to remove the affected OSD and then add it back.
1. Check the current OSD status
```
[root@rook-ceph-tools-7bbsyszux-584k5 /]# ceph osd status
+----+------+-------+-------+--------+---------+--------+---------+----------------+
| id | host | used  | avail | wr ops | wr data | rd ops | rd data | state          |
+----+------+-------+-------+--------+---------+--------+---------+----------------+
| 0  | ai05 | 299G  | 3426G |    0   |     0   |    5   |   382k  | exists,up      |
| 1  | ai05 | 178G  | 3547G |    0   |    18   |    2   |  1110k  | exists,up      |
| 2  | ai03 | 108G  | 3617G |    0   |   944   |    5   |  84.0k  | exists,up      |
| 3  | ai01 | 438G  | 3287G |    0   |   763   |    7   |   708k  | exists,up      |
| 4  | ai03 | 217G  | 3508G |    0   |   339   |    7   |  63.6k  | exists,up      |
| 5  | ai02 | 217G  | 2576G |    1   |  10.9k  |    6   |   403k  | exists,up      |
| 6  | ai04 | 300G  | 3425G |   15   |   100k  |    7   |   161k  | exists,up      |
| 7  | ai03 | 109G  | 3616G |    0   |     0   |    0   |     0   | exists,up      |
| 8  | ai02 | 246G  | 3479G |    1   |  23.6k  |    2   |   813k  | exists,up      |
| 9  |      |   0   |   0   |    0   |     0   |    0   |     0   | autoout,exists |
| 10 | ai03 | 136G  | 3589G |    0   |   741   |    4   |   679k  | exists,up      |
| 11 | ai03 | 162G  | 3563G |    0   |  22.2k  |    4   |   824k  | exists,up      |
| 12 | ai03 | 55.7G | 3670G |    0   |     0   |    2   |   952k  | exists,up      |
| 13 | ai01 | 194G  | 3531G |    0   |   130k  |    3   |  37.9k  | exists,up      |
+----+------+-------+-------+--------+---------+--------+---------+----------------+
```
2. Mark the failed OSD as out
```
[root@rook-ceph-tools-7gemfield-584k5 /]# ceph osd out osd.9
osd.2 is already out.
```
3. Find the disk backing the OSD
From the OSD pod's spec, grab the OSD's UUID, then match it against lsblk to find the backing disk (here /dev/vdc):
```
[root@master1 ~]# kubectl get po rook-ceph-osd-9-7dd6fc544c-4vhtm -n rook-ceph -o yaml | grep UUID
    - name: ROOK_OSD_UUID
      ... OSD_ID="$ROOK_OSD_ID"\nOSD_UUID=052383d6-90ca-4ea1-a9c0-bcb0c43d8317\nOSD_STORE_FLAG="--bluestore"\nOSD_DATA_DIR=/var/lib/ceph/osd/ceph-"$OSD_ID"\nCV_MODE=lvm\nDEVICE="$ROOK_BLOCK_PATH" ...

[root@master1 ~]# lsblk | grep -C2 052383d6
rbd8    251:128  0    5G  0 disk /var/lib/kubelet/pods/7b39990a-ea1c-4f00-a767-a9fbc4a19ecd/volumes/kubernetes.io~csi/pvc-f78f0dd9-188c-4d02-aed0-03f25ed4d0a0/mount
vdc     252:32   0    1T  0 disk
└─ceph--66c4c661--cf98--417b--afda--f79c3de1204c-osd--block--052383d6--90ca--4ea1--a9c0--bcb0c43d8317
        253:3    0 1024G  0 lvm
rbd12   251:192  0   10G  0 disk /var/lib/kubelet/pods/bfc62153-6844-498c-92f0-e86d09e8a7cc/volumes/kubernetes.io~csi/pvc-051b9632-fe52-4201-9572-79a75793ffb5/mount
rbd6    251:96   0    5G  0 disk /var/lib/kubelet/pods/b36acdab-1a0c-4ce4-b5a6-7aca039514ed/volumes/kubernetes.io~csi/pvc-7f6a160b-0e8e-46f8-989e-531667a13a3a/mount
```
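As a cross-check, the device an OSD was built on is also recorded in the cluster itself. A hedged sketch run from the toolbox pod (the field names assume a BlueStore OSD that registered its metadata before crashing):

```bash
# Print the host and backing device recorded for osd.9.
ceph osd metadata osd.9 | grep -E '"hostname"|"devices"|"bluestore_bdev_partition_path"'
```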
Check whether there are any hardware errors; here nothing specific shows up for the disk:
```
[root@master1 ~]# dmesg | grep vdc
[    2.630026] virtio_blk virtio3: [vdc] 2147483648 512-byte logical blocks (1.10 TB/1.00 TiB)
```
Check the device information Ceph has recorded for this OSD:
```
[root@rook-ceph-tools-78cdfd976c-dhrlx /]# ceph device ls-by-daemon osd.9
DEVICE           HOST:DEV     EXPECTED FAILURE
4033036832428-3  master1:vdc
```
4. Confirm that this really is the right disk
Be careful here: do not wipe the wrong disk.
```
gemfield@ai04:~$ sudo hdparm -I /dev/vdc | grep 4033036832428-3
	Serial Number:      4033036832428-3
```
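Since hdparm does not report a serial for every disk type (vdc here is a virtio device), the serial and model can also be cross-checked with lsblk or the by-id symlinks; a hedged sketch:

```bash
# Cross-check the serial/model of the candidate disk before wiping it.
lsblk -o NAME,SIZE,TYPE,SERIAL,MODEL /dev/vdc
ls -l /dev/disk/by-id/ | grep vdc
```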
5. Purge osd.9

The --force flag is required here:
```
[root@rook-ceph-tools-7bb5797c8-ns4bw /]# ceph osd purge osd.9 --force
[root@rook-ceph-tools-7bb5797c8-ns4bw /]# ceph auth del osd.9    # clean up the auth entry
```
6. Remove the OSD's pod
Because removeOSDsIfOutAndSafeToRemove is not set (it defaults to false), the failed OSD's deployment is not removed automatically, so rook-ceph-osd-9 has to be deleted by hand; the kubectl command follows the sketch below.
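For reference, this flag lives at spec.removeOSDsIfOutAndSafeToRemove in the CephCluster custom resource. A hedged sketch of how to check its current value, assuming the CR is named rook-ceph in the rook-ceph namespace:

```bash
# Show the current value of the auto-removal flag on the CephCluster CR.
kubectl -n rook-ceph get cephcluster rook-ceph \
  -o jsonpath='{.spec.removeOSDsIfOutAndSafeToRemove}{"\n"}'
```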
```
[root@master1 ~]# kubectl -n rook-ceph delete deployment rook-ceph-osd-9
deployment.apps "rook-ceph-osd-9" deleted
```
7. Completely wipe vdc
```
[root@master1 ~]# DISK="/dev/vdc"
[root@master1 ~]# sudo sgdisk --zap-all $DISK
[root@master1 ~]# sudo dd if=/dev/zero of="$DISK" bs=1M count=100 oflag=direct,dsync
# Note: for an SSD, use blkdiscard /dev/vdc instead

[root@master1 ~]# ls /dev/mapper/ceph-*
/dev/mapper/ceph--971efece--8880--4e81--90c6--621493c66294-osd--data--7775b10e--7a0d--4ddd--aaf7--74c4498552ff
/dev/mapper/ceph--a7d7b063--7092--4698--a832--1cdd1285acbd-osd--data--ec2df8ee--0a7a--407f--afe3--41d045e889a9

# Clean up the LVM leftovers: remove the corresponding logical volume
[root@master1 ~]# sudo dmsetup remove /dev/mapper/ceph--a7d7b063--7092--4698--a832--1cdd1285acbd-osd--data--ec2df8ee--0a7a--407f--afe3--41d045e889a9

# Only one mapping should remain
[root@master1 ~]# ls /dev/mapper/ceph-*
/dev/mapper/ceph--971efece--8880--4e81--90c6--621493c66294-osd--data--7775b10e--7a0d--4ddd--aaf7--74c4498552ff

# Make sure only one symlink remains under /dev as well
[root@master1 ~]# ls -l /dev/ceph-*
total 0
lrwxrwxrwx 1 root root 7 May 15 20:14 osd-data-7775b10e-7a0d-4ddd-aaf7-74c4498552ff ->

[root@master1 ~]# partprobe /dev/vdc
```
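If LVM still knows about the wiped disk afterwards, the leftover volume group can be removed as well. A hedged sketch; the VG name used here is merely derived from the device-mapper name removed above, so check what vgs actually reports:

```bash
# List any remaining ceph volume groups, then drop the one that lived on vdc.
sudo vgs | grep ceph
sudo vgremove -f ceph-a7d7b063-7092-4698-a832-1cdd1285acbd
sudo pvremove /dev/vdc    # the PV label may already be gone after the dd above
```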
8. Restart the ceph operator so it detects the freshly wiped disk; once the new OSD starts, the Ceph cluster will rebalance data automatically
```
kubectl rollout restart deploy rook-ceph-operator -n rook-ceph
```
This forces Rook to re-run its discovery and OSD provisioning process.
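While waiting, the re-provisioning can be followed by watching the osd-prepare jobs and the new OSD pod; a sketch assuming the default Rook labels:

```bash
# Watch the prepare pods run and the new OSD pod come up.
kubectl -n rook-ceph get pods -l app=rook-ceph-osd-prepare -w
kubectl -n rook-ceph get pods -l app=rook-ceph-osd -w
```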
Once that has completed, check the cluster status again:
```
[root@master1 ~]# kubectl get po -n rook-ceph
NAME                                                READY   STATUS      RESTARTS   AGE
csi-cephfsplugin-6rrgv                              3/3     Running     15         167d
csi-cephfsplugin-6t7kg                              3/3     Running     15         167d
csi-cephfsplugin-7ksh2                              3/3     Running     15         167d
csi-cephfsplugin-mr5z7                              3/3     Running     21         167d
csi-cephfsplugin-provisioner-7bcbf457c5-hv5nv       6/6     Running     284        167d
csi-cephfsplugin-provisioner-7bcbf457c5-qk9t6       6/6     Running     23         45d
csi-cephfsplugin-zsf6w                              3/3     Running     30         167d
csi-rbdplugin-5tsqc                                 3/3     Running     19         167d
csi-rbdplugin-8d6m5                                 3/3     Running     15         167d
csi-rbdplugin-998lx                                 3/3     Running     15         167d
csi-rbdplugin-jx676                                 3/3     Running     30         167d
csi-rbdplugin-njmtd                                 3/3     Running     21         167d
csi-rbdplugin-provisioner-69f65b7897-jh88t          6/6     Running     54         45d
csi-rbdplugin-provisioner-69f65b7897-qxpdr          6/6     Running     65         45d
rook-ceph-crashcollector-master1-84899f577b-fnf5f   1/1     Running     3          45d
rook-ceph-crashcollector-master2-6f7c4fb8d5-lzkf7   1/1     Running     3          45d
rook-ceph-crashcollector-master3-695b549f6b-gtpx7   1/1     Running     3          128d
rook-ceph-crashcollector-node1-67458cc896-pf6nx     1/1     Running     3          49d
rook-ceph-crashcollector-node2-5458f6f68c-nsd84     1/1     Running     3          42d
rook-ceph-mds-myfs-a-58f484bd6b-wxzts               1/1     Running     86         45d
rook-ceph-mds-myfs-b-669b684d78-mqfct               1/1     Running     13         128d
rook-ceph-mgr-a-85954dfbc5-zxtmk                    1/1     Running     8          128d
rook-ceph-mon-a-5ff4694d9-dc6v6                     1/1     Running     4          54m
rook-ceph-mon-c-868f4547cc-s97vv                    1/1     Running     12         167d
rook-ceph-mon-g-fb46bdf77-g5k98                     1/1     Running     10         49d
rook-ceph-operator-74646576d7-bkcq7                 1/1     Running     0          67m
rook-ceph-osd-0-5d94784b45-xr5fr                    1/1     Running     6          51d
rook-ceph-osd-1-98b84c76-5w6s8                      1/1     Running     4          42d
rook-ceph-osd-10-75c65bc759-wkzjz                   1/1     Running     4          42d
rook-ceph-osd-11-855495cf97-dvwp9                   1/1     Running     7          51d
rook-ceph-osd-12-7d55b9ddbd-hqbb4                   1/1     Running     10         49d
rook-ceph-osd-13-6bfc5b744-mhxw9                    1/1     Running     13         167d
rook-ceph-osd-14-7cd656d799-shtnr                   1/1     Running     118        45d
rook-ceph-osd-2-56c45f9db4-lzgbn                    1/1     Running     9          49d
rook-ceph-osd-3-6d9bdb7fd6-r6cgw                    1/1     Running     13         167d
rook-ceph-osd-4-5c8fb468c7-c6v9x                    1/1     Running     61         45d
rook-ceph-osd-5-85b7ff6578-zjgmw                    1/1     Running     6          51d
rook-ceph-osd-6-67dfcbc7c9-5vtjx                    1/1     Running     5          42d
rook-ceph-osd-7-5d86487c7-dnmkv                     1/1     Running     9          49d
rook-ceph-osd-8-5648594c55-gs7bb                    1/1     Running     13         167d
rook-ceph-osd-9-7dd6fc544c-7pw8t                    1/1     Running     0          16s
rook-ceph-osd-prepare-master1-qh9j9                 0/1     Completed   0          58m
rook-ceph-osd-prepare-master2-2d9q7                 0/1     Completed   0          58m
rook-ceph-osd-prepare-master3-pndv9                 0/1     Completed   0          58m
rook-ceph-osd-prepare-node1-5dbdq                   0/1     Completed   0          58m
rook-ceph-osd-prepare-node2-4lk9l                   0/1     Completed   0          58m
rook-ceph-tools-78cdfd976c-dhrlx                    1/1     Running     3          45d

[root@rook-ceph-tools-78cdfd976c-dhrlx /]# ceph -s
  cluster:
    id:     f65c0ebc-0ace-4181-8061-abc2d1d581e9
    health: HEALTH_OK

[root@rook-ceph-tools-78cdfd976c-dhrlx /]# ceph osd tree
ID  CLASS WEIGHT   TYPE NAME        STATUS REWEIGHT PRI-AFF
-1        15.00000 root default
-11        3.00000     host master1
  4   hdd  1.00000         osd.4        up  1.00000 1.00000
  9   hdd  1.00000         osd.9        up  1.00000 1.00000
 14   hdd  1.00000         osd.14       up  1.00000 1.00000
 -7        3.00000     host master2
  0   hdd  1.00000         osd.0        up  1.00000 1.00000
  5   hdd  1.00000         osd.5        up  1.00000 1.00000
 11   hdd  1.00000         osd.11       up  1.00000 1.00000
 -9        3.00000     host master3
  3   hdd  1.00000         osd.3        up  1.00000 1.00000
  8   hdd  1.00000         osd.8        up  1.00000 1.00000
 13   hdd  1.00000         osd.13       up  1.00000 1.00000
 -5        3.00000     host node1
  2   hdd  1.00000         osd.2        up  1.00000 1.00000
  7   hdd  1.00000         osd.7        up  1.00000 1.00000
 12   hdd  1.00000         osd.12       up  1.00000 1.00000
 -3        3.00000     host node2
  1   hdd  1.00000         osd.1        up  1.00000 1.00000
  6   hdd  1.00000         osd.6        up  1.00000 1.00000
 10   hdd  1.00000         osd.10       up  1.00000 1.00000
```
At this point the rook-ceph cluster is healthy again.
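One last point: the "37 daemons have recently crashed" warning only disappears once the old crash reports are acknowledged. If it still shows up after the OSD is back, the reports can be archived from the toolbox pod:

```bash
# Acknowledge a single crash report, or all of them at once.
ceph crash archive 2022-05-13T01:46:58.600474Z_11da8241-7462-49b5-8ab6-83e96d0dd1d9
ceph crash archive-all
```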
Summary

A corrupted RocksDB (the BlueStore metadata store) left osd.9 unable to start. Since the corruption could not be repaired, the OSD was marked out and purged from the cluster, its deployment deleted, the backing disk /dev/vdc wiped, and the Rook operator restarted to re-provision it, after which the cluster returned to HEALTH_OK.