1. Scenario
Current state of the cluster
```
# ceph -s
    cluster e6ccdfaa-a729-4638-bcde-e539b1e7a28d
     health HEALTH_OK
     monmap e1: 3 mons at {bdc2=172.16.251.2:6789/0,bdc3=172.16.251.3:6789/0,bdc4=172.16.251.4:6789/0}
            election epoch 82, quorum 0,1,2 bdc2,bdc3,bdc4
     osdmap e3132: 27 osds: 26 up, 26 in
            flags sortbitwise
      pgmap v13259021: 4096 pgs, 4 pools, 2558 GB data, 638 kobjects
            7631 GB used, 89048 GB / 96680 GB avail
                4096 active+clean
  client io 34720 kB/s wr, 0 op/s rd, 69 op/s wr
```
You can see that the cluster health is OK, but of the 27 OSDs only 26 are up and 26 are in, so one OSD is down and out.

> Supplementary knowledge: OSD states
> up: the daemon is running and can serve IO; down: the daemon is not running and cannot serve IO; in: the OSD holds data; out: the OSD holds no data.
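If you just want these counts at a glance, `ceph osd stat` prints the same summary line that appears in `ceph -s` above; a small sketch:

```
ceph osd stat    # e.g. "osdmap e3132: 27 osds: 26 up, 26 in" for this cluster
```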
```
# ceph osd tree | grep down
 0 3.63129        osd.0      down        0          1.00000
```
This means the osd.0 process is not running and the OSD no longer holds any data the cluster needs: whatever it used to store has been re-replicated elsewhere by the ceph cluster, which can be verified.
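One quick way to check that no placement group still maps to osd.0 is a grep over the brief PG dump. This is a hedged one-liner; the post does a fuller check with `ceph pg dump` further below:

```
# list any PG whose up/acting set still contains osd.0 -- expect no output
ceph pg dump pgs_brief 2>/dev/null | grep -E '(\[|,)0(,|\])'
```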
- First, check the osd.0 daemon
```
# systemctl status ceph-osd@0
● ceph-osd@0.service - Ceph object storage daemon
   Loaded: loaded (/usr/lib/systemd/system/ceph-osd@.service; enabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Thu 2017-04-06 09:26:04 CST; 1h 2min ago
  Process: 480723 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph (code=exited, status=1/FAILURE)
  Process: 480669 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
 Main PID: 480723 (code=exited, status=1/FAILURE)

Apr 06 09:26:04 bdc2 systemd[1]: Unit ceph-osd@0.service entered failed state.
Apr 06 09:26:04 bdc2 systemd[1]: ceph-osd@0.service failed.
Apr 06 09:26:04 bdc2 systemd[1]: ceph-osd@0.service holdoff time over, scheduling restart.
Apr 06 09:26:04 bdc2 systemd[1]: start request repeated too quickly for ceph-osd@0.service
Apr 06 09:26:04 bdc2 systemd[1]: Failed to start Ceph object storage daemon.
Apr 06 09:26:04 bdc2 systemd[1]: Unit ceph-osd@0.service entered failed state.
Apr 06 09:26:04 bdc2 systemd[1]: ceph-osd@0.service failed.
```
View logs for osd.0
```
# tail -f /var/log/ceph/ceph-osd.0.log
2017-04-06 09:26:04.531004 7f75f33d5800  0 filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2017-04-06 09:26:04.531520 7f75f33d5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
2017-04-06 09:26:04.531528 7f75f33d5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2017-04-06 09:26:04.531548 7f75f33d5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: splice is supported
2017-04-06 09:26:04.532318 7f75f33d5800  0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
2017-04-06 09:26:04.532384 7f75f33d5800  0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize is disabled by conf
2017-04-06 09:26:04.730841 7f75f33d5800 -1 filestore(/var/lib/ceph/osd/ceph-0) Error initializing leveldb : IO error: /var/lib/ceph/osd/ceph-0/current/omap/MANIFEST-004467: Input/output error
2017-04-06 09:26:04.730870 7f75f33d5800 -1 osd.0 0 OSD:init: unable to mount object store
2017-04-06 09:26:04.730879 7f75f33d5800 -1  ** ERROR: osd init failed: (1) Operation not permitted
```
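The leveldb Input/output error points at a problem below Ceph, in the disk or filesystem. Before rebuilding the OSD it can be worth confirming whether the underlying device is reporting errors; a hedged sketch, assuming the device is /dev/sdc (as the `df -h` output later in this post shows) and that smartmontools is installed:

```
dmesg | grep -iE 'sdc|i/o error'   # kernel-level I/O errors involving the device
smartctl -H /dev/sdc               # overall SMART health verdict (needs smartmontools)
```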
Next, look at the data on disk
```
# cd /var/lib/ceph/osd/ceph-0/current
# ls -lrt | tail -10
drwxr-xr-x  2 ceph ceph  58 Apr  2 00:45 4.2e9_TEMP
drwxr-xr-x  2 ceph ceph  58 Apr  2 00:45 4.355_TEMP
drwxr-xr-x  2 ceph ceph  58 Apr  2 00:45 4.36c_TEMP
drwxr-xr-x  2 ceph ceph  58 Apr  2 00:45 4.3ae_TEMP
drwxr-xr-x  2 ceph ceph  58 Apr  2 00:46 4.3b2_TEMP
drwxr-xr-x  2 ceph ceph  58 Apr  2 00:46 4.3e8_TEMP
drwxr-xr-x  2 ceph ceph  58 Apr  2 00:46 4.3ea_TEMP
-rw-r--r--. 1 ceph ceph  10 Apr  2 08:53 commit_op_seq
drwxr-xr-x. 2 ceph ceph 349 Apr  5 10:01 omap
-rw-r--r--. 1 ceph ceph   0 Apr  6 09:26 nosnap
```
Pick two PGs at random and look them up
```
# ceph pg dump | grep 4.3ea
dumped all in format plain
4.3ea   2   0   0   0   0   8388608    254    254   active+clean   2017-04-06 01:55:04.754593   1322'254    3132:122   [26,2,12]   26   [26,2,12]   26   1322'254    2017-04-06 01:55:04.754546   1322'254    2017-04-02 00:46:12.611726
# ceph pg dump | grep 4.3e8
dumped all in format plain
4.3e8   1   0   0   0   0   4194304   1226   1226   active+clean   2017-04-06 01:26:43.827061   1323'1226   3132:127   [2,15,5]    2    [2,15,5]    2    1323'1226   2017-04-06 01:26:43.827005   1323'1226   2017-04-06 01:26:43.827005
```
You can see that the three replicas of 4.3ea and 4.3e8 live on OSDs [26,2,12] and [2,15,5] respectively, so neither acting set includes osd.0.
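To look up a single PG's placement without grepping the whole dump, `ceph pg map` prints its up and acting sets directly:

```
ceph pg map 4.3ea   # shows the osdmap epoch plus the up and acting sets, e.g. [26,2,12]
ceph pg map 4.3e8
```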
- Summary:
> To sum up: the ceph cluster itself is back in a normal state and has already kicked osd.0 out; the osd.0 daemon will not start, restarting the service only reproduces the errors in the log (no fix found), and the data it used to hold is no longer needed by the cluster. In other words this OSD is now useless yet a nuisance to leave around, so the plan is to remove it from the cluster completely, zap its disk, and then re-add it.
2. Removing the osd
- Mark the osd out of the cluster (run on the admin node)
```
# ceph osd out 0
```
(after this, the REWEIGHT value of osd.0 shows as 0 in `ceph osd tree`)
- Stop the osd service (run on the node that hosts the osd)
```
# systemctl stop ceph-osd@0
```
(after this, the osd's state shows as down in `ceph osd tree`)
Since osd.0 is already out and its daemon is already down, steps 1 and 2 are skipped here. (The whole removal sequence is also collected into a short script sketch after the zap step below.)
- Remove the osd from the CRUSH map
```
# ceph osd crush remove osd.0
```
- Delete the osd's authentication key
```
# ceph auth del osd.0
```
- Remove osd
```
# ceph osd rm 0
```
- Unmount the osd's data directory
```
# df -h | grep ceph-0
/dev/sdc1       3.7T  265G  3.4T   8% /var/lib/ceph/osd/ceph-0
# umount /var/lib/ceph/osd/ceph-0
umount: /var/lib/ceph/osd/ceph-0: target is busy.
        (In some cases useful info about processes that use
         the device is found by lsof(8) or fuser(1))
```
The unmount fails because the mount point is busy; use fuser to see what is holding it
```
# fuser -m -v /var/lib/ceph/osd/ceph-0
                     USER        PID ACCESS COMMAND
/var/lib/ceph/osd/ceph-0:
                     root     kernel mount /var/lib/ceph/osd/ceph-0
                     root     212444 ..c.. bash
```
Kill the bash process that is holding the mount point
```
# kill -9 212444
```
Or let fuser kill it for you:
```
# fuser -m -v -i -k /var/lib/ceph/osd/ceph-0
```
Unmount again
```
# umount /var/lib/ceph/osd/ceph-0
```
This time the unmount succeeds.
- Zap the disk. From the `df -h` output above you can already see that ceph-0 corresponds to /dev/sdc (you can also confirm this with `ceph-disk list`)
```
# ceph-disk zap /dev/sdc
Caution: invalid backup GPT header, but valid main header; regenerating
backup header from main header.

****************************************************************************
Caution: Found protective or hybrid MBR and corrupt GPT. Using GPT, but disk
verification and recovery are STRONGLY recommended.
****************************************************************************

GPT data structures destroyed! You may now partition the disk using fdisk or
other utilities.
Creating new GPT entries.
The operation has completed successfully.
```
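For reference, the removal steps walked through above can be collected into a small shell sketch. It assumes the osd id (0) and data disk (/dev/sdc) from this post; run the ceph commands on the admin node and the systemctl/umount/zap commands on the host that owns the osd:

```
#!/usr/bin/env bash
# Hedged sketch of the removal procedure above; adjust OSD_ID/DEV for your cluster.
set -euo pipefail

OSD_ID=0
DEV=/dev/sdc

ceph osd out "${OSD_ID}"                    # 1. mark the osd out (data starts migrating)
systemctl stop "ceph-osd@${OSD_ID}"         # 2. stop the daemon (on the osd host)
ceph osd crush remove "osd.${OSD_ID}"       # 3. remove it from the CRUSH map
ceph auth del "osd.${OSD_ID}"               # 4. delete its cephx key
ceph osd rm "${OSD_ID}"                     # 5. remove it from the osd map
umount "/var/lib/ceph/osd/ceph-${OSD_ID}"   # 6. unmount the data directory (kill holders first)
ceph-disk zap "${DEV}"                      # 7. wipe the disk
```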
Check the ceph status at this point
```
# ceph -s
    cluster e6ccdfaa-a729-4638-bcde-e539b1e7a28d
     health HEALTH_WARN
            170 pgs backfill_wait
            10 pgs backfilling
            362 pgs degraded
            362 pgs recovery_wait
            436 pgs stuck unclean
            recovery 5774/2136302 objects degraded (0.270%)
            recovery 342126/2136302 objects misplaced (16.015%)
     monmap e1: 3 mons at {bdc2=172.16.251.2:6789/0,bdc3=172.16.251.3:6789/0,bdc4=172.16.251.4:6789/0}
            election epoch 82, quorum 0,1,2 bdc2,bdc3,bdc4
     osdmap e3142: 26 osds: 26 up, 26 in; 180 remapped pgs
            flags sortbitwise
      pgmap v13264634: 4096 pgs, 4 pools, 2558 GB data, 639 kobjects
            7651 GB used, 89029 GB / 96680 GB avail
            5774/2136302 objects degraded (0.270%)
            342126/2136302 objects misplaced (16.015%)
                3554 active+clean
                 362 active+recovery_wait+degraded
                 170 active+remapped+wait_backfill
                  10 active+remapped+backfilling
recovery io 354 MB/s, 89 objects/s
  client io 1970 kB/s wr, 0 op/s rd, 88 op/s wr
```
Wait for the recovery to finish and the cluster to return to the OK state before adding the new osd. (I don't actually know why it has to be back to OK before re-adding; it is simply the practice followed here.)
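A convenience sketch for the wait itself; watching `ceph -w` works just as well:

```
# poll until the cluster reports HEALTH_OK again
while ! ceph health | grep -q HEALTH_OK; do
    sleep 30
done
```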
3. Adding the osd
The disk behind the removed osd was already zapped above, so adding it back is straightforward: just create it again with the ceph-deploy tool.
```
# ceph-deploy --overwrite-conf osd create bdc2:/dev/sdc
```
When the command finishes you can see that the osd has been added back, again with id 0.
```
# df -h | grep ceph-0
/dev/sdc1       3.7T   74M  3.7T   1% /var/lib/ceph/osd/ceph-0
```
It is brand new, so it has barely used any space yet.
```
# ceph-disk list | grep osd
 /dev/sdc1 ceph data, active, cluster ceph, osd.0, journal /dev/sdc2
 /dev/sdd1 ceph data, active, cluster ceph, osd.1, journal /dev/sdd2
 /dev/sde1 ceph data, active, cluster ceph, osd.2, journal /dev/sde2
 /dev/sdf1 ceph data, active, cluster ceph, osd.3, journal /dev/sdf2
```
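To double-check that the re-created osd landed where expected (the id 0 and /dev/sdc below are taken from this post):

```
ceph osd tree | grep 'osd\.0 '    # should now show osd.0 as up, with its weight restored
ceph-disk list | grep 'osd\.0'    # should map osd.0 back onto /dev/sdc1
```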
Check the ceph status again: the cluster is back to 27 osds; wait for it to return to the OK state.
```
# ceph -s
    cluster e6ccdfaa-a729-4638-bcde-e539b1e7a28d
     health HEALTH_WARN
            184 pgs backfill_wait
            6 pgs backfilling
            374 pgs degraded
            374 pgs recovery_wait
            83 pgs stuck unclean
            recovery 4605/2114056 objects degraded (0.218%)
            recovery 298454/2114056 objects misplaced (14.118%)
     monmap e1: 3 mons at {bdc2=172.16.251.2:6789/0,bdc3=172.16.251.3:6789/0,bdc4=172.16.251.4:6789/0}
            election epoch 82, quorum 0,1,2 bdc2,bdc3,bdc4
     osdmap e3501: 27 osds: 27 up, 27 in; 190 remapped pgs
            flags sortbitwise
      pgmap v13275552: 4096 pgs, 4 pools, 2558 GB data, 639 kobjects
            7647 GB used, 92751 GB / 100398 GB avail
            4605/2114056 objects degraded (0.218%)
            298454/2114056 objects misplaced (14.118%)
                3532 active+clean
                 374 active+recovery_wait+degraded
                 184 active+remapped+wait_backfill
                   6 active+remapped+backfilling
recovery io 264 MB/s, 67 objects/s
  client io 1737 kB/s rd, 63113 kB/s wr, 60 op/s rd, 161 op/s wr
```

Reference: [http://www.cnblogs.com/sammyliu/p/5555218.html](http://www.cnblogs.com/sammyliu/p/5555218.html)