ceph cluster construction
ceph
- Ceph is often called future-oriented storage.
- Chinese-language documentation:
- https://access.redhat.com/documentation/zh-cn/red_hat_ceph_storage/5/html/architecture_guide/index
- http://docs.ceph.org.cn/
- Storage types that Ceph can provide (see the sketch after this list):
- Block storage: presents users with a device that behaves like an ordinary hard disk
- File system storage: similar to an NFS share; provides users with shared folders
- Object storage: like Baidu Cloud Disk; it requires a dedicated client
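- For orientation, a minimal sketch of how a client typically consumes each storage type; the pool, image, monitor address, secret and mount points here are assumptions, and object storage additionally requires an RGW user plus an S3 client such as s3cmd:

```shell
# Block storage (RBD): map an image as a local disk, then format and mount it
rbd map mypool/myimage            # typically appears as /dev/rbd0
mkfs.xfs /dev/rbd0 && mount /dev/rbd0 /mnt

# File system storage (CephFS): mount it like an NFS share (requires an MDS)
mount -t ceph 172.18.10.11:6789:/ /mnt/cephfs -o name=admin,secret=<key>

# Object storage (RGW): access it over HTTP with an S3-compatible client
s3cmd ls s3://mybucket
```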
- Ceph is a distributed storage system and is very flexible: when capacity needs to be expanded, simply add more servers to the cluster.
- Ceph stores data as multiple replicas. In a production environment, at least three copies of each piece of data should be kept; Ceph also defaults to three replicas.
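- The replica count is a per-pool setting that can be checked or changed at any time; a small sketch (rbd is the default pool used later, and 3 is already the default value):

```shell
# Query the replica count of a pool
ceph osd pool get rbd size
# Set it explicitly to three replicas
ceph osd pool set rbd size 3
```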
Components of Ceph
- Ceph OSD daemon: Ceph OSDs store the data. In addition, each Ceph OSD uses the CPU, memory and network of its node to perform data replication, erasure coding, rebalancing, recovery, monitoring and reporting. If a storage node has several hard disks used for storage, it runs one OSD process per disk.
- Ceph Mon monitor: Ceph Mon maintains the master copy of the Ceph storage cluster maps and the current state of the cluster. Monitors require a high degree of consistency so that they agree on the cluster state. They maintain the maps that describe the cluster state, including the monitor map, the OSD map, the placement group (PG) map, and the CRUSH map.
- MDS: the Ceph metadata server (MDS) stores metadata for the Ceph file system (CephFS).
- RGW: the object storage gateway. It mainly provides an HTTP API (S3- and Swift-compatible) through which applications access Ceph object storage.
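- Once the cluster below is running, each of these daemons appears as a systemd service on its node. A quick way to see what a given node is running (a sketch; the unit-name patterns assume the standard ceph-mon@/ceph-osd@ services):

```shell
# List the Ceph monitor and OSD daemons running on the local node
systemctl status 'ceph-mon@*' 'ceph-osd@*' --no-pager
# Cluster-wide summary of monitors, OSDs and placement groups
ceph -s
```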
Build ceph cluster
- Node preparation
Host name | IP address
---|---
node1 | 172.18.10.11/24
node2 | 172.18.10.12/24
node3 | 172.18.10.13/24
client1 | 172.18.10.10/24
- Shut down node1-node3 and add two 20 GB hard disks to each
```shell
# Check the added hard disks and note their names
[root@node1 ~]# lsblk
NAME            MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda               8:0    0  20G  0 disk
├─sda1            8:1    0   1G  0 part /boot
└─sda2            8:2    0  19G  0 part
  ├─centos-root 253:0    0  17G  0 lvm  /
  └─centos-swap 253:1    0   2G  0 lvm  [SWAP]
sdb             253:16   0  20G  0 disk
sdc             253:32   0  20G  0 disk
```
- On node1, mount the CentOS ISO and set up the vsftpd service
```shell
mount /dev/cdrom /media
cd /media/Packages
rpm -ivh vsftpd*.rpm
# Start the ftp service
systemctl enable vsftpd --now
cd ~            # leave /media before unmounting it
umount /media
mkdir /var/ftp/centos
cat >> /etc/fstab <<eof
/dev/cdrom /var/ftp/centos iso9660 loop 0 0
eof
# Mount the ISO at /var/ftp/centos according to the new fstab entry
mount -a
# Upload the Ceph repo archive to node1, then extract it into /var/ftp
tar xvf ceph-repo.tar.gz -C /var/ftp/
# Configure the local yum repos
cd /etc/yum.repos.d/
rm -rf *
cat >> ftp.repo <<eof
[centos]
name=centos
baseurl=ftp://172.18.10.11/centos
enabled=1
gpgcheck=0
[ceph-repo]
name=ceph-repo
baseurl=ftp://172.18.10.11/ceph-repo
enabled=1
gpgcheck=0
eof
# Check whether the repos are set up correctly
yum makecache
```
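A quick way to confirm that both repos resolve on node1 (a sketch; the package counts in the output will vary):

```shell
# List the enabled repos and their package counts
yum repolist
# Optionally confirm the FTP share is reachable and browsable
curl -s ftp://172.18.10.11/ | head
```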
- Configure the local yum repo on the other nodes
```shell
# Operate on node1
# Configure the hosts file
cat >> /etc/hosts <<eof
172.18.10.10 client
172.18.10.11 node1
172.18.10.12 node2
172.18.10.13 node3
eof
# Set up passwordless SSH login among the four nodes.
# ceph-deploy lets one node operate on all nodes uniformly. node1 is used as the
# deployment node, so all later operations run on node1, and node1 must be able to
# reach the other hosts over SSH without a password.
ssh-keygen
for i in client node{1..3}
do
    ssh-copy-id $i
done
# Answer yes and enter the password when prompted
# Copy node1's local yum repo file to the other three nodes
cd /etc/yum.repos.d/
for i in client node{2..3}
do
    scp ftp.repo $i:$PWD
done
# If no password is requested here, the passwordless setup in the previous step succeeded
```
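Optionally, one loop can confirm both the passwordless login and the copied repo file on every node (a sketch that reuses the host aliases configured above):

```shell
# Each host should print its hostname and the repo summary without asking for a password
for i in client node{1..3}
do
    ssh $i "hostname; yum repolist | tail -n 3"
done
```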
- SELinux and the firewall must be turned off on every node, for example as shown below.
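- A minimal sketch of how to do that, assuming CentOS 7 with firewalld (run on every node):

```shell
# Stop and disable the firewall
systemctl disable firewalld --now
# Put SELinux into permissive mode now and disable it permanently across reboots
setenforce 0
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
```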
- Preparations before cluster installation
- Install the cluster

```shell
# Install the Ceph packages on the 3 nodes
[root@node1 ~]# for i in node{1..3}
> do
>     ssh $i yum install -y ceph-mon ceph-osd ceph-mds ceph-radosgw
> done

# Configure client1 as the NTP server
[root@client1 ~]# yum install -y chrony
[root@client1 ~]# vim /etc/chrony.conf
 29 allow 172.18.10.0/24      # Allow 172.18.10.0/24 to synchronize its clock from this host
 33 local stratum 10          # Serve time to other hosts even if not synchronized to an upstream source
[root@client1 ~]# systemctl restart chronyd

# Configure node1-3 as NTP clients of client1
[root@node1 ~]# for i in node{1..3}
> do
>     ssh $i yum install -y chrony
> done
[root@node1 ~]# vim /etc/chrony.conf   # Change line 7 only
  7 server 172.18.10.10 iburst         # Point to client1 as the time source
[root@node1 ~]# for i in node{2..3}
> do
>     scp /etc/chrony.conf $i:/etc/
> done
[root@node1 ~]# for i in node{1..3}
> do
>     ssh $i systemctl restart chronyd
> done

# Verify time synchronization. A ^* in front of client1 means synchronization succeeded
[root@node1 ~]# chronyc sources -v
... ...
^* client1      10   6    17    40   -4385ns[-1241us] +/-  162us

# Install the ceph-deploy deployment tool on node1
[root@node1 ~]# yum install -y ceph-deploy
# View usage help
[root@node1 ~]# ceph-deploy --help
[root@node1 ~]# ceph-deploy mon --help     # View help for the mon subcommand

# Create the ceph-deploy working directory
[root@node1 ~]# mkdir ceph-cluster
[root@node1 ~]# cd ceph-cluster

# Create a new cluster
[root@node1 ceph-cluster]# ceph-deploy new node{1..3}
[root@node1 ceph-cluster]# ls
ceph.conf  ceph-deploy-ceph.log  ceph.mon.keyring
[root@node1 ceph-cluster]# tree .
.
├── ceph.conf              # Cluster configuration file
├── ceph-deploy-ceph.log   # Log file
└── ceph.mon.keyring       # Shared key

# Enable the COW (Copy On Write) layered snapshot feature
[root@node1 ceph-cluster]# vim ceph.conf   # Add the following line at the end
rbd_default_features = 1

# Initialize the monitors
[root@node1 ceph-cluster]# ceph-deploy mon create-initial
[root@node1 ceph-cluster]# systemctl status ceph-mon*
● ceph-mon@node1.service .. ..
[root@node2 ~]# systemctl status ceph*
● ceph-mon@node2.service ... ...
[root@node3 ~]# systemctl status ceph*
● ceph-mon@node3.service ... ...
# Note: these services can only be restarted three times within 30 minutes; beyond that an error is reported.

# View cluster status
[root@node1 ceph-cluster]# ceph -s
    health HEALTH_ERR      # Because there are no OSDs yet, the status is HEALTH_ERR

# Create OSDs
[root@node1 ceph-cluster]# ceph-deploy disk --help
# Zap (initialize) the data disks of each host; on VMware they should be sdb and sdc
[root@node1 ceph-cluster]# ceph-deploy disk zap node1:sdb node1:sdc
[root@node1 ceph-cluster]# ceph-deploy disk zap node2:sdb node2:sdc
[root@node1 ceph-cluster]# ceph-deploy disk zap node3:sdb node3:sdc
# Create the storage space. ceph-deploy splits each data disk into two partitions:
# one partition is 5 GB and holds Ceph's internal journal; the other takes all the
# remaining space and stores the data
[root@node1 ceph-cluster]# ceph-deploy osd --help
[root@node1 ceph-cluster]# ceph-deploy osd create node1:sd{b,c}
[root@node1 ceph-cluster]# lsblk
NAME            MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda               8:0    0  20G  0 disk
├─sda1            8:1    0   1G  0 part /boot
└─sda2            8:2    0  19G  0 part
  ├─centos-root 253:0    0  17G  0 lvm  /
  └─centos-swap 253:1    0   2G  0 lvm  [SWAP]
sdb               8:16   0  20G  0 disk
├─sdb1            8:17   0  15G  0 part /var/lib/ceph/osd/ceph-0
└─sdb2            8:18   0   5G  0 part
sdc               8:32   0  20G  0 disk
├─sdc1            8:33   0  15G  0 part /var/lib/ceph/osd/ceph-1
└─sdc2            8:34   0   5G  0 part
sr0              11:0    1 8.8G  0 rom  /media
sr1              11:1    1 284M  0 rom  /ceph
# There are two osd processes because two disks are used for Ceph
[root@node1 ceph-cluster]# systemctl status ceph-osd*
# Continue to create the OSDs of the other nodes
[root@node1 ceph-cluster]# ceph-deploy osd create node2:sd{b,c}
[root@node1 ceph-cluster]# ceph-deploy osd create node3:sd{b,c}
# View cluster status
[root@node1 ceph-cluster]# ceph -s
    health HEALTH_OK       # HEALTH_OK means everything is fine
```
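Beyond `ceph -s`, it is worth confirming that all six OSDs are up and assigned to the expected hosts; this check is not part of the transcript above, but uses standard Ceph commands:

```shell
# List OSDs grouped by host; all six should show "up"
ceph osd tree
# Show details if the cluster is not healthy
ceph health detail
```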
Implement block storage
- Block devices transfer data in blocks, so many bytes can be read or written at once; character devices can only handle data as a stream of characters.
```shell
[root@node1 ceph-cluster]# ll /dev/sda
brw-rw---- 1 root disk 253, 0 11 April 10:15 /dev/sda
# b stands for block device
[root@node1 ceph-cluster]# ll /dev/tty
crw-rw-rw- 1 root tty 5, 0 11 April 10:54 /dev/tty
# c stands for character device
```
- Block storage provides devices that behave like hard disks. The first time a node attaches a block device, the device must be partitioned (if desired), formatted and then mounted before it can be used, as sketched below.
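- A minimal sketch of that first-time workflow on a hypothetical new device /dev/sdX (device name and mount point are assumptions; the client section below performs the same format-and-mount steps on /dev/rbd0):

```shell
# Create one partition spanning the whole disk (partitioning is optional)
parted -s /dev/sdX mklabel gpt mkpart primary 1MiB 100%
# Format the partition and mount it
mkfs.xfs /dev/sdX1
mkdir -p /data
mount /dev/sdX1 /data
```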
- Ceph needs storage pools in order to provide storage. To offer storage resources to clients, a container called a storage pool must first exist. A storage pool is similar to a volume group in logical volume management: a volume group gathers many hard disks and partitions, while a storage pool gathers the disks (OSDs) of all nodes.
```shell
# By default, ceph has a storage pool named rbd, whose ID is 0
[root@node1 ceph-cluster]# ceph osd lspools
0 rbd,
# View storage pool usage
[root@node1 ceph-cluster]# ceph df
GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED
    92093M     91889M         203M          0.22
POOLS:
    NAME     ID     USED     %USED     MAX AVAIL     OBJECTS
    rbd      0        16         0        30629M           3
# View the number of replicas kept when the rbd pool stores data
[root@node1 ceph-cluster]# ceph osd pool get rbd size
size: 3
# In the default pool, create an image named demo-image with a size of 10G to provide to clients
# An image is the equivalent of an LV in logical volume management
[root@node1 ceph-cluster]# rbd create demo-image --size 10G
# List the images in the default pool
[root@node1 ceph-cluster]# rbd list
demo-image
# View the details of demo-image
[root@node1 ceph-cluster]# rbd info demo-image
rbd image 'demo-image':
        size 10240 MB in 2560 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.1035238e1f29
        format: 2
        features: layering
        flags:
# Expand capacity
[root@node1 ceph-cluster]# rbd resize --size 15G demo-image
Resizing image: 100% complete...done.
[root@node1 ceph-cluster]# rbd info demo-image
rbd image 'demo-image':
        size 15360 MB in 3840 objects
# Shrink
[root@node1 ceph-cluster]# rbd resize --size 7G demo-image --allow-shrink
[root@node1 ceph-cluster]# rbd info demo-image
rbd image 'demo-image':
        size 7168 MB in 1792 objects
```
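Besides the default rbd pool, additional pools can be created as needed; a sketch (the pool name mypool and the placement-group count of 128 are assumptions chosen for a small cluster):

```shell
# Create a pool with 128 placement groups, then check its replica count
ceph osd pool create mypool 128
ceph osd pool get mypool size
```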
Client uses block device
- How does the client use it? By installing the client software.
- Where is the Ceph cluster? The cluster address is specified through the configuration file.
- What permissions does the client have? The keyring file provides the credentials.
```shell
# Install the Ceph client software
[root@client1 ~]# yum install -y ceph-common
# Copy the configuration file and the keyring file to the client
[root@node1 ceph-cluster]# scp /etc/ceph/ceph.conf 172.18.10.10:/etc/ceph/
[root@node1 ceph-cluster]# scp /etc/ceph/ceph.client.admin.keyring 172.18.10.10:/etc/ceph/
# View the images from the client
[root@client1 ~]# rbd list
demo-image
# Map the image provided by Ceph to a local device
[root@client1 ~]# rbd map demo-image
/dev/rbd0
[root@client1 ~]# lsblk
NAME            MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda               8:0    0  20G  0 disk
├─sda1            8:1    0   1G  0 part /boot
└─sda2            8:2    0  19G  0 part
  ├─centos-root 253:0    0  17G  0 lvm  /
  └─centos-swap 253:1    0   2G  0 lvm  [SWAP]
rbd0            252:0    0   7G  0 disk      # An extra 7 GB disk appears
[root@client1 ~]# ls /dev/rbd0
/dev/rbd0
# View the mapping
[root@client1 ~]# rbd showmapped
id pool image      snap device
0  rbd  demo-image -    /dev/rbd0
# Use it
[root@client1 ~]# mkfs.xfs /dev/rbd0
[root@client1 ~]# mount /dev/rbd0 /mnt/
[root@client1 ~]# df -h /mnt/
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/rbd0   7.0G   33M   7.0G    1%  /mnt
```
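When the client no longer needs the device, detach it cleanly before removing the image on the cluster side (`rbd unmap` and `rbd rm` are standard rbd subcommands):

```shell
# On the client: unmount the filesystem and unmap the RBD device
umount /mnt
rbd unmap /dev/rbd0
# On a cluster node, only if the image is no longer needed: delete it
rbd rm demo-image
```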