ceph cluster construction


  • Ceph is often described as future-oriented storage

  • Chinese-language manuals:

    • https://access.redhat.com/documentation/zh-cn/red_hat_ceph_storage/5/html/architecture_guide/index
    • http://docs.ceph.org.cn/
  • Storage types that Ceph provides:

    • Block storage: provides users with a device that behaves like an ordinary hard disk
    • File system storage: provides users with shared directories, similar to an NFS share
    • Object storage: similar to a cloud drive such as Baidu cloud disk; a separate client is required
  • Ceph is a distributed storage system and is very flexible. To expand capacity, simply add servers to the cluster.

  • Ceph stores data as multiple replicas. In production, keep at least three replicas of each piece of data; Ceph also defaults to three replicas.

Components of Ceph

  • Ceph OSD daemon: A Ceph OSD stores data. It also uses the CPU, memory, and network of its Ceph node to perform data replication, erasure coding, rebalancing, recovery, monitoring, and reporting. A storage node runs one OSD process for each hard disk it contributes to storage.
  • Ceph Mon (monitor): A Ceph monitor maintains the master copy of the cluster map and the current state of the Ceph storage cluster. Monitors require strong consistency so that they agree on the cluster state. They maintain the maps that describe the cluster, including the monitor map, OSD map, placement group (PG) map, and CRUSH map.
  • MDS: the Ceph metadata server (MDS) stores metadata for the Ceph file system.
  • RGW: the object storage gateway. It mainly provides an API through which software accesses Ceph.

Build ceph cluster

  • Node preparation
host name    IP address
  • Shut down node1 through node3 and add two 20 GB hard disks to each of them
# Check the added hard disk and pay attention to the name of the hard disk
[root@node1 ~]# lsblk 
sda               8:0    0   20G  0 disk
├─sda1            8:1    0    1G  0 part /boot
└─sda2            8:2    0   19G  0 part
  ├─centos-root 253:0    0   17G  0 lvm  /
  └─centos-swap 253:1    0    2G  0 lvm  [SWAP]
sdb    253:16   0  20G  0 disk 
sdc    253:32   0  20G  0 disk 
  • On node1, mount the CentOS image and set up the vsftpd service
mount /dev/cdrom /media 
cd /media/Packages
rpm -ivh vsftpd*.rpm

# Start ftp service
systemctl enable vsftpd --now

umount /media
mkdir /var/ftp/centos
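cat >> /etc/fstab<<eof
/dev/cdrom /var/ftp/centos iso9660 loop 0 0
eof
# Mount the image at /var/ftp/centos as listed in /etc/fstab
mount -a
# Extract the Ceph repository archive (uploaded to root's home directory) into /var/ftp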
cd ~
tar xvf ceph-repo.tar.gz -C /var/ftp/

# Configure yum local source
cd /etc/yum.repos.d/
rm -rf *
cat >> ftp.repo<<eof
# Check whether the setup is successful
yum makecache
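
A yum repository file for an FTP source is a short INI-style fragment. The sketch below is only an illustration: the section name, the `baseurl` host name `node1` (which assumes that name resolves), and `gpgcheck=0` are assumptions, and the file is written to a scratch location rather than to /etc/yum.repos.d/. Since vsftpd serves anonymous users from /var/ftp, the directory /var/ftp/centos is reachable as `/centos` in the URL.

```shell
# Sketch only: section name, host name and options are assumptions
repo_file=$(mktemp)
cat > "$repo_file" <<eof
[centos]
name=centos
baseurl=ftp://node1/centos
enabled=1
gpgcheck=0
eof

# Show the URL that yum would fetch packages from
grep '^baseurl=' "$repo_file"
```

The Ceph packages extracted under /var/ftp would need their own sections of the same shape, each with a `baseurl` pointing at the directory that holds the repository metadata.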
  • Configure yum local source on other nodes
# Run the following on node1
# Configure hosts file
cat >> /etc/hosts<<eof client node1 node2 node3
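
Each /etc/hosts entry pairs an IP address with a host name, one host per line, and the heredoc ends with a line containing only `eof`. A minimal sketch of that layout, assuming placeholder addresses from the 192.0.2.0/24 documentation range (not the real lab addresses) and writing to a scratch file instead of /etc/hosts:

```shell
# Sketch only: the 192.0.2.x addresses are placeholders
hosts_file=$(mktemp)
cat >> "$hosts_file" <<eof
192.0.2.10 client
192.0.2.11 node1
192.0.2.12 node2
192.0.2.13 node3
eof

# Count the entries that were appended
hosts_entries=$(grep -c . "$hosts_file")
echo "$hosts_entries entries written to $hosts_file"
```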

# Set up passwordless SSH login from node1 to all four hosts
# Ceph provides the ceph-deploy tool, which can operate on all nodes from a single node
# node1 is the deployment node, and all later operations are carried out on it, so node1 must be able to log in to the other hosts without a password
# Generate a key pair first if node1 does not have one yet
ssh-keygen -f ~/.ssh/id_rsa -N ''
for i in client node{1..3}
do
    ssh-copy-id $i
done
# Answer yes and enter the password for each host

# Copy node1's local yum repository file to the other three hosts
cd /etc/yum.repos.d/
for i in client node{2..3}
do
    scp ftp.repo $i:$PWD
done

# If no password prompt appears here, the passwordless login set up in the previous step works

  • SELinux and the firewall must be turned off on every node
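
One common way to do this on CentOS 7 is sketched below, assuming the firewall in use is firewalld. The `sed` edit is demonstrated on a scratch copy so the snippet is safe to run anywhere; the commands that change the live system are left as comments.

```shell
# Scratch copy standing in for /etc/selinux/config (assumed to contain SELINUX=enforcing)
selinux_cfg=$(mktemp)
echo "SELINUX=enforcing" > "$selinux_cfg"

# Make the change permanent by rewriting the SELINUX= line
sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$selinux_cfg"
cat "$selinux_cfg"

# On a real node (as root) the equivalent steps would be:
#   setenforce 0                                                   # permissive until the next reboot
#   sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config   # disabled after the next reboot
#   systemctl disable --now firewalld                              # stop and disable the firewall
```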

  • Preparations before cluster installation

  • Install cluster

# Install packages on 3 nodes
[root@node1 ~]# for i in node{1..3}
> do
> ssh $i yum install -y ceph-mon ceph-osd ceph-mds ceph-radosgw
> done

# Configure client1 as ntp server
[root@client1 ~]# yum install -y chrony
[root@client1 ~]# vim /etc/chrony.conf 
 29 allow    # Allow clients to synchronize their clocks from this server
 33 local stratum 10   # Keep serving time to other hosts even when this host is not synchronized to any source
[root@client1 ~]# systemctl restart chronyd

# Configure node1-3 to become the NTP client of client1
[root@node1 ~]# for i in node{1..3}
> do
> ssh $i yum install -y chrony
> done
[root@node1 ~]# vim /etc/chrony.conf  # Change line 7 only
  7 server iburst   # Replace the default server with client1's address, written between "server" and "iburst"
[root@node1 ~]# for i in node{2..3}
> do
> scp /etc/chrony.conf $i:/etc/
> done
[root@node1 ~]# for i in node{1..3}
> do
> ssh $i systemctl restart chronyd
> done

# Verify that the time is synchronized. A ^* in front of client1 indicates that synchronization succeeded
[root@node1 ~]# chronyc sources -v
... ...
^* client1                      10   6    17    40  -4385ns[-1241us] +/-  162us
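
The check can also be scripted. The sketch below tests whether a source line starts with the `^*` marker; it runs against a sample line copied from the output above, and the variable names are illustrative only. On a live node the line would come from `chronyc sources` instead.

```shell
# Sample line copied from the chronyc output; not read from a live system
sample='^* client1                      10   6    17    40  -4385ns[-1241us] +/-  162us'

# A source line starting with "^*" is the source chronyd is currently synchronized to
case "$sample" in
    '^*'*) sync_state="synchronized" ;;
    *)     sync_state="not synchronized" ;;
esac
echo "$sync_state"
```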

# Install the ceph-deploy deployment tool on node1
[root@node1 ~]# yum install -y ceph-deploy
# View usage help
[root@node1 ~]# ceph-deploy --help
[root@node1 ~]# ceph-deploy mon --help   # View help for mon subcommand

# Create the ceph-deploy working directory
[root@node1 ~]# mkdir ceph-cluster
[root@node1 ~]# cd ceph-cluster

# Create a new cluster.
[root@node1 ceph-cluster]# ceph-deploy new node{1..3}
[root@node1 ceph-cluster]# ls
ceph.conf  ceph-deploy-ceph.log  ceph.mon.keyring
[root@node1 ceph-cluster]# tree .
├── ceph.conf               # Cluster configuration file
├── ceph-deploy-ceph.log    # log file
└── ceph.mon.keyring        # Shared key

# Enable the COW layered snapshot function. COW: Copy On Write
[root@node1 ceph-cluster]# vim ceph.conf   # Add a line at the end as follows
rbd_default_features = 1

# Initialize monitor
[root@node1 ceph-cluster]# ceph-deploy mon create-initial
[root@node1 ceph-cluster]# systemctl status ceph-mon*
● ceph-mon@node1.service .. ..
[root@node2 ~]# systemctl status ceph*
● ceph-mon@node2.service ... ...
[root@node3 ~]# systemctl status ceph*
● ceph-mon@node3.service ... ...
# Note: these services can be started at most three times within 30 minutes; exceeding that limit produces an error.

# View cluster status
[root@node1 ceph-cluster]# ceph -s
     health HEALTH_ERR   # HEALTH_ERR is expected because no OSDs (disks) have been added yet

# Create OSD
[root@node1 ceph-cluster]# ceph-deploy disk --help
# Initialize the hard disks of each host. On VMware they should appear as sdb and sdc
[root@node1 ceph-cluster]# ceph-deploy disk zap node1:sdb node1:sdc 
[root@node1 ceph-cluster]# ceph-deploy disk zap node2:sdb node2:sdc 
[root@node1 ceph-cluster]# ceph-deploy disk zap node3:sdb node3:sdc 

# Create the storage space. Each Ceph disk is split into two partitions: a 5 GB partition that holds the Ceph journal, and a data partition that takes all the remaining space
[root@node1 ceph-cluster]# ceph-deploy osd --help
[root@node1 ceph-cluster]# ceph-deploy osd create node1:sd{b,c}
[root@node1 Packages]# lsblk
sda               8:0    0   20G  0 disk
├─sda1            8:1    0    1G  0 part /boot
└─sda2            8:2    0   19G  0 part
  ├─centos-root 253:0    0   17G  0 lvm  /
  └─centos-swap 253:1    0    2G  0 lvm  [SWAP]
sdb               8:16   0   20G  0 disk
├─sdb1            8:17   0   15G  0 part /var/lib/ceph/osd/ceph-0
└─sdb2            8:18   0    5G  0 part
sdc               8:32   0   20G  0 disk
├─sdc1            8:33   0   15G  0 part /var/lib/ceph/osd/ceph-1
└─sdc2            8:34   0    5G  0 part
sr0              11:0    1  8.8G  0 rom  /media
sr1              11:1    1  284M  0 rom  /ceph

# Two OSD processes are running because two hard disks were given to Ceph
[root@node1 ceph-cluster]# systemctl status ceph-osd*

# Continue to initialize the OSD of other nodes
[root@node1 ceph-cluster]# ceph-deploy osd create node2:sd{b,c}
[root@node1 ceph-cluster]# ceph-deploy osd create node3:sd{b,c}

# View cluster status
[root@node1 ceph-cluster]# ceph -s
     health HEALTH_OK     # HEALTH_OK means the cluster is healthy

Implement block storage

  • A block device transfers data in blocks and can be read or written at any position; a character device handles data only as a sequential stream of characters
[root@node1 ceph-cluster]# ll /dev/sda
brw-rw---- 1 root disk 253, 0 11 April 10:15 /dev/sda
# b stands for block, block device
[root@node1 ceph-cluster]# ll /dev/tty
crw-rw-rw- 1 root tty 5, 0 11 April 10:54 /dev/tty
# c stands for character, character device
  • Block storage provides devices that behave like hard disks. The first time a node attaches such a block device, the device must be partitioned, formatted, and then mounted before use.
  • Ceph needs a storage pool to provide storage. To offer storage resources to clients, create a container called a storage pool. A storage pool is similar to a volume group in logical volume management: a volume group contains many hard disks and partitions, and a storage pool contains the hard disks on each node.
# By default, Ceph has a storage pool named rbd, whose ID is 0
[root@node1 ceph-cluster]# ceph osd lspools 
0 rbd,

# View storage pool size
[root@node1 ceph-cluster]# ceph df
    SIZE       AVAIL      RAW USED     %RAW USED 
    92093M     91889M         203M          0.22 
    NAME     ID     USED     %USED     MAX AVAIL     OBJECTS 
    rbd      0        16         0        30629M           3 

# View the number of copies saved when the storage pool rbd stores data
[root@node1 ceph-cluster]# ceph osd pool get rbd size
size: 3
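
With three replicas, every byte written consumes three bytes of raw space, so the capacity a pool can actually offer is roughly the available raw space divided by the replica count. A quick arithmetic sketch using the figures from the `ceph df` output above (the variable names are illustrative only):

```shell
avail_raw_mb=91889   # AVAIL column of the ceph df output, in MB
replicas=3           # "size" of the rbd pool

# Integer division: usable space once every object is stored three times
usable_mb=$(( avail_raw_mb / replicas ))
echo "${usable_mb}M"
```

The result, 30629M, lines up with the MAX AVAIL value reported for the rbd pool.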

# In the default storage pool, create an image named demo-image with a size of 10G and provide it to clients
# An image is equivalent to an LV (logical volume) in logical volume management
[root@node1 ceph-cluster]# rbd create demo-image --size 10G
# List the images in the default storage pool
[root@node1 ceph-cluster]# rbd list
# View the details of demo-image
[root@node1 ceph-cluster]# rbd info demo-image
rbd image 'demo-image':
	size 10240 MB in 2560 objects
	order 22 (4096 kB objects)
	block_name_prefix: rbd_data.1035238e1f29
	format: 2
	features: layering

# Expand the image
[root@node1 ceph-cluster]# rbd resize --size 15G demo-image
Resizing image: 100% complete...done.
[root@node1 ceph-cluster]# rbd info demo-image
rbd image 'demo-image':
	size 15360 MB in 3840 objects

# Shrink the image
[root@node1 ceph-cluster]# rbd resize --size 7G demo-image --allow-shrink
[root@node1 ceph-cluster]# rbd info demo-image
rbd image 'demo-image':
	size 7168 MB in 1792 objects
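
The object counts reported by `rbd info` follow from the object size: `order 22` means each object is 2^22 bytes (4096 kB), so the count is the image size divided by 4 MB. A short sketch that reproduces the three figures shown above (variable names are illustrative only):

```shell
order=22                         # from "order 22" in the rbd info output
object_bytes=$(( 1 << order ))   # 4194304 bytes = 4096 kB per object

# Image sizes in MB after create, expand, and shrink
for size_mb in 10240 15360 7168; do
    objects=$(( size_mb * 1024 * 1024 / object_bytes ))
    echo "$size_mb MB -> $objects objects"
done
```

This prints 2560, 3840, and 1792 objects, matching the three `rbd info` outputs.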

Client uses block device

  • How is it used? Install the client software
  • Where is the Ceph cluster? The configuration file specifies the cluster address
  • What about permissions? They come from the keyring file
# Install ceph client software
[root@client1 ~]# yum install -y ceph-common

# Copy the configuration file and the keyring file to the client
[root@node1 ceph-cluster]# scp /etc/ceph/ceph.conf client:/etc/ceph/
[root@node1 ceph-cluster]# scp /etc/ceph/ceph.client.admin.keyring client:/etc/ceph/

# Client view image
[root@client1 ~]# rbd list

# Map the image provided by Ceph to a local block device
[root@client1 ~]# rbd map demo-image
[root@client ~]# lsblk
sda               8:0    0   20G  0 disk
├─sda1            8:1    0    1G  0 part /boot
└─sda2            8:2    0   19G  0 part
  ├─centos-root 253:0    0   17G  0 lvm  /
  └─centos-swap 253:1    0    2G  0 lvm  [SWAP]
rbd0   252:0    0   7G  0 disk    # An extra 7GB hard disk

[root@client1 ~]# ls /dev/rbd0 

# View the mappings
[root@client1 ~]# rbd showmapped
id pool image      snap device    
0  rbd  demo-image -    /dev/rbd0 

# Use the device: format it and mount it
[root@client1 ~]# mkfs.xfs /dev/rbd0
[root@client1 ~]# mount /dev/rbd0 /mnt/
[root@client1 ~]# df -h /mnt/
Filesystem      Size  Used Avail Use% Mounted on
/dev/rbd0       7.0G   33M  7.0G    1% /mnt

Keywords: Operation & Maintenance Ceph

Added by hex on Mon, 07 Feb 2022 20:31:21 +0200