Kubernetes data persistence


A pod is composed of containers, and when a container crashes or is stopped, its data is lost. This means we have to consider storage when building a Kubernetes cluster: storage volumes exist so that pods can persist data. There are many types of storage volume; four are commonly used: emptyDir, hostPath, NFS, and cloud storage (Ceph, GlusterFS, ...).

1, emptyDir

A volume of type emptyDir is created when the pod is assigned to a node. Kubernetes automatically allocates a directory on that node, so there is no need to specify a directory on the host. The directory starts out empty, and when the pod is removed from the node, the data in the emptyDir is deleted permanently. emptyDir volumes are mainly used as temporary directories for applications that do not need their data saved.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: emptydir
spec:
  replicas: 1
  selector:
    matchLabels:
      app: emptydir
  template:
    metadata:
      labels:
        app: emptydir
    spec:
      containers:
        - name: busybox
          image: busybox
          imagePullPolicy: IfNotPresent
          command: ["/bin/sh","-c","while true;do echo $HOSTNAME-`date` >> /tmp/log ;sleep 1;done"]
          volumeMounts:
            - mountPath: /tmp
              name: emptydir
        - name: busybox1
          image: busybox
          imagePullPolicy: IfNotPresent
          command: ["/bin/sh","-c","while true;do echo $HOSTNAME-`date` >> /tmp/log ;sleep 1;done"]
          volumeMounts:
            - mountPath: /tmp
              name: emptydir
      volumes:
        - name: emptydir
          emptyDir: {}


# Apply the resources
[root@k8s-master ~]# kubectl apply -f emptydir.yaml

# View the pods' status
[root@k8s-master ~]# kubectl get pods

# Enter the first container and watch the log
[root@k8s-master ~]# kubectl exec -it emptydir-7ddfb976cc-b9j4x -- sh
/ # tail -f /tmp/log

# Enter the other container in the same pod and check that it sees the same data (with replicas set to 1 there is only one pod; select the second container with -c)
[root@k8s-master ~]# kubectl exec -it emptydir-7ddfb976cc-b9j4x -c busybox1 -- sh
/ # tail -f /tmp/log

Summary:

  • A temporary storage volume that is created and deleted together with the pod; when the pod goes, all the data in it is destroyed.
  • Very lightweight; mainly used for temporary data or for file sharing between containers in the same pod.
  • In essence it just provides an empty directory (a tmpfs-backed variant is sketched below).
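
By default that directory lives on the node's disk. Setting medium: Memory backs the volume with tmpfs instead (a minimal sketch with a hypothetical volume name; sizeLimit is optional):

      volumes:
        - name: cache        # hypothetical volume name
          emptyDir:
            medium: Memory   # back the volume with node RAM (tmpfs); contents are lost on reboot
            sizeLimit: 64Mi  # optional cap on how much the volume may use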

2, hostPath

The hostPath type maps a file or directory from the node's filesystem into the pod. When using a hostPath volume you can also set the type field; the supported types include DirectoryOrCreate, Directory, FileOrCreate, File, Socket, CharDevice and BlockDevice.
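
For illustration, a volumes fragment (separate from the full deployment below) showing the type field; DirectoryOrCreate tells the kubelet to create the directory on the node if it is missing:

      volumes:
        - name: hostpath
          hostPath:
            path: /data
            type: DirectoryOrCreate # use the directory if it exists, create it first if not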

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hostpath
spec:
  selector:
    matchLabels:
      app: hostpath
  template:
    metadata:
      labels:
        app: hostpath
    spec:
      containers:
        - name: mysql
          image: mysql:5.7
          imagePullPolicy: IfNotPresent
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: "123456"
          volumeMounts:
            - mountPath: /var/lib/mysql # Container directory
              name: hostpath
      volumes:
        - name: hostpath
          hostPath:
            path: /data # If the node directory does not exist it will be created automatically; if it already exists it should be empty beforehand

# Apply the resources
[root@k8s-master ~]# kubectl apply -f hostpath.yaml 
deployment.apps/hostpath created

# See which node the pod was scheduled to
[root@k8s-master ~]# kubectl get pods -o wide
NAME                      READY   STATUS    RESTARTS   AGE   IP             NODE        NOMINATED NODE   READINESS GATES
hostpath-f46c6679-45rqp   1/1     Running   0          23s   10.244.1.145   k8s-node2   <none>           <none>


# The MySQL data now exists in the /data directory on node 2
[root@k8s-node2 ~]# ls /data/
auto.cnf    ca.pem           client-key.pem  ibdata1      ib_logfile1  mysql               private_key.pem  server-cert.pem  sys
ca-key.pem  client-cert.pem  ib_buffer_pool  ib_logfile0  ibtmp1       performance_schema  public_key.pem   server-key.pem

Synchronizing container time with host time

To synchronize the time inside a container with the host's local time, you can mount /etc/localtime via hostPath. For example, start a pod running Tomcat and keep its time in sync:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: tomcat
spec:
  selector:
    matchLabels:
      app: tomcat

  template:
    metadata:
      labels:
        app: tomcat
    spec:
      imagePullSecrets:
        - name: myregistrykey
      containers:
        - name: tomcat
          image: harbor.bertwu.online/beijing/tomcat:latest
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - mountPath: /etc/localtime # Time file mounted into the container
              name: localtime
      volumes:
        - name: localtime
          hostPath:
            path: /etc/localtime # Local time file

[root@k8s-master ~]# kubectl apply -f tomcat.yaml 

[root@k8s-master ~]# kubectl get pods
NAME                      READY   STATUS    RESTARTS   AGE
tomcat-57bbf9778c-262mz   1/1     Running   0          6s

[root@k8s-master ~]# kubectl exec -it tomcat-57bbf9778c-262mz -- bash
# Check whether the time in the container is consistent with the local time
root@tomcat-57bbf9778c-262mz:/usr/local/tomcat# date
Sun Jan  2 13:11:58 CST 2022

3, NFS

NFS lets us mount an existing share into a pod. Unlike an emptyDir, which is deleted together with its pod, an NFS volume is merely unmounted when the pod is deleted and the data is preserved. This means data can be prepared in advance, handed from pod to pod, and an NFS share can be mounted by multiple pods for simultaneous reading and writing.
1. Build the NFS service, on the 10.0.0.32 host:

yum -y install nfs-utils rpcbind

systemctl enable nfs
systemctl restart nfs

groupadd www -g 666
useradd www -g www -u 666 -s /sbin/nologin

Create data directory

mkdir /k8sdata
chown -R www:www /k8sdata

Add authorization

[root@nfs ~]# cat /etc/exports
/k8sdata 10.0.0.0/16(rw,sync,no_root_squash,anonuid=666,anongid=666)

systemctl restart nfs
2. Install nfs-utils on all k8s nodes:
yum install nfs-utils rpcbind -y
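
# Optionally confirm from any node that the share is visible (showmount ships with nfs-utils; it should list /k8sdata)
[root@k8s-node1 ~]# showmount -e 10.0.0.32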

3. Create a pod using NFS

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs
spec:
  selector:
    matchLabels:
      app: nfs
  template:
    metadata:
      labels:
        app: nfs
    spec:
      containers:
        - name: mysql
          image: mysql:5.7
          imagePullPolicy: IfNotPresent
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: "123456"
          volumeMounts:
            - mountPath: /var/lib/mysql # Directory in the container that needs to be persisted
              name: nfs
      volumes:
        - name: nfs
          nfs:
            path: /k8sdata # Directory shared by the NFS server
            server: 10.0.0.32 # NFS server address

# Apply the resources
[root@k8s-master ~]# kubectl apply -f nfs.yaml 
deployment.apps/nfs created

# View resources
[root@k8s-master ~]# kubectl get pods
NAME                      READY   STATUS    RESTARTS   AGE
nfs-9cf648dcf-k4kdd       1/1     Running   0          4s
# Check on the NFS server that the data has been written

[root@nfs ~]# ls /k8sdata/
auto.cnf    ca.pem           client-key.pem  ibdata1      ib_logfile1  performance_schema  public_key.pem   server-key.pem
ca-key.pem  client-cert.pem  ib_buffer_pool  ib_logfile0  mysql        private_key.pem     server-cert.pem  sys

4, PV and PVC

A PersistentVolume (PV) is a piece of network storage in the cluster that has been provisioned by an administrator. It is a resource in the cluster, just as a node is a cluster resource. PVs are volume plugins like Volumes, but they have a lifecycle independent of any individual pod that uses the PV. This API object captures the implementation details of the storage, be that NFS, iSCSI, or a cloud-provider-specific storage system.

A PersistentVolumeClaim (PVC) is a user's request for storage. The usage logic: define a volume of type PVC in the pod, specifying the required size and access mode; the PVC must then be bound to a matching PV, and the PV in turn is created from real storage space. PV and PVC are the storage abstractions Kubernetes provides.

4.1 PV access modes

  • ReadWriteOnce (RWO): read-write, but the volume can be mounted by a single node only (if the underlying storage allows it, multiple nodes may still mount it).
  • ReadOnlyMany (ROX): read-only; the volume can be mounted by many nodes (if the underlying storage allows writing, it can still be written).
  • ReadWriteMany (RWX): read-write; the volume can be shared by many nodes simultaneously.

Not every kind of storage supports all three modes; at present few do, and NFS is the common choice for shared read-write. When a PVC binds to a PV, it usually matches on two conditions: storage size and access mode.

Note: Ceph supports ReadWriteOnce and ReadWriteMany, while NFS supports all three.

4.2 PV reclaim policies (persistentVolumeReclaimPolicy)

  • Retain: do not clean up; keep the volume (manual cleanup required). After unbinding, nothing is done.
  • Recycle: delete the data, i.e. rm -rf /thevolume/* (supported only by NFS and HostPath). The volume is released immediately after unbinding (deprecated).
  • Delete: delete the storage resource itself, such as an AWS EBS volume (supported only by AWS EBS, GCE PD, Azure Disk and Cinder). After unbinding, the data is deleted along with the storage volume.
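
The reclaim policy of an existing PV can be changed in place with kubectl patch (a standard command from the Kubernetes docs; the PV name is a placeholder):

kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'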

4.3 PV states

  • Available: available, not yet bound to a PVC.
  • Bound: bound to a PVC.
  • Released: the PVC has been unbound, but the reclaim policy has not yet been executed.
  • Failed: an error occurred.

1. Create three PVs
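
The manifests below point at /nfs/v1, /nfs/v2 and /nfs/v3, which were not part of the earlier /k8sdata export, so presumably they must first be created and exported on the NFS server (10.0.0.32):

mkdir -p /nfs/v1 /nfs/v2 /nfs/v3
chown -R www:www /nfs
cat >> /etc/exports <<EOF
/nfs/v1 10.0.0.0/16(rw,sync,no_root_squash)
/nfs/v2 10.0.0.0/16(rw,sync,no_root_squash)
/nfs/v3 10.0.0.0/16(rw,sync,no_root_squash)
EOF
systemctl restart nfs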

[root@k8s-master ~]# cat pv.yaml 
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-v1
spec:
  nfs:
    path: /nfs/v1 # Shared directory on the NFS server
    server: 10.0.0.32 # NFS server address
  accessModes:
    - "ReadWriteMany" # Access mode: read-write by many nodes
  persistentVolumeReclaimPolicy: Retain # Reclaim policy: retain the volume
  capacity:
    storage: 2Gi

---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-v2
spec:
  nfs:
    path: /nfs/v2
    server: 10.0.0.32
  accessModes:
    - "ReadOnlyMany" # Access mode read only
  persistentVolumeReclaimPolicy: Retain
  capacity:
    storage: 2Gi


---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-v3
spec:
  nfs:
    path: /nfs/v3
    server: 10.0.0.32
  accessModes:
    - "ReadWriteOnce" # Readable and writable, but only supported to be mounted by a single node
  persistentVolumeReclaimPolicy: Retain
  capacity:
    storage: 2Gi


# Apply the resources
[root@k8s-master ~]# kubectl apply -f pv.yaml 
persistentvolume/nfs-v1 unchanged
persistentvolume/nfs-v2 unchanged
persistentvolume/nfs-v3 created

# View the PVs
[root@k8s-master ~]# kubectl get pv
NAME     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
nfs-v1   2Gi        RWX            Retain           Available                                   56s
nfs-v2   2Gi        ROX            Retain           Available                                   56s
nfs-v3   2Gi        RWO            Retain           Available                                   9s

2. Create a PVC

[root@k8s-master ~]# cat pvc.yaml 
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-v1 # pvc name
spec:
  accessModes: # PVC access mode
    - "ReadWriteMany"
  resources:
    requests:
      storage: 2Gi # Requested size; must not exceed the PV's capacity
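
# Apply it (this step is implied in the original transcript)
[root@k8s-master ~]# kubectl apply -f pvc.yaml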

The PVC automatically binds to a suitable PV:

[root@k8s-master ~]# kubectl get pvc 
NAME     STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
nfs-v1   Bound    nfs-v1   2Gi        RWX                           7s
[root@k8s-master ~]# 
[root@k8s-master ~]# kubectl get pv
NAME     CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM            STORAGECLASS   REASON   AGE
nfs-v1   2Gi        RWX            Retain           Bound       default/nfs-v1                           17m
nfs-v2   2Gi        ROX            Retain           Available                                            17m
nfs-v3   2Gi        RWO            Retain           Available                                            16m

3. Use the PVC in a Deployment manifest

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nfs-v1
spec:
  selector:
    matchLabels:
      app: nfs-v1
  template:
    metadata:
      labels:
        app: nfs-v1
    spec:
      containers:
        - name: nginx
          image: nginx
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - mountPath: /usr/share/nginx/html
              name: nfs-v1

      volumes:
        - name: nfs-v1
          persistentVolumeClaim:
            claimName: nfs-v1 # Specify which pvc to use

# Apply the resources
[root@k8s-master ~]# kubectl apply -f pvc-deployment.yaml 
[root@k8s-master ~]# kubectl get pods
NAME                      READY   STATUS    RESTARTS   AGE
nfs-v1-6479df84bc-9dzdd   1/1     Running   0          4s

# Enter pod
[root@k8s-master ~]# kubectl exec -it nfs-v1-6479df84bc-9dzdd -- bash
root@nfs-v1-6479df84bc-9dzdd:/# cd /usr/share/nginx/html/
root@nfs-v1-6479df84bc-9dzdd:/usr/share/nginx/html# ls

# create a file
root@nfs-v1-6479df84bc-9dzdd:/usr/share/nginx/html# touch 1.txt
root@nfs-v1-6479df84bc-9dzdd:/usr/share/nginx/html# ls
1.txt


# Check that the file exists in the corresponding directory on the NFS server
[root@nfs nfs]# ll /nfs/v1/
total 0
-rw-r--r-- 1 root root 0 Jan  2 16:47 1.txt

pvc2 and pvc3 can be created in the same way:

# pvc2
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-v2
spec:
  accessModes:
    - "ReadOnlyMany"
  resources:
    requests:
      storage: 2Gi
---

# pvc3
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-v3
spec:
  accessModes:
    - "ReadWriteOnce"
  resources:
    requests:
      storage: 2Gi
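
# Apply them and check the bindings (a sketch; the file name is hypothetical and the output is not from the original transcript)
[root@k8s-master ~]# kubectl apply -f pvc2-3.yaml
[root@k8s-master ~]# kubectl get pv   # nfs-v2 and nfs-v3 should now show Bound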

Conclusion: the access mode is only a convention between PV and PVC. Whether the volume can actually be read or written is determined by the underlying storage.

Summary:
1. EmptyDir is an empty directory on the host that stores temporary files and acts as shared storage between containers in the same pod. Once the pod is destroyed, so is the directory, and the data is not saved.

2. HostPath mounts an actual directory on the node host into the pod for the containers to use. Even if the pod is destroyed, the data remains on the node host, which gives persistent file storage.
DirectoryOrCreate # use the directory if it exists; if it does not, create it first and then use it

3. A PV is the unified external interface to the underlying storage: space prepared in advance by the system administrator.
A PVC is the request made by the user, for example how much disk space the user needs.

5, StorageClass

In a large Kubernetes cluster there may be thousands of PVs, which means the operations staff must create many of them by hand. As projects evolve, new PVCs keep being submitted, so new matching PVs must constantly be added, otherwise new pods will fail to start because their PVCs cannot be bound to a PV. Moreover, a given amount of space requested through a PVC may not capture everything an application needs from its storage: different applications have different requirements for storage performance, such as read/write speed and concurrency. To solve this, Kubernetes introduced a new resource object: StorageClass. By defining StorageClasses, administrators can classify storage resources into types, for example fast storage and slow storage. Kubernetes then knows the concrete characteristics of each kind of storage from the StorageClass description, and applications can request storage appropriate to their needs.

Above all, a StorageClass provides automatic provisioning: it creates PVs for us automatically according to the PVC. So to verify it, you only need to write the pod and PVC manifests.

Each StorageClass contains three important fields: provisioner, parameters and reclaimPolicy. These are used when a PersistentVolume belonging to the class needs to be dynamically provisioned.
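
For reference, a minimal StorageClass manifest showing those three fields, adapted from the Kubernetes docs for the in-tree AWS EBS provisioner (the nfs-client class used below is created by a Helm chart instead):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: slow
provisioner: kubernetes.io/aws-ebs # Plugin that will dynamically provision the PVs
parameters:
  type: gp2                        # Provisioner-specific parameters
reclaimPolicy: Retain              # Policy applied to the PVs this class creates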

We will install an nfs-client provisioner with Helm. Official website: https://helm.sh/
# Download helm
wget https://get.helm.sh/helm-v3.3.4-linux-amd64.tar.gz

# Unpack
tar xf helm-v3.3.4-linux-amd64.tar.gz

# Install
mv linux-amd64/helm /usr/local/bin/

# Verify
helm version

Add chart repositories

helm repo add moikot https://moikot.github.io/helm-charts
helm repo add ckotzbauer https://ckotzbauer.github.io/helm-charts
helm repo add kaiyuanshe http://mirror.kaiyuanshe.cn/kubernetes/charts/

# Check that the repositories were added successfully
[root@k8s-master linux-amd64]# helm repo list
NAME      	URL                                           
moikot    	https://moikot.github.io/helm-charts          
ckotzbauer	https://ckotzbauer.github.io/helm-charts      
kaiyuanshe	http://mirror.kaiyuanshe.cn/kubernetes/charts/

# Search NFS client
[root@k8s-master linux-amd64]# helm search repo nfs-client
NAME                             	CHART VERSION	APP VERSION	DESCRIPTION                                       
ckotzbauer/nfs-client-provisioner	2.0.0        	3.1.0      	DEPRECATED - nfs-client is an automatic provisi...
kaiyuanshe/nfs-client-provisioner	1.2.11       	3.1.0      	DEPRECATED - nfs-client is an automatic provisi...
moikot/nfs-client-provisioner    	1.3.0        	3.1.0      	nfs-client is an automatic provisioner that us

# Pull the chart
[root@k8s-master nfs-client]# helm pull kaiyuanshe/nfs-client-provisioner


# View
[root@k8s-master nfs-client]# ls
nfs-client-provisioner  nfs-client-provisioner-1.2.11.tgz

# Unpack
[root@k8s-master nfs-client]# tar xf nfs-client-provisioner-1.2.11.tgz 

[root@k8s-master nfs-client]# cd nfs-client-provisioner/
[root@k8s-master nfs-client-provisioner]# ls
Chart.yaml  ci  README.md  templates  values.yaml

# Modify values.yaml
[root@k8s-master nfs-client-provisioner]# vim values.yaml 
nfs:
  server: 10.0.0.32 # NFS server IP
  path: /nfs/v1     # Shared directory configured on the NFS server

accessModes: ReadWriteMany # In the storageClass section: change the access mode to many-node read-write

# Install the chart
[root@k8s-master nfs-client-provisioner]# helm install nfs ./
NAME: nfs
LAST DEPLOYED: Sun Jan  2 21:28:01 2022
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None

[root@k8s-master nfs-client-provisioner]# helm list
NAME	NAMESPACE	REVISION	UPDATED                                	STATUS  	CHART                        	APP VERSION
nfs 	default  	1       	2022-01-02 21:28:01.578229377 +0800 CST	deployed	nfs-client-provisioner-1.2.11	3.1.0      
[root@k8s-master nfs-client-provisioner]# 

# Check that the provisioner pod is running
[root@k8s-master nfs-client-provisioner]# kubectl get pods -o wide
NAME                                          READY   STATUS    RESTARTS   AGE   IP             NODE        NOMINATED NODE   READINESS GATES
nfs-nfs-client-provisioner-595777dd9d-xlz4s   1/1     Running   0          14m   10.244.1.152   k8s-node2   <none>           <none>


# Check that the StorageClass was created
[root@k8s-master ~]# kubectl get sc
NAME         PROVISIONER                                RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
nfs-client   cluster.local/nfs-nfs-client-provisioner   Delete          Immediate           true                   15m
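
Optionally, nfs-client can be marked as the cluster's default StorageClass, so that PVCs that omit storageClassName use it (a standard annotation from the Kubernetes docs):

kubectl patch storageclass nfs-client -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'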

Test StorageClass

# PVC manifest
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-sc
spec:
  storageClassName: nfs-client		# storageclass name
  accessModes: # Access mode
    - "ReadWriteMany" # Many-node read-write
  resources:
    requests:
      storage: 2Gi # Declared size; on NFS it does not actually limit usage as long as the disk has space
---
# Secret manifest holding the password
kind: Secret
apiVersion: v1
metadata:
  name: test
data:
  MYSQL_ROOT_PASSWORD: MTIzNDU2   # base64-encoded password (echo -n 123456 | base64)
---
# Deployment manifest
kind: Deployment
apiVersion: apps/v1
metadata:
  name: mysql
spec:
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:5.7
          imagePullPolicy: IfNotPresent
          envFrom:             # Populate environment variables
            - secretRef:       # Import every key of the Secret as an env var
                name: test     # Name of the Secret
                optional: true # Whether the Secret must exist

          volumeMounts:
            - mountPath: /var/lib/mysql
              name: nfs
      volumes:
        - name: nfs
          persistentVolumeClaim: # Volume type: PVC
            claimName: test-sc

# View the automatically created PV
[root@k8s-master ~]# kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM             STORAGECLASS   REASON   AGE
pvc-c7a49ab0-4337-42d9-8a4b-07894ed03bcc   2Gi        RWX            Delete           Bound    default/test-sc   nfs-client              8s

# View the manually created PVC
[root@k8s-master ~]# kubectl get pvc
NAME      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
test-sc   Bound    pvc-c7a49ab0-4337-42d9-8a4b-07894ed03bcc   2Gi        RWX            nfs-client     47s

# View nfs server
[root@nfs k8sdata]# ls
default-test-sc-pvc-c7a49ab0-4337-42d9-8a4b-07894ed03bcc


[root@nfs k8sdata]# cd default-test-sc-pvc-c7a49ab0-4337-42d9-8a4b-07894ed03bcc/

# The MySQL data has been written to the server; the provisioner names the directory <namespace>-<pvcName>-<pvName>
[root@nfs default-test-sc-pvc-c7a49ab0-4337-42d9-8a4b-07894ed03bcc]# ls
auto.cnf    ca.pem           client-key.pem  ibdata1      ib_logfile1  mysql               private_key.pem  server-cert.pem  sys
ca-key.pem  client-cert.pem  ib_buffer_pool  ib_logfile0  ibtmp1       performance_schema  public_key.pem   server-key.pem
