Implementation of exporting snapshot data to object storage using Velero

Velero (formerly Heptio Ark) is an open source tool for safely backing up and restoring Kubernetes cluster resources and persistent volumes, performing disaster recovery, and migrating them between clusters. Velero can be deployed in a self-built Kubernetes cluster or in a Kubernetes environment hosted by a public cloud, such as QKE (KubeSphere). Velero can be used to:

  1. Back up cluster resources and restore them in case of loss.
  2. Migrate cluster resources to other clusters.
  3. Copy production cluster resources to development and test clusters.

Velero has two backup methods:

  1. Restic backup, which backs up persistent volume data at the file system level and sends it to Velero's object store. Its speed depends on local I/O capability, network bandwidth, and object storage performance, so it is slower than snapshot backup. However, because all resources and data are stored on the remote object storage, the application can still be restored from a restic backup if the current cluster or its storage fails.
  2. Snapshot backup, in which Velero uses a set of BackupItemAction plug-ins to back up PersistentVolumeClaims. This method is fast. It creates a VolumeSnapshot object with the PersistentVolumeClaim as the source. The VolumeSnapshot is in the same namespace as the source PersistentVolumeClaim. The VolumeSnapshotContent object corresponding to the VolumeSnapshot is a cluster-wide resource that points to the actual disk-based snapshot in the storage system. During a Velero backup, all VolumeSnapshot and VolumeSnapshotContent objects are uploaded to the object storage system, but the snapshot data itself stays on the cluster's storage. Data availability therefore depends on the availability of the local storage: if the application problem is caused by a storage failure, Velero's snapshot backup cannot recover the application data.
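As a rough illustration of how the two methods differ in a Velero Backup object, the sketch below builds the spec for each one. The field names defaultVolumesToRestic and snapshotVolumes are the ones used in the Backup manifests later in this article; the function name backup_spec and the "restic"/"snapshot" mode strings are hypothetical helpers introduced only for this example.

```python
import json


def backup_spec(mode, namespaces, storage_location, ttl="2h0m0s"):
    """Return the spec of a velero.io/v1 Backup for the chosen method.

    mode is a hypothetical selector: "restic" for file-system-level backup,
    "snapshot" for volume snapshot backup.
    """
    if mode not in ("restic", "snapshot"):
        raise ValueError("mode must be 'restic' or 'snapshot'")
    use_restic = mode == "restic"
    return {
        "includedNamespaces": list(namespaces),
        "storageLocation": storage_location,
        # The two methods are mutually exclusive in this sketch.
        "defaultVolumesToRestic": use_restic,
        "snapshotVolumes": not use_restic,
        "ttl": ttl,
    }


if __name__ == "__main__":
    print(json.dumps(backup_spec("snapshot", ["wordpress"], "qingstor-vbbf8"), indent=2))
```

A snapshot backup additionally names its volumeSnapshotLocations, which this sketch leaves out for brevity.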

To work around this limitation of Velero snapshot backup, this article manually exports the backed-up applications and data to S3-compatible object storage, such as MinIO in a private environment or QingStor on the QingCloud public cloud. Here, QingStor is taken as an example.

This experiment deploys a wordpress application in the wordpress project (namespace) of a KubeSphere cluster. We first use a Velero snapshot backup to back up the applications and data in this namespace to QingStor, then manually export the data from the primary storage to QingStor, and finally simulate a failure of the primary storage, recover the data from QingStor, and apply it to another cluster. The specific experimental steps follow.

Experimental environment and preconditions:

Install the Velero open source tool and configure the corresponding object storage. Create the wordpress namespace on storage provided by Rook Ceph, and run the wordpress and mysql applications in it:

root$ kubectl -n wordpress get pvc
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
mysql-pv-claim   Bound    pvc-8a3b3541-1718-4af5-94fc-e24ebe026172   10Gi       RWO            rook-ceph-block   47m
wp-pv-claim      Bound    pvc-355352e5-0ecf-4b40-9056-5514015eb392   2Gi        RWO            rook-ceph-block   47m
root$ kubectl -n wordpress get pods
NAME                              READY   STATUS    RESTARTS   AGE
wordpress-589f976cd5-4ns55        1/1     Running   0          45m
wordpress-mysql-d9b8d8884-2kmtb   1/1     Running   0          45m

Backup data

  1. To prove that the data can be recovered, first publish a new article on wordpress; after the whole backup and recovery process, check that the article is still there.

<img src="https://gitee.com/jibutech/tech-docs/raw/master/images/wordpress-demo.png" style="zoom:50%;" />

  2. Use Velero to make a snapshot backup of the wordpress project. The following Backup CR is created in the velero namespace, where Velero runs.
# wp-snap-manual.yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  annotations:
    velero.io/source-cluster-k8s-gitversion: v1.19.5
    velero.io/source-cluster-k8s-major-version: "1"
    velero.io/source-cluster-k8s-minor-version: "19"
  namespace: velero
  name: wp-snap-manual
spec:
  defaultVolumesToRestic: false
  hooks: {}
  snapshotVolumes: true
  includedNamespaces:
  - wordpress
  storageLocation: qingstor-vbbf8
  volumeSnapshotLocations:
  - qingstor-bd0a9b2b-7add-4b97-ba26-d8182d1a2d8e
  ttl: 2h0m0s

Create the backup.velero.io CR:

root$ kubectl apply -f wp-snap-manual.yaml
backup.velero.io/wp-snap-manual created  

The generated VolumeSnapshot resources now appear in the wordpress namespace, and each one names its bound VolumeSnapshotContent:

root$ kubectl -n wordpress get volumesnapshot
NAME                          AGE
velero-mysql-pv-claim-hmthh   58m
velero-wp-pv-claim-lgmh5      58m

root$  kubectl -n wordpress get volumesnapshot velero-mysql-pv-claim-hmthh -o yaml | grep bound
  boundVolumeSnapshotContentName: snapcontent-428c9f1d-69e1-46b0-93d5-dac44b795aaa
root$  kubectl -n wordpress get volumesnapshot velero-wp-pv-claim-lgmh5 -o yaml | grep bound
  boundVolumeSnapshotContentName: snapcontent-6f2ca29b-75a5-46b4-89a3-2f7a4eeff958
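The lookup above can also be done programmatically. The following is a minimal sketch, assuming the snapshots are fetched with `kubectl -n wordpress get volumesnapshot -o json`; it reads the same status.boundVolumeSnapshotContentName field that the grep shows. The function name bound_content_names is hypothetical.

```python
import json


def bound_content_names(snapshot_list_json):
    """Map each VolumeSnapshot name to its bound VolumeSnapshotContent name.

    snapshot_list_json is the text printed by
    `kubectl get volumesnapshot -o json` (a List object with an "items" array).
    """
    doc = json.loads(snapshot_list_json)
    result = {}
    for item in doc.get("items", []):
        name = item["metadata"]["name"]
        bound = item.get("status", {}).get("boundVolumeSnapshotContentName")
        if bound:  # snapshots that are not bound yet are skipped
            result[name] = bound
    return result


if __name__ == "__main__":
    sample = json.dumps({"items": [{
        "metadata": {"name": "velero-wp-pv-claim-lgmh5"},
        "status": {"boundVolumeSnapshotContentName":
                   "snapcontent-6f2ca29b-75a5-46b4-89a3-2f7a4eeff958"},
    }]})
    print(bound_content_names(sample))
```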
  3. Delete the VolumeSnapshots in the wordpress namespace.
root$ kubectl -n wordpress delete volumesnapshot velero-mysql-pv-claim-hmthh velero-wp-pv-claim-lgmh5
volumesnapshot.snapshot.storage.k8s.io "velero-mysql-pv-claim-hmthh" deleted
volumesnapshot.snapshot.storage.k8s.io "velero-wp-pv-claim-lgmh5" deleted
  4. Create a new namespace, poc, and create VolumeSnapshots in it. In each spec, source.volumeSnapshotContentName is the VolumeSnapshotContent name obtained in step 2.
# velero-wp-snapshot.yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  finalizers:
  - snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection
  labels:
    velero.io/backup-name: wp-snap-manual
    manager: snapshot-controller
  name: velero-wp-snapshot
  namespace: poc
spec:
  source:
    volumeSnapshotContentName: snapcontent-6f2ca29b-75a5-46b4-89a3-2f7a4eeff958
  volumeSnapshotClassName: csi-rbdplugin-snapclass
# velero-mysql-snapshot.yaml
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  finalizers:
  - snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection
  labels:
    velero.io/backup-name: wp-snap-manual
    manager: snapshot-controller
  name: velero-mysql-snapshot
  namespace: poc
spec:
  source:
    volumeSnapshotContentName: snapcontent-428c9f1d-69e1-46b0-93d5-dac44b795aaa
  volumeSnapshotClassName: csi-rbdplugin-snapclass
root$ kubectl create ns poc
root$ kubectl -n poc apply -f velero-mysql-snapshot.yaml -f velero-wp-snapshot.yaml
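The two manifests differ only in their name and in the VolumeSnapshotContent they point to, so they can be generated. This is a minimal sketch under that assumption; the function name preprovisioned_snapshot is hypothetical, and the output is JSON, which `kubectl apply -f -` accepts as well as YAML. In the VolumeSnapshot API, volumeSnapshotClassName sits directly under spec and volumeSnapshotContentName under spec.source.

```python
import json


def preprovisioned_snapshot(name, namespace, content_name, snapshot_class):
    """Build a VolumeSnapshot that binds to an existing VolumeSnapshotContent."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1beta1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Point at existing content instead of a PVC to snapshot.
            "source": {"volumeSnapshotContentName": content_name},
            "volumeSnapshotClassName": snapshot_class,
        },
    }


if __name__ == "__main__":
    manifest = preprovisioned_snapshot(
        "velero-wp-snapshot",
        "poc",
        "snapcontent-6f2ca29b-75a5-46b4-89a3-2f7a4eeff958",
        "csi-rbdplugin-snapclass",
    )
    print(json.dumps(manifest, indent=2))
```

The sketch omits the finalizer and labels shown in the manifests above; add them to metadata if your setup relies on them.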
  5. Edit the two VolumeSnapshotContents so that their volumeSnapshotRef points to the VolumeSnapshots in the new namespace.
root$ kubectl edit volumesnapshotcontent snapcontent-428c9f1d-69e1-46b0-93d5-dac44b795aaa

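In each VolumeSnapshotContent, find the volumeSnapshotRef field and set its name, namespace, and uid to those of the corresponding VolumeSnapshot in poc. Repeat the edit for snapcontent-6f2ca29b-75a5-46b4-89a3-2f7a4eeff958.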

# in snapcontent-6f2ca29b-75a5-46b4-89a3-2f7a4eeff958 (wordpress volume)
  volumeSnapshotRef:
    apiVersion: snapshot.storage.k8s.io/v1beta1
    kind: VolumeSnapshot
    name: velero-wp-snapshot
    namespace: poc
    uid: 4c1a4a4a-9949-425a-a3a9-1970f494aaca
# in snapcontent-428c9f1d-69e1-46b0-93d5-dac44b795aaa (mysql volume)
  volumeSnapshotRef:
    apiVersion: snapshot.storage.k8s.io/v1beta1
    kind: VolumeSnapshot
    name: velero-mysql-snapshot
    namespace: poc
    uid: 4c1a4a4a-9949-4277-a3a9-1970f494aaff
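Instead of an interactive `kubectl edit`, the same change could be expressed as a JSON merge patch. The sketch below only builds the command; it assumes the cluster accepts a change to spec.volumeSnapshotRef, as it did in this experiment, which may depend on the snapshot controller version. The helper names are hypothetical.

```python
import json


def snapshot_ref_patch(snapshot_name, namespace, uid):
    """JSON merge patch that points a VolumeSnapshotContent at a VolumeSnapshot."""
    return {
        "spec": {
            "volumeSnapshotRef": {
                "apiVersion": "snapshot.storage.k8s.io/v1beta1",
                "kind": "VolumeSnapshot",
                "name": snapshot_name,
                "namespace": namespace,
                "uid": uid,
            }
        }
    }


def patch_command(content_name, patch):
    """Return the kubectl argument list that applies the patch."""
    return [
        "kubectl", "patch", "volumesnapshotcontent", content_name,
        "--type=merge", "-p", json.dumps(patch),
    ]


if __name__ == "__main__":
    patch = snapshot_ref_patch(
        "velero-mysql-snapshot", "poc", "4c1a4a4a-9949-4277-a3a9-1970f494aaff")
    print(patch_command("snapcontent-428c9f1d-69e1-46b0-93d5-dac44b795aaa", patch))
```

The argument list can be passed to subprocess.run as-is, which avoids shell quoting problems with the JSON payload.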
  6. Create PVCs in the new namespace, using the two VolumeSnapshots in that namespace as their data sources.
# mysql-pv-claim.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
spec:
  storageClassName: rook-ceph-block
  dataSource:
    name: velero-mysql-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
# wp-pv-claim.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: wp-pv-claim
spec:
  storageClassName: rook-ceph-block
  dataSource:
    name: velero-wp-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
root$ kubectl -n poc apply -f wp-pv-claim.yaml -f mysql-pv-claim.yaml
root$ kubectl -n poc get pvc
NAME             STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
mysql-pv-claim   Bound    pvc-7fcf6a03-abd5-4692-92d5-63504891de57   10Gi       RWO            rook-ceph-block   4s
wp-pv-claim      Bound    pvc-a1234890-18a0-4d5a-9435-1b74632e8f17   2Gi        RWO            rook-ceph-block   4s
  7. After the PVCs are created, the cluster provisions a new PV for each of them. Change the reclaim policy of both PVs to Retain so that the PVs, and the data on them, survive the deletion of the PVCs.
kubectl patch pv pvc-7fcf6a03-abd5-4692-92d5-63504891de57 -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}' --type=merge 
kubectl patch pv pvc-a1234890-18a0-4d5a-9435-1b74632e8f17  -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}' --type=merge 
  8. Delete the PVCs mysql-pv-claim and wp-pv-claim in poc, then recreate them with volumeName set to the names of the PVs created above, so that the new PVCs bind to those PVs. Through these steps (VolumeSnapshotContent -> PVC -> PV -> second PVC), the data behind the current PVCs is exactly the wordpress snapshot data.
# mysql-pv-claim-2.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
spec:
  storageClassName: rook-ceph-block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  volumeName: pvc-7fcf6a03-abd5-4692-92d5-63504891de57
# wp-pv-claim-2.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: wp-pv-claim
spec:
  storageClassName: rook-ceph-block
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
  volumeName: pvc-a1234890-18a0-4d5a-9435-1b74632e8f17
root$ kubectl -n poc apply -f wp-pv-claim-2.yaml -f mysql-pv-claim-2.yaml
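The PVCs in the steps above come in two variants: one restores from a VolumeSnapshot through dataSource, the other binds to an existing PV through volumeName. A minimal sketch of a generator for both follows; the function name pvc_manifest and its parameters are hypothetical, and the output is JSON that `kubectl -n poc apply -f -` can consume.

```python
import json


def pvc_manifest(name, size, storage_class, snapshot_name=None, volume_name=None):
    """Build a PVC that either restores from a snapshot or binds to a given PV."""
    if (snapshot_name is None) == (volume_name is None):
        raise ValueError("set exactly one of snapshot_name or volume_name")
    spec = {
        "storageClassName": storage_class,
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": size}},
    }
    if snapshot_name is not None:
        # Variant 1: provision a new volume from the VolumeSnapshot.
        spec["dataSource"] = {
            "name": snapshot_name,
            "kind": "VolumeSnapshot",
            "apiGroup": "snapshot.storage.k8s.io",
        }
    else:
        # Variant 2: bind to an already existing PV by name.
        spec["volumeName"] = volume_name
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }


if __name__ == "__main__":
    print(json.dumps(pvc_manifest(
        "wp-pv-claim", "2Gi", "rook-ceph-block",
        snapshot_name="velero-wp-snapshot"), indent=2))
```

The requested size should not exceed the capacity of the PV being bound, which is why the wordpress claim asks for 2Gi here.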
  9. Change the reclaim policy of the two PVs back to Delete.
kubectl patch pv pvc-7fcf6a03-abd5-4692-92d5-63504891de57 -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}' --type=merge 
kubectl patch pv pvc-a1234890-18a0-4d5a-9435-1b74632e8f17 -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}' --type=merge 
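The retain, delete, rebind, and restore-policy sequence for one PV can be written down as an ordered list of kubectl commands. The sketch below is an assumption-laden outline, not the author's tool: it adds one step that the manual walkthrough does not show, clearing spec.claimRef on the retained PV, because a PV whose claim was deleted normally stays in the Released state and cannot be bound by a new PVC until the old reference is removed. This appears to correspond to the "remove reference" lines in the tool output further below. The function names are hypothetical.

```python
import json
import shlex


def merge_patch_pv(pv_name, spec_fields):
    """Build a kubectl command that applies a JSON merge patch to a PV's spec."""
    patch = json.dumps({"spec": spec_fields})
    return ["kubectl", "patch", "pv", pv_name, "--type=merge", "-p", patch]


def rebind_plan(pv_name, pvc_name, namespace, new_pvc_file):
    """Ordered commands that move a PV from its first PVC to a recreated one."""
    return [
        # Keep the PV and its data when the PVC is deleted.
        merge_patch_pv(pv_name, {"persistentVolumeReclaimPolicy": "Retain"}),
        ["kubectl", "-n", namespace, "delete", "pvc", pvc_name],
        # Assumed extra step: null in a merge patch removes the old claimRef.
        merge_patch_pv(pv_name, {"claimRef": None}),
        ["kubectl", "-n", namespace, "apply", "-f", new_pvc_file],
        # Restore the original policy so the PV is cleaned up with its PVC.
        merge_patch_pv(pv_name, {"persistentVolumeReclaimPolicy": "Delete"}),
    ]


if __name__ == "__main__":
    plan = rebind_plan(
        "pvc-7fcf6a03-abd5-4692-92d5-63504891de57",
        "mysql-pv-claim", "poc", "mysql-pv-claim-2.yaml")
    for cmd in plan:
        print(shlex.join(cmd))
```

Printing the plan rather than executing it makes it easy to review the commands before running them against a cluster.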
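  10. Create temporary pods in the new namespace that mount the PVCs. Velero's restic mode backs up a volume through a pod that mounts it, so each PVC needs a running pod.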
root$ kubectl -n poc get pods
NAME                                          READY   STATUS    RESTARTS   AGE
stage-wordpress-589f976cd5-4ns55-krjxs        1/1     Running   0          56s
stage-wordpress-mysql-d9b8d8884-2kmtb-q9qnf   1/1     Running   0          56s
  11. Call Velero to make a file-system-level (restic) backup of the poc namespace. When the backup succeeds, all wordpress data has been exported to the remote QingStor.
# poc-filesystem-manual.yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  annotations:
    velero.io/source-cluster-k8s-gitversion: v1.19.5
    velero.io/source-cluster-k8s-major-version: "1"
    velero.io/source-cluster-k8s-minor-version: "19"
  namespace: velero
  name: poc-filesystem-manual
spec:
  defaultVolumesToRestic: true
  hooks: {}
  snapshotVolumes: false
  includedNamespaces:
  - poc
  storageLocation: qingstor-vbbf8
  ttl: 2h0m0s
root$  kubectl -n velero apply -f poc-filesystem-manual.yaml
root$  kubectl -n velero get backups.velero.io poc-filesystem-manual
NAME                    AGE
poc-filesystem-manual   95s
root$  kubectl -n velero describe backups.velero.io poc-filesystem-manual
...
Status:
  Completion Timestamp:  2021-11-18T05:49:18Z
  Expiration:            2021-12-18T05:47:35Z
  Format Version:        1.1.0
  Phase:                 Completed
  Progress:
    Items Backed Up:  31
    Total Items:      31
  Start Timestamp:    2021-11-18T05:47:35Z
  Version:            1
Events:               <none>
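Rather than reading the describe output by eye, the backup status can be checked from `kubectl -n velero get backups.velero.io poc-filesystem-manual -o json`. This is a minimal sketch assuming the JSON field names status.phase, status.progress.itemsBackedUp, and status.progress.totalItems, which correspond to the Phase and Progress lines shown above; the function name backup_completed is hypothetical.

```python
import json


def backup_completed(backup):
    """Return True when a Backup object reports phase Completed with all items done.

    backup is the parsed JSON of one velero.io/v1 Backup. If the progress block
    is absent, only the phase is checked.
    """
    status = backup.get("status", {})
    if status.get("phase") != "Completed":
        return False
    progress = status.get("progress")
    if not progress:
        return True
    return progress.get("itemsBackedUp") == progress.get("totalItems")


if __name__ == "__main__":
    sample = json.loads(
        '{"status": {"phase": "Completed",'
        ' "progress": {"itemsBackedUp": 31, "totalItems": 31}}}')
    print(backup_completed(sample))
```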

Recover data

After the backup succeeds, all the data has been transferred to the remote object storage. Suppose the cluster's storage now fails; the following steps recover the namespace and data from the remote object storage.

  1. Simulate the disaster by deleting the wordpress namespace.
root$  kubectl delete ns wordpress
  2. Create a Velero Restore CR that restores the data backed up earlier from poc into the wordpress namespace, and delete the two temporary pods after the restore succeeds.
# poc-restore-manual.yaml
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: poc-restore
  namespace: velero
spec:
  backupName: poc-filesystem-manual
  excludedResources:
  - nodes
  - events
  - events.events.k8s.io
  - backups.velero.io
  - restores.velero.io
  - resticrepositories.velero.io
  hooks: {}
  namespaceMapping:
    poc: wordpress
  restorePVs: true
root$ kubectl -n velero apply -f poc-restore-manual.yaml
  3. Then use Velero to restore the remaining resources of the wordpress snapshot backup into the wordpress namespace, excluding PV and PVC resources. Wait for the pods to run.
# wp-restore-manual.yaml
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: wp-restore  # any name that differs from poc-restore above
  namespace: velero
spec:
  backupName: wp-snap-manual
  excludedResources:
  - nodes
  - events
  - events.events.k8s.io
  - backups.velero.io
  - restores.velero.io
  - resticrepositories.velero.io
  - persistentvolume
  - persistentvolumeclaim
  hooks: {}
  namespaceMapping:
    wordpress: wordpress
  restorePVs: true
root$ kubectl -n velero apply -f wp-restore-manual.yaml
root$ kubectl -n wordpress get pods
NAME                              READY   STATUS    RESTARTS   AGE
wordpress-mysql-d9b8d8884-2kmtb   1/1     Running   0          18s
wordpress-589f976cd5-4ns55        1/1     Running   0          18s
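The two Restore objects differ only in their name, the backup they read, the namespace mapping, and whether PVs and PVCs are excluded. The sketch below generates both under that assumption. The function name restore_manifest and the restore name wp-restore are hypothetical; the spec fields and the exclusion list are taken from the manifests above.

```python
import json

# Resources excluded from both restores in this article.
COMMON_EXCLUDES = [
    "nodes",
    "events",
    "events.events.k8s.io",
    "backups.velero.io",
    "restores.velero.io",
    "resticrepositories.velero.io",
]


def restore_manifest(name, backup_name, namespace_mapping, extra_excludes=()):
    """Build a velero.io/v1 Restore in the velero namespace."""
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Restore",
        "metadata": {"name": name, "namespace": "velero"},
        "spec": {
            "backupName": backup_name,
            "excludedResources": COMMON_EXCLUDES + list(extra_excludes),
            "hooks": {},
            "namespaceMapping": dict(namespace_mapping),
            "restorePVs": True,
        },
    }


if __name__ == "__main__":
    # First restore: volume data from the file-system backup of poc.
    data = restore_manifest(
        "poc-restore", "poc-filesystem-manual", {"poc": "wordpress"})
    # Second restore: everything else from the snapshot backup, minus PVs and PVCs.
    rest = restore_manifest(
        "wp-restore", "wp-snap-manual", {"wordpress": "wordpress"},
        extra_excludes=["persistentvolume", "persistentvolumeclaim"])
    print(json.dumps([data, rest], indent=2))
```

Giving each Restore its own name matters because applying a second manifest under an existing name updates that object instead of creating a new restore.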
  4. Now verify that all wordpress data has been recovered. Open wordpress: the article shown in the figure below still exists, which indicates that the recovery succeeded.

<img src="https://gitee.com/jibutech/tech-docs/raw/master/images/wordpress-demo-2.png" style="zoom:50%;" />

Automation

The author has written a small tool, data-mover, that automates the manual backup and recovery process above. You are welcome to try it and send suggestions.

The following is the program run output:

1. Backup data

root data-mover % go run main.go --action backup --backupName wp-backup-snap-76mxp-hzb2f --namespace wordpress
=== Step 0. Create temporay namespace + dm-wp-backup-snap-76mxp-hzb2f
=== Step 1. Create new volumesnapshot in temporary namespace
name: velero-mysql-pv-claim-q6jgv, uid: 532b6050-1bd7-4a6f-abfc-1a900bb52fc1, pvc: mysql-pv-claim, content_name: snapcontent-532b6050-1bd7-4a6f-abfc-1a900bb52fc1
Deleted volumesnapshot: velero-mysql-pv-claim-q6jgv in namesapce wordpress
Created volumesnapshot: velero-mysql-pv-claim-q6jgv in dm-wp-backup-snap-76mxp-hzb2f
name: velero-wp-pv-claim-p4lhl, uid: ada383d6-c23d-48fc-93fd-cad20f863cf4, pvc: wp-pv-claim, content_name: snapcontent-ada383d6-c23d-48fc-93fd-cad20f863cf4
Deleted volumesnapshot: velero-wp-pv-claim-p4lhl in namesapce wordpress
Created volumesnapshot: velero-wp-pv-claim-p4lhl in dm-wp-backup-snap-76mxp-hzb2f
=== Step 2. Update volumesnapshot content to new volumesnapshot in temporary namespace
Update volumesnapshotcontent snapcontent-532b6050-1bd7-4a6f-abfc-1a900bb52fc1 to remove snapshot reference
Update volumesnapshotcontent snapcontent-ada383d6-c23d-48fc-93fd-cad20f863cf4 to remove snapshot reference
=== Step 3. Create pvc reference to the new volumesnapshot in temporary namespace
Created pvc mysql-pv-claim in dm-wp-backup-snap-76mxp-hzb2f
Created pvc wp-pv-claim in dm-wp-backup-snap-76mxp-hzb2f
=== Step 4. Recreate pvc to reference pv created in step 3
Get pvc mysql-pv-claim and pv pvc-7fb33118-02a7-42db-9b18-2ba2a88c1346
Patch pv pvc-7fb33118-02a7-42db-9b18-2ba2a88c1346 with retain option
Deleted pvc mysql-pv-claim
Update pv pvc-7fb33118-02a7-42db-9b18-2ba2a88c1346 to remove reference in dm-wp-backup-snap-76mxp-hzb2f
Update pv pvc-7fb33118-02a7-42db-9b18-2ba2a88c1346 to remove reference in dm-wp-backup-snap-76mxp-hzb2f
Create pvc mysql-pv-claim in dm-wp-backup-snap-76mxp-hzb2f with pv pvc-7fb33118-02a7-42db-9b18-2ba2a88c1346
Patch pv pvc-7fb33118-02a7-42db-9b18-2ba2a88c1346 with delete option
Get pvc wp-pv-claim and pv pvc-297cb6ad-322b-4a9a-80a8-e51057d0e28a
Patch pv pvc-297cb6ad-322b-4a9a-80a8-e51057d0e28a with retain option
Deleted pvc wp-pv-claim
Update pv pvc-297cb6ad-322b-4a9a-80a8-e51057d0e28a to remove reference in dm-wp-backup-snap-76mxp-hzb2f
Update pv pvc-297cb6ad-322b-4a9a-80a8-e51057d0e28a to remove reference in dm-wp-backup-snap-76mxp-hzb2f
Create pvc wp-pv-claim in dm-wp-backup-snap-76mxp-hzb2f with pv pvc-297cb6ad-322b-4a9a-80a8-e51057d0e28a
Patch pv pvc-297cb6ad-322b-4a9a-80a8-e51057d0e28a with delete option
=== Step 5. Create pod with pvc created in step 4
build stage pod wordpress-589f976cd5-vbj5z
build stage pod wordpress-mysql-d9b8d8884-9g4r5
=== Step 6. Invoke velero to backup the temporary namespace using file system copy
Get velero backup plan wp-backup-snap-76mxp-hzb2f
Created velero backup plan generate-backup-kql6f

2. Recover data

root data-mover % go run main.go --action restore --backupName wp-backup-snap-76mxp-hzb2f --namespace wordpress
=== Step 1. Get filesystem copy backup
generate-backup-kql6f
=== Step 2. Delete namespace
=== Step 3. Invoke velero to restore the temporary namespace to given namespace
Created velero restore plan generate-restore-ppdmp
=== Step 4. Delete pod in given namespace
Deleted pod stage-wordpress-589f976cd5-vbj5z-d4zg7
Deleted pod stage-wordpress-mysql-d9b8d8884-9g4r5-xzr8r
=== Step 5. Invoke velero to restore original namespace
Created velero restore plan generate-restore-tfqhz

Keywords: Kubernetes Open Source

Added by tores on Fri, 28 Jan 2022 13:26:52 +0200