ceph-csi source code analysis (6) - rbd driver nodeserver analysis (Part 2)
When the ceph-csi component is started with driver type rbd, the rbd driver services are started. Depending on the controllerserver/nodeserver parameter configuration, it then starts either the ControllerServer together with the IdentityServer, or the NodeServer together with the IdentityServer.
Based on tag v3.0.0:
https://github.com/ceph/ceph-csi/releases/tag/v3.0.0
The rbd driver analysis is divided into four parts: service entry analysis, controllerserver analysis, nodeserver analysis and identityserver analysis.
nodeserver mainly includes the NodeGetCapabilities, NodeGetVolumeStats, NodeStageVolume (map rbd and mount to stagingPath), NodePublishVolume (mount to targetPath), NodeUnpublishVolume (umount targetPath), NodeUnstageVolume (umount stagingPath and unmap rbd) and NodeExpandVolume operations, which will be analyzed one by one. This part analyzes NodeStageVolume, NodePublishVolume, NodeUnpublishVolume and NodeUnstageVolume.
nodeserver analysis (Part 2)
ceph rbd mounting basics
There are two main ways to map an rbd image as a block device: (1) via the RBD kernel module (krbd), and (2) via rbd-nbd. References: https://www.jianshu.com/p/bb9d14bd897c , http://xiaqunfeng.cc/2017/06/07/Map-RBD-Devices-on-NBD/
A ceph rbd image is attached to a pod in two steps (see the sketch after this list):
1. The kubelet component calls the NodeStageVolume method of the ceph-csi rbd-type nodeserver, which maps the rbd image to an rbd/nbd device on the node, then formats the device and mounts it to the staging path;
2. The kubelet component calls the NodePublishVolume method of the ceph-csi rbd-type nodeserver, which mounts the staging path from the previous step to the target path.
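To make the two-step flow concrete, here is a minimal sketch (an assumption for illustration, not taken from kubelet source) of the two attach-side RPCs kubelet drives against the node plugin, using the CSI proto types. The request fields are trimmed to the ones discussed here; a real request also carries VolumeCapability and more.

```go
import (
	"context"

	csi "github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc"
)

// attach sketches the call order kubelet uses when attaching an rbd volume.
func attach(ctx context.Context, conn *grpc.ClientConn, volID, stagingParentPath, targetPath string, secrets map[string]string) error {
	nc := csi.NewNodeClient(conn)

	// Step 1: map the rbd image and mount the device under the staging path.
	if _, err := nc.NodeStageVolume(ctx, &csi.NodeStageVolumeRequest{
		VolumeId:          volID,
		StagingTargetPath: stagingParentPath,
		Secrets:           secrets,
	}); err != nil {
		return err
	}

	// Step 2: bind-mount the staging path to the pod's target path.
	_, err := nc.NodePublishVolume(ctx, &csi.NodePublishVolumeRequest{
		VolumeId:          volID,
		StagingTargetPath: stagingParentPath,
		TargetPath:        targetPath,
		Secrets:           secrets,
	})
	return err
}
```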
ceph rbd unmounting basics
A ceph rbd image is unmounted from a pod in two steps, as follows:
1. The kubelet component calls the NodeUnpublishVolume method of the ceph-csi rbd-type nodeserver, which removes the mount of the stagingPath onto the targetPath.
2. The kubelet component calls the NodeUnstageVolume method of the ceph-csi rbd-type nodeserver, which first removes the mount between the stagingPath and the rbd/nbd device, and then unmaps the rbd/nbd device (that is, detaches the rbd/nbd device on the node from the ceph rbd image).
After an rbd image is mounted into a pod, two mount relationships appear on the node, as in the following example:
```
# mount | grep nbd
/dev/nbd0 on /home/cld/kubernetes/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e2104b0f-774e-420e-a388-1705344084a4/globalmount/0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-40b130e1-a630-11eb-8bea-246e968ec20c type xfs (rw,relatime,nouuid,attr2,inode64,noquota,_netdev)
/dev/nbd0 on /home/cld/kubernetes/lib/kubelet/pods/80114f88-2b09-440c-aec2-54c16efe6923/volumes/kubernetes.io~csi/pvc-e2104b0f-774e-420e-a388-1705344084a4/mount type xfs (rw,relatime,nouuid,attr2,inode64,noquota,_netdev)
```
Here, /home/cld/kubernetes/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e2104b0f-774e-420e-a388-1705344084a4/globalmount/0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-40b130e1-a630-11eb-8bea-246e968ec20c is the staging path, /home/cld/kubernetes/lib/kubelet/pods/80114f88-2b09-440c-aec2-54c16efe6923/volumes/kubernetes.io~csi/pvc-e2104b0f-774e-420e-a388-1705344084a4/mount is the target path, and /dev/nbd0 is the nbd device.
Note
When an rbd image is attached to multiple pods on one node, the NodeStageVolume method is called only once, while NodePublishVolume is called once per pod; that is, there is only one staging path but multiple target paths. You can think of the staging path as corresponding to the rbd image, and each target path as corresponding to a pod.
The same is true for unmounting: the NodeUnstageVolume method is called only after all pods using the rbd image on that node have been deleted.
(4)NodeStageVolume
brief introduction
Map the RBD image to the rbd/nbd device on the node, format it and mount it to the staging path.
NodeStageVolume mounts the volume to a staging path on the node.
- Stash image metadata under staging path
- Map the image (creates a device)
- Create the staging file/directory under staging path
- Stage the device (mount the device mapped for image)
Main steps:
(1) Map RBD image to rbd/nbd device on node;
(2) Format rbd device (do not format when volumeMode is block) and mount it to staging path.
NodeStageVolume
NodeStageVolume main process:
(1) Verify request parameters and AccessMode;
(2) Obtain volID from request parameters;
(3) Build the ceph credentials from the secret (the secret is passed in by kubelet);
(4) Check whether the stagingPath exists and is already mounted;
(5) Obtain the image name from the volume journal according to the volID;
(6) Create image-meta.json under stagingParentPath, which stores the image metadata;
(7) Call ns.stageTransaction to perform the map and mount operations.
```go
//ceph-csi/internal/rbd/nodeserver.go
func (ns *NodeServer) NodeStageVolume(ctx context.Context, req *csi.NodeStageVolumeRequest) (*csi.NodeStageVolumeResponse, error) {
	// (1) Validate request parameters
	if err := util.ValidateNodeStageVolumeRequest(req); err != nil {
		return nil, err
	}

	// Validate AccessMode
	isBlock := req.GetVolumeCapability().GetBlock() != nil
	disableInUseChecks := false
	// MULTI_NODE_MULTI_WRITER is supported by default for Block access type volumes
	if req.VolumeCapability.AccessMode.Mode == csi.VolumeCapability_AccessMode_MULTI_NODE_MULTI_WRITER {
		if !isBlock {
			klog.Warningf(util.Log(ctx, "MULTI_NODE_MULTI_WRITER currently only supported with volumes of access type `block`, invalid AccessMode for volume: %v"), req.GetVolumeId())
			return nil, status.Error(codes.InvalidArgument, "rbd: RWX access mode request is only valid for volumes with access type `block`")
		}
		disableInUseChecks = true
	}

	// (2) Get volID from request parameters
	volID := req.GetVolumeId()

	// (3) Build ceph credentials from the secret
	cr, err := util.NewUserCredentials(req.GetSecrets())
	if err != nil {
		return nil, status.Error(codes.Internal, err.Error())
	}
	defer cr.DeleteCredentials()

	if acquired := ns.VolumeLocks.TryAcquire(volID); !acquired {
		klog.Errorf(util.Log(ctx, util.VolumeOperationAlreadyExistsFmt), volID)
		return nil, status.Errorf(codes.Aborted, util.VolumeOperationAlreadyExistsFmt, volID)
	}
	defer ns.VolumeLocks.Release(volID)

	stagingParentPath := req.GetStagingTargetPath()
	stagingTargetPath := stagingParentPath + "/" + volID

	// check is it a static volume
	staticVol := false
	val, ok := req.GetVolumeContext()["staticVolume"]
	if ok {
		if staticVol, err = strconv.ParseBool(val); err != nil {
			return nil, status.Error(codes.InvalidArgument, err.Error())
		}
	}

	// (4) Check whether the stagingPath exists and is already mounted
	var isNotMnt bool
	// check if stagingPath is already mounted
	isNotMnt, err = mount.IsNotMountPoint(ns.mounter, stagingTargetPath)
	if err != nil && !os.IsNotExist(err) {
		return nil, status.Error(codes.Internal, err.Error())
	}
	if !isNotMnt {
		util.DebugLog(ctx, "rbd: volume %s is already mounted to %s, skipping", volID, stagingTargetPath)
		return &csi.NodeStageVolumeResponse{}, nil
	}

	volOptions, err := genVolFromVolumeOptions(ctx, req.GetVolumeContext(), req.GetSecrets(), disableInUseChecks)
	if err != nil {
		return nil, status.Error(codes.Internal, err.Error())
	}

	// (5) Get the image name from the volume journal according to the volID
	// get rbd image name from the volume journal
	// for static volumes, the image name is actually the volume ID itself
	switch {
	case staticVol:
		volOptions.RbdImageName = volID
	default:
		var vi util.CSIIdentifier
		var imageAttributes *journal.ImageAttributes
		err = vi.DecomposeCSIID(volID)
		if err != nil {
			err = fmt.Errorf("error decoding volume ID (%s) (%s)", err, volID)
			return nil, status.Error(codes.Internal, err.Error())
		}

		j, err2 := volJournal.Connect(volOptions.Monitors, cr)
		if err2 != nil {
			klog.Errorf(
				util.Log(ctx, "failed to establish cluster connection: %v"), err2)
			return nil, status.Error(codes.Internal, err.Error())
		}
		defer j.Destroy()

		imageAttributes, err = j.GetImageAttributes(
			ctx, volOptions.Pool, vi.ObjectUUID, false)
		if err != nil {
			err = fmt.Errorf("error fetching image attributes for volume ID (%s) (%s)", err, volID)
			return nil, status.Error(codes.Internal, err.Error())
		}
		volOptions.RbdImageName = imageAttributes.ImageName
	}

	volOptions.VolID = volID
	transaction := stageTransaction{}

	// (6) Create image-meta.json under stagingParentPath, which stores the image metadata
	// Stash image details prior to mapping the image (useful during Unstage as it has no
	// voloptions passed to the RPC as per the CSI spec)
	err = stashRBDImageMetadata(volOptions, stagingParentPath)
	if err != nil {
		return nil, status.Error(codes.Internal, err.Error())
	}
	defer func() {
		if err != nil {
			ns.undoStagingTransaction(ctx, req, transaction)
		}
	}()

	// (7) Call ns.stageTransaction to perform the map/mount operations
	// perform the actual staging and if this fails, have undoStagingTransaction
	// cleans up for us
	transaction, err = ns.stageTransaction(ctx, req, volOptions, staticVol)
	if err != nil {
		return nil, status.Error(codes.Internal, err.Error())
	}

	util.DebugLog(ctx, "rbd: successfully mounted volume %s to stagingTargetPath %s", req.GetVolumeId(), stagingTargetPath)

	return &csi.NodeStageVolumeResponse{}, nil
}
```
1.ValidateNodeStageVolumeRequest
ValidateNodeStageVolumeRequest verifies the following:
(1) volume capability parameter cannot be empty;
(2) volume ID parameter cannot be empty;
(3) The staging target path parameter cannot be empty;
(4) The stage secrets parameter cannot be empty;
(5) The staging path must exist on the node.
```go
//ceph-csi/internal/util/validate.go
func ValidateNodeStageVolumeRequest(req *csi.NodeStageVolumeRequest) error {
	if req.GetVolumeCapability() == nil {
		return status.Error(codes.InvalidArgument, "volume capability missing in request")
	}
	if req.GetVolumeId() == "" {
		return status.Error(codes.InvalidArgument, "volume ID missing in request")
	}
	if req.GetStagingTargetPath() == "" {
		return status.Error(codes.InvalidArgument, "staging target path missing in request")
	}
	if req.GetSecrets() == nil || len(req.GetSecrets()) == 0 {
		return status.Error(codes.InvalidArgument, "stage secrets cannot be nil or empty")
	}

	// validate stagingpath exists
	ok := checkDirExists(req.GetStagingTargetPath())
	if !ok {
		return status.Error(codes.InvalidArgument, "staging path does not exists on node")
	}
	return nil
}
```
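The checkDirExists helper is not shown in the snippet above; a minimal sketch of what it does (the real helper lives in ceph-csi/internal/util, and this version only illustrates the idea, assuming the os package is imported) could look like this:

```go
// checkDirExists reports whether the given path exists on the node
// (a sketch of the helper used by ValidateNodeStageVolumeRequest).
func checkDirExists(p string) bool {
	if _, err := os.Stat(p); os.IsNotExist(err) {
		return false
	}
	return true
}
```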
2.stashRBDImageMetadata
stashRBDImageMetadata creates image-meta.json under stagingParentPath, which stores the image metadata.
```go
//ceph-csi/internal/rbd/rbd_util.go
const stashFileName = "image-meta.json"

func stashRBDImageMetadata(volOptions *rbdVolume, path string) error {
	var imgMeta = rbdImageMetadataStash{
		// there are no checks for this at present
		Version:   2, // nolint:gomnd // number specifies version.
		Pool:      volOptions.Pool,
		ImageName: volOptions.RbdImageName,
		Encrypted: volOptions.Encrypted,
	}

	imgMeta.NbdAccess = false
	if volOptions.Mounter == rbdTonbd && hasNBD {
		imgMeta.NbdAccess = true
	}

	encodedBytes, err := json.Marshal(imgMeta)
	if err != nil {
		return fmt.Errorf("failed to marshall JSON image metadata for image (%s): (%v)", volOptions, err)
	}

	fPath := filepath.Join(path, stashFileName)
	err = ioutil.WriteFile(fPath, encodedBytes, 0600)
	if err != nil {
		return fmt.Errorf("failed to stash JSON image metadata for image (%s) at path (%s): (%v)", volOptions, fPath, err)
	}

	return nil
}
```
```
root@cld-dnode3-1091:/home/zhongjialiang# ls /home/cld/kubernetes/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893/globalmount/
image-meta.json
0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74/
```
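The rbdImageMetadataStash struct being serialized is not shown above; reconstructing it as a sketch from the JSON tags visible in the stash file (see the cat output of image-meta.json in the NodeUnstageVolume section below), it looks roughly like this:

```go
// A sketch of rbdImageMetadataStash, reconstructed from the stash file
// contents shown later ({"Version":2,"pool":...,"image":...,
// "accessType":...,"encrypted":...}); the real struct lives in
// ceph-csi/internal/rbd/rbd_util.go.
type rbdImageMetadataStash struct {
	Version   int    `json:"Version"`
	Pool      string `json:"pool"`
	ImageName string `json:"image"`
	NbdAccess bool   `json:"accessType"` // true when mapped via rbd-nbd
	Encrypted bool   `json:"encrypted"`
}
```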
3.ns.stageTransaction
Main process:
(1) Call attachRBDImage to map the rbd image to an rbd/nbd device on the node;
(2) Call ns.mountVolumeToStagePath to format the rbd device on the node and mount it to the stagePath.
```go
//ceph-csi/internal/rbd/nodeserver.go
func (ns *NodeServer) stageTransaction(ctx context.Context, req *csi.NodeStageVolumeRequest, volOptions *rbdVolume, staticVol bool) (stageTransaction, error) {
	transaction := stageTransaction{}

	var err error
	var readOnly bool
	var feature bool

	var cr *util.Credentials
	cr, err = util.NewUserCredentials(req.GetSecrets())
	if err != nil {
		return transaction, err
	}
	defer cr.DeleteCredentials()

	err = volOptions.Connect(cr)
	if err != nil {
		klog.Errorf(util.Log(ctx, "failed to connect to volume %v: %v"), volOptions.RbdImageName, err)
		return transaction, err
	}
	defer volOptions.Destroy()

	// Allow image to be mounted on multiple nodes if it is ROX
	if req.VolumeCapability.AccessMode.Mode == csi.VolumeCapability_AccessMode_MULTI_NODE_READER_ONLY {
		util.ExtendedLog(ctx, "setting disableInUseChecks on rbd volume to: %v", req.GetVolumeId)
		volOptions.DisableInUseChecks = true
		volOptions.readOnly = true
	}

	if kernelRelease == "" {
		// fetch the current running kernel info
		kernelRelease, err = util.GetKernelVersion()
		if err != nil {
			return transaction, err
		}
	}
	if !util.CheckKernelSupport(kernelRelease, deepFlattenSupport) {
		if !skipForceFlatten {
			feature, err = volOptions.checkImageChainHasFeature(ctx, librbd.FeatureDeepFlatten)
			if err != nil {
				return transaction, err
			}
			if feature {
				err = volOptions.flattenRbdImage(ctx, cr, true, rbdHardMaxCloneDepth, rbdSoftMaxCloneDepth)
				if err != nil {
					return transaction, err
				}
			}
		}
	}

	// Mapping RBD image
	var devicePath string
	devicePath, err = attachRBDImage(ctx, volOptions, cr)
	if err != nil {
		return transaction, err
	}
	transaction.devicePath = devicePath
	util.DebugLog(ctx, "rbd image: %s/%s was successfully mapped at %s\n", req.GetVolumeId(), volOptions.Pool, devicePath)

	if volOptions.Encrypted {
		devicePath, err = ns.processEncryptedDevice(ctx, volOptions, devicePath)
		if err != nil {
			return transaction, err
		}
		transaction.isEncrypted = true
	}

	stagingTargetPath := getStagingTargetPath(req)

	isBlock := req.GetVolumeCapability().GetBlock() != nil
	err = ns.createStageMountPoint(ctx, stagingTargetPath, isBlock)
	if err != nil {
		return transaction, err
	}

	transaction.isStagePathCreated = true

	// nodeStage Path
	readOnly, err = ns.mountVolumeToStagePath(ctx, req, staticVol, stagingTargetPath, devicePath)
	if err != nil {
		return transaction, err
	}
	transaction.isMounted = true

	if !readOnly {
		// #nosec - allow anyone to write inside the target path
		err = os.Chmod(stagingTargetPath, 0777)
	}
	return transaction, err
}
```
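The stageTransaction record is not defined in the snippet above; reconstructed as a sketch from the assignments visible in the code, it is roughly:

```go
// A sketch of stageTransaction, reconstructed from the assignments above;
// undoStagingTransaction uses these fields to clean up on failure.
type stageTransaction struct {
	devicePath         string // device path of the mapped rbd/nbd device
	isEncrypted        bool   // the device is an open LUKS mapping
	isStagePathCreated bool   // stagingTargetPath was created
	isMounted          bool   // the device was mounted to stagingTargetPath
}
```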
3.1 attachRBDImage
attachRBDImage main process:
(1) Call waitForPath to determine whether the image is already mapped on the node;
(2) If it is not mapped on the node, call waitForrbdImage to check whether the image exists and whether it is in use;
(3) Call createPath to map the image on the node.
```go
//ceph-csi/internal/rbd/rbd_attach.go
func attachRBDImage(ctx context.Context, volOptions *rbdVolume, cr *util.Credentials) (string, error) {
	var err error

	image := volOptions.RbdImageName
	useNBD := false
	if volOptions.Mounter == rbdTonbd && hasNBD {
		useNBD = true
	}

	devicePath, found := waitForPath(ctx, volOptions.Pool, image, 1, useNBD)
	if !found {
		backoff := wait.Backoff{
			Duration: rbdImageWatcherInitDelay,
			Factor:   rbdImageWatcherFactor,
			Steps:    rbdImageWatcherSteps,
		}

		err = waitForrbdImage(ctx, backoff, volOptions)
		if err != nil {
			return "", err
		}
		devicePath, err = createPath(ctx, volOptions, cr)
	}

	return devicePath, err
}
```
createPath assembles the rbd command and executes the map operation, mapping the rbd image on the node as an rbd/nbd device.
The rbd-nbd mode is selected by passing --device-type nbd.
```go
func createPath(ctx context.Context, volOpt *rbdVolume, cr *util.Credentials) (string, error) {
	isNbd := false
	imagePath := volOpt.String()

	util.TraceLog(ctx, "rbd: map mon %s", volOpt.Monitors)

	// Map options
	mapOptions := []string{
		"--id", cr.ID,
		"-m", volOpt.Monitors,
		"--keyfile=" + cr.KeyFile,
		"map", imagePath,
	}

	// Choose access protocol
	accessType := accessTypeKRbd
	if volOpt.Mounter == rbdTonbd && hasNBD {
		isNbd = true
		accessType = accessTypeNbd
	}

	// Update options with device type selection
	mapOptions = append(mapOptions, "--device-type", accessType)

	if volOpt.readOnly {
		mapOptions = append(mapOptions, "--read-only")
	}

	// Execute map
	stdout, stderr, err := util.ExecCommand(ctx, rbd, mapOptions...)
	if err != nil {
		klog.Warningf(util.Log(ctx, "rbd: map error %v, rbd output: %s"), err, stderr)
		// unmap rbd image if connection timeout
		if strings.Contains(err.Error(), rbdMapConnectionTimeout) {
			detErr := detachRBDImageOrDeviceSpec(ctx, imagePath, true, isNbd, volOpt.Encrypted, volOpt.VolID)
			if detErr != nil {
				klog.Warningf(util.Log(ctx, "rbd: %s unmap error %v"), imagePath, detErr)
			}
		}
		return "", fmt.Errorf("rbd: map failed with error %v, rbd error output: %s", err, stderr)
	}
	devicePath := strings.TrimSuffix(stdout, "\n")

	return devicePath, nil
}
```
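Putting it together, for the krbd case the executed command takes the form `rbd --id <user> -m <monitors> --keyfile=<keyfile> map <pool>/<image> --device-type krbd`, which matches the command recorded in the component log shown below; on success, stdout carries the device path, for example /dev/rbd0.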
3.2 mountVolumeToStagePath
Main process:
(1) When volumeMode is Filesystem, run mkfs to format rbd device;
(2) Mount rbd device to stagingPath.
```go
//ceph-csi/internal/rbd/nodeserver.go
func (ns *NodeServer) mountVolumeToStagePath(ctx context.Context, req *csi.NodeStageVolumeRequest, staticVol bool, stagingPath, devicePath string) (bool, error) {
	readOnly := false
	fsType := req.GetVolumeCapability().GetMount().GetFsType()
	diskMounter := &mount.SafeFormatAndMount{Interface: ns.mounter, Exec: utilexec.New()}
	// rbd images are thin-provisioned and return zeros for unwritten areas. A freshly created
	// image will not benefit from discard and we also want to avoid as much unnecessary zeroing
	// as possible. Open-code mkfs here because FormatAndMount() doesn't accept custom mkfs
	// options.
	//
	// Note that "freshly" is very important here. While discard is more of a nice to have,
	// lazy_journal_init=1 is plain unsafe if the image has been written to before and hasn't
	// been zeroed afterwards (unlike the name suggests, it leaves the journal completely
	// uninitialized and carries a risk until the journal is overwritten and wraps around for
	// the first time).
	existingFormat, err := diskMounter.GetDiskFormat(devicePath)
	if err != nil {
		klog.Errorf(util.Log(ctx, "failed to get disk format for path %s, error: %v"), devicePath, err)
		return readOnly, err
	}

	opt := []string{"_netdev"}
	opt = csicommon.ConstructMountOptions(opt, req.GetVolumeCapability())
	isBlock := req.GetVolumeCapability().GetBlock() != nil
	rOnly := "ro"

	if req.VolumeCapability.AccessMode.Mode == csi.VolumeCapability_AccessMode_MULTI_NODE_READER_ONLY ||
		req.VolumeCapability.AccessMode.Mode == csi.VolumeCapability_AccessMode_SINGLE_NODE_READER_ONLY {
		if !csicommon.MountOptionContains(opt, rOnly) {
			opt = append(opt, rOnly)
		}
	}
	if csicommon.MountOptionContains(opt, rOnly) {
		readOnly = true
	}

	if fsType == "xfs" {
		opt = append(opt, "nouuid")
	}

	if existingFormat == "" && !staticVol && !readOnly {
		args := []string{}
		if fsType == "ext4" {
			args = []string{"-m0", "-Enodiscard,lazy_itable_init=1,lazy_journal_init=1", devicePath}
		} else if fsType == "xfs" {
			args = []string{"-K", devicePath}
			// always disable reflink
			// TODO: make enabling an option, see ceph/ceph-csi#1256
			if ns.xfsSupportsReflink() {
				args = append(args, "-m", "reflink=0")
			}
		}
		if len(args) > 0 {
			cmdOut, cmdErr := diskMounter.Exec.Command("mkfs."+fsType, args...).CombinedOutput()
			if cmdErr != nil {
				klog.Errorf(util.Log(ctx, "failed to run mkfs error: %v, output: %v"), cmdErr, string(cmdOut))
				return readOnly, cmdErr
			}
		}
	}

	if isBlock {
		opt = append(opt, "bind")
		err = diskMounter.Mount(devicePath, stagingPath, fsType, opt)
	} else {
		err = diskMounter.FormatAndMount(devicePath, stagingPath, fsType, opt)
	}
	if err != nil {
		klog.Errorf(util.Log(ctx,
			"failed to mount device path (%s) to staging path (%s) for volume "+
				"(%s) error: %s Check dmesg logs if required."),
			devicePath,
			stagingPath,
			req.GetVolumeId(),
			err)
	}
	return readOnly, err
}
```
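For reference, a fresh ext4 volume is formatted with `mkfs.ext4 -m0 -Enodiscard,lazy_itable_init=1,lazy_journal_init=1 <device>`, and a fresh xfs volume with `mkfs.xfs -K <device>` (plus `-m reflink=0` when supported), as the args slices above show; for block-mode volumes the device is simply bind-mounted to the staging path without formatting.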
Example of ceph-csi component log
Operation: NodeStageVolume
Source: daemonset csi-rbdplugin, container csi-rbdplugin
```
I0828 06:25:07.604431 3316053 utils.go:159] ID: 12008 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 GRPC call: /csi.v1.Node/NodeStageVolume
I0828 06:25:07.607979 3316053 utils.go:160] ID: 12008 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 GRPC request: {"secrets":"***stripped***","staging_target_path":"/home/cld/kubernetes/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893/globalmount","volume_capability":{"AccessType":{"Mount":{"fs_type":"ext4","mount_flags":["discard"]}},"access_mode":{"mode":1}},"volume_context":{"clusterID":"0bba3be9-0a1c-41db-a619-26ffea20161e","imageFeatures":"layering","imageName":"csi-vol-1699e662-e83f-11ea-8e79-246e96907f74","journalPool":"kubernetes","pool":"kubernetes","storage.kubernetes.io/csiProvisionerIdentity":"1598236777786-8081-rbd.csi.ceph.com"},"volume_id":"0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74"}
I0828 06:25:07.608239 3316053 rbd_util.go:722] ID: 12008 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 setting disableInUseChecks on rbd volume to: false
I0828 06:25:07.610528 3316053 omap.go:74] ID: 12008 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 got omap values: (pool="kubernetes", namespace="", name="csi.volume.1699e662-e83f-11ea-8e79-246e96907f74"): map[csi.imageid:e583b827ec63 csi.imagename:csi-vol-1699e662-e83f-11ea-8e79-246e96907f74 csi.volname:pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893]
E0828 06:25:07.610765 3316053 util.go:236] kernel 4.19.0-8-amd64 does not support required features
I0828 06:25:07.786825 3316053 cephcmds.go:60] ID: 12008 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 command succeeded: rbd [device list --format=json --device-type krbd]
I0828 06:25:07.832097 3316053 rbd_attach.go:208] ID: 12008 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 rbd: map mon 10.248.32.13:6789,10.248.32.14:6789,10.248.32.15:6789
I0828 06:25:07.926180 3316053 cephcmds.go:60] ID: 12008 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 command succeeded: rbd [--id kubernetes -m 10.248.32.13:6789,10.248.32.14:6789,10.248.32.15:6789 --keyfile=***stripped*** map kubernetes/csi-vol-1699e662-e83f-11ea-8e79-246e96907f74 --device-type krbd]
I0828 06:25:07.926221 3316053 nodeserver.go:291] ID: 12008 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 rbd image: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74/kubernetes was successfully mapped at /dev/rbd0
I0828 06:25:08.157777 3316053 nodeserver.go:230] ID: 12008 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 rbd: successfully mounted volume 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 to stagingTargetPath /home/cld/kubernetes/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893/globalmount/0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74
I0828 06:25:08.158588 3316053 utils.go:165] ID: 12008 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 GRPC response: {}
```
(5)NodePublishVolume
brief introduction
Mount the staging path to the target path.
NodeStageVolume maps the rbd image to a device on the node, and then mounts the device to a staging path.
NodePublishVolume mounts that staging path to the target path.
```
stagingPath example:
/home/cld/kubernetes/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893/globalmount/0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74

targetPath example:
/home/cld/kubernetes/lib/kubelet/pods/c14de522-0679-44b6-af8b-e1ba08b5b004/volumes/kubernetes.io~csi/pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893/mount
```
NodePublishVolume
Main process:
(1) Check request parameters;
(2) Check whether the target path exists. If it does not exist, create it;
(3) Mount staging path to target path.
```go
//ceph-csi/internal/rbd/nodeserver.go
func (ns *NodeServer) NodePublishVolume(ctx context.Context, req *csi.NodePublishVolumeRequest) (*csi.NodePublishVolumeResponse, error) {
	err := util.ValidateNodePublishVolumeRequest(req)
	if err != nil {
		return nil, err
	}
	targetPath := req.GetTargetPath()
	isBlock := req.GetVolumeCapability().GetBlock() != nil
	stagingPath := req.GetStagingTargetPath()
	volID := req.GetVolumeId()
	stagingPath += "/" + volID

	if acquired := ns.VolumeLocks.TryAcquire(volID); !acquired {
		klog.Errorf(util.Log(ctx, util.VolumeOperationAlreadyExistsFmt), volID)
		return nil, status.Errorf(codes.Aborted, util.VolumeOperationAlreadyExistsFmt, volID)
	}
	defer ns.VolumeLocks.Release(volID)

	// Check if that target path exists properly
	notMnt, err := ns.createTargetMountPath(ctx, targetPath, isBlock)
	if err != nil {
		return nil, err
	}

	if !notMnt {
		return &csi.NodePublishVolumeResponse{}, nil
	}

	// Publish Path
	err = ns.mountVolume(ctx, stagingPath, req)
	if err != nil {
		return nil, err
	}

	util.DebugLog(ctx, "rbd: successfully mounted stagingPath %s to targetPath %s", stagingPath, targetPath)
	return &csi.NodePublishVolumeResponse{}, nil
}
```
1.ValidateNodePublishVolumeRequest
ValidateNodePublishVolumeRequest verifies the request parameters: the volume capability, volume ID, target path and staging target path must not be empty.
```go
//ceph-csi/internal/util/validate.go
func ValidateNodePublishVolumeRequest(req *csi.NodePublishVolumeRequest) error {
	if req.GetVolumeCapability() == nil {
		return status.Error(codes.InvalidArgument, "volume capability missing in request")
	}
	if req.GetVolumeId() == "" {
		return status.Error(codes.InvalidArgument, "volume ID missing in request")
	}
	if req.GetTargetPath() == "" {
		return status.Error(codes.InvalidArgument, "target path missing in request")
	}
	if req.GetStagingTargetPath() == "" {
		return status.Error(codes.InvalidArgument, "staging target path missing in request")
	}
	return nil
}
```
2.createTargetMountPath
createTargetMountPath checks whether the mount path exists, and creates it if it does not.
```go
//ceph-csi/internal/rbd/nodeserver.go
func (ns *NodeServer) createTargetMountPath(ctx context.Context, mountPath string, isBlock bool) (bool, error) {
	// Check if that mount path exists properly
	notMnt, err := mount.IsNotMountPoint(ns.mounter, mountPath)
	if err != nil {
		if os.IsNotExist(err) {
			if isBlock {
				// #nosec
				pathFile, e := os.OpenFile(mountPath, os.O_CREATE|os.O_RDWR, 0750)
				if e != nil {
					util.DebugLog(ctx, "Failed to create mountPath:%s with error: %v", mountPath, err)
					return notMnt, status.Error(codes.Internal, e.Error())
				}
				if err = pathFile.Close(); err != nil {
					util.DebugLog(ctx, "Failed to close mountPath:%s with error: %v", mountPath, err)
					return notMnt, status.Error(codes.Internal, err.Error())
				}
			} else {
				// Create a directory
				if err = util.CreateMountPoint(mountPath); err != nil {
					return notMnt, status.Error(codes.Internal, err.Error())
				}
			}
			notMnt = true
		} else {
			return false, status.Error(codes.Internal, err.Error())
		}
	}
	return notMnt, err
}
```
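util.CreateMountPoint is not shown here; a plausible sketch (an assumption about the helper in ceph-csi/internal/util, with the 0750 mode borrowed from the block-file branch above) is simply:

```go
// CreateMountPoint ensures the mount directory exists
// (a sketch of the util helper, not the verified source).
func CreateMountPoint(mountPath string) error {
	return os.MkdirAll(mountPath, 0750)
}
```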
3.mountVolume
mountVolume assembles the mount options and mounts the staging path to the target path.
```go
//ceph-csi/internal/rbd/nodeserver.go
func (ns *NodeServer) mountVolume(ctx context.Context, stagingPath string, req *csi.NodePublishVolumeRequest) error {
	// Publish Path
	fsType := req.GetVolumeCapability().GetMount().GetFsType()
	readOnly := req.GetReadonly()
	mountOptions := []string{"bind", "_netdev"}
	isBlock := req.GetVolumeCapability().GetBlock() != nil
	targetPath := req.GetTargetPath()

	mountOptions = csicommon.ConstructMountOptions(mountOptions, req.GetVolumeCapability())

	util.DebugLog(ctx, "target %v\nisBlock %v\nfstype %v\nstagingPath %v\nreadonly %v\nmountflags %v\n",
		targetPath, isBlock, fsType, stagingPath, readOnly, mountOptions)

	if readOnly {
		mountOptions = append(mountOptions, "ro")
	}
	if err := util.Mount(stagingPath, targetPath, fsType, mountOptions); err != nil {
		return status.Error(codes.Internal, err.Error())
	}

	return nil
}
```
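The effective operation is a bind mount of the staging path onto the target path, roughly equivalent to `mount -o bind,_netdev,discard <stagingPath> <targetPath>` (the discard flag here comes from the volume capability's mount_flags), which matches the mountflags [bind _netdev discard] visible in the log below.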
Example of ceph-csi component log
Operation: NodePublishVolume
Source: daemonset csi-rbdplugin, container csi-rbdplugin
```
I0828 06:25:08.172901 3316053 utils.go:159] ID: 12010 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 GRPC call: /csi.v1.Node/NodePublishVolume
I0828 06:25:08.176683 3316053 utils.go:160] ID: 12010 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 GRPC request: {"staging_target_path":"/home/cld/kubernetes/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893/globalmount","target_path":"/home/cld/kubernetes/lib/kubelet/pods/c14de522-0679-44b6-af8b-e1ba08b5b004/volumes/kubernetes.io~csi/pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893/mount","volume_capability":{"AccessType":{"Mount":{"fs_type":"ext4","mount_flags":["discard"]}},"access_mode":{"mode":1}},"volume_context":{"clusterID":"0bba3be9-0a1c-41db-a619-26ffea20161e","imageFeatures":"layering","imageName":"csi-vol-1699e662-e83f-11ea-8e79-246e96907f74","journalPool":"kubernetes","pool":"kubernetes","storage.kubernetes.io/csiProvisionerIdentity":"1598236777786-8081-rbd.csi.ceph.com"},"volume_id":"0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74"}
I0828 06:25:08.177363 3316053 nodeserver.go:518] ID: 12010 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 target /home/cld/kubernetes/lib/kubelet/pods/c14de522-0679-44b6-af8b-e1ba08b5b004/volumes/kubernetes.io~csi/pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893/mount isBlock false fstype ext4 stagingPath /home/cld/kubernetes/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893/globalmount/0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 readonly false mountflags [bind _netdev discard]
I0828 06:25:08.191877 3316053 nodeserver.go:426] ID: 12010 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 rbd: successfully mounted stagingPath /home/cld/kubernetes/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893/globalmount/0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 to targetPath /home/cld/kubernetes/lib/kubelet/pods/c14de522-0679-44b6-af8b-e1ba08b5b004/volumes/kubernetes.io~csi/pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893/mount
I0828 06:25:08.192653 3316053 utils.go:165] ID: 12010 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 GRPC response: {}
```
From the log we can see the parameter values used for the mount:
```
target /home/cld/kubernetes/lib/kubelet/pods/c14de522-0679-44b6-af8b-e1ba08b5b004/volumes/kubernetes.io~csi/pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893/mount
isBlock false
fstype ext4
stagingPath /home/cld/kubernetes/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893/globalmount/0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74
readonly false
mountflags [bind _netdev discard]
```
(6)NodeUnpublishVolume
brief introduction
Remove the mount of the stagingPath onto the targetPath.
NodeUnpublishVolume unmounts the volume from the target path.
NodeUnpublishVolume
Main process:
(1) Check request parameters;
(2) Judge whether the specified path is a mount point;
(3) Remove the mounting from stagingPath to targetPath;
(4) Delete the targetPath directory and any subdirectories it contains.
```go
//ceph-csi/internal/rbd/nodeserver.go
func (ns *NodeServer) NodeUnpublishVolume(ctx context.Context, req *csi.NodeUnpublishVolumeRequest) (*csi.NodeUnpublishVolumeResponse, error) {
	// (1) Check request parameters
	err := util.ValidateNodeUnpublishVolumeRequest(req)
	if err != nil {
		return nil, err
	}

	targetPath := req.GetTargetPath()
	volID := req.GetVolumeId()

	if acquired := ns.VolumeLocks.TryAcquire(volID); !acquired {
		klog.Errorf(util.Log(ctx, util.VolumeOperationAlreadyExistsFmt), volID)
		return nil, status.Errorf(codes.Aborted, util.VolumeOperationAlreadyExistsFmt, volID)
	}
	defer ns.VolumeLocks.Release(volID)

	// (2) Judge whether the specified path is a mount point
	notMnt, err := mount.IsNotMountPoint(ns.mounter, targetPath)
	if err != nil {
		if os.IsNotExist(err) {
			// targetPath has already been deleted
			util.DebugLog(ctx, "targetPath: %s has already been deleted", targetPath)
			return &csi.NodeUnpublishVolumeResponse{}, nil
		}
		return nil, status.Error(codes.NotFound, err.Error())
	}
	if notMnt {
		if err = os.RemoveAll(targetPath); err != nil {
			return nil, status.Error(codes.Internal, err.Error())
		}
		return &csi.NodeUnpublishVolumeResponse{}, nil
	}

	// (3) unmount targetPath
	if err = ns.mounter.Unmount(targetPath); err != nil {
		return nil, status.Error(codes.Internal, err.Error())
	}

	// (4) Delete the targetPath directory and any children it contains
	if err = os.RemoveAll(targetPath); err != nil {
		return nil, status.Error(codes.Internal, err.Error())
	}

	util.DebugLog(ctx, "rbd: successfully unbound volume %s from %s", req.GetVolumeId(), targetPath)

	return &csi.NodeUnpublishVolumeResponse{}, nil
}
```
RemoveAll
Delete the targetPath directory and any subdirectories it contains.
```go
//GO/src/os/path.go
// RemoveAll removes path and any children it contains.
// It removes everything it can but returns the first error
// it encounters. If the path does not exist, RemoveAll
// returns nil (no error).
// If there is an error, it will be of type *PathError.
func RemoveAll(path string) error {
	return removeAll(path)
}
```
Example of ceph-csi component log
Operation: NodeUnpublishVolume
Source: daemonset csi-rbdplugin, container csi-rbdplugin
```
I0828 07:14:25.117004 3316053 utils.go:159] ID: 12123 GRPC call: /csi.v1.Node/NodeGetVolumeStats
I0828 07:14:25.117825 3316053 utils.go:160] ID: 12123 GRPC request: {"volume_id":"0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74","volume_path":"/home/cld/kubernetes/lib/kubelet/pods/c14de522-0679-44b6-af8b-e1ba08b5b004/volumes/kubernetes.io~csi/pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893/mount"}
I0828 07:14:25.128161 3316053 utils.go:165] ID: 12123 GRPC response: {"usage":[{"available":1003900928,"total":1023303680,"unit":1,"used":2625536},{"available":65525,"total":65536,"unit":2,"used":11}]}
I0828 07:14:40.863935 3316053 utils.go:159] ID: 12124 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 GRPC call: /csi.v1.Node/NodeUnpublishVolume
I0828 07:14:40.864889 3316053 utils.go:160] ID: 12124 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 GRPC request: {"target_path":"/home/cld/kubernetes/lib/kubelet/pods/c14de522-0679-44b6-af8b-e1ba08b5b004/volumes/kubernetes.io~csi/pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893/mount","volume_id":"0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74"}
I0828 07:14:40.908930 3316053 nodeserver.go:601] ID: 12124 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 rbd: successfully unbound volume 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 from /home/cld/kubernetes/lib/kubelet/pods/c14de522-0679-44b6-af8b-e1ba08b5b004/volumes/kubernetes.io~csi/pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893/mount
I0828 07:14:40.909906 3316053 utils.go:165] ID: 12124 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 GRPC response: {}
```
(7)NodeUnstageVolume
brief introduction
First remove the mount between the stagingPath and the rbd/nbd device, then unmap the rbd/nbd device (that is, detach the rbd/nbd device on the node from the ceph rbd image).
NodeUnstageVolume unstages the volume from the staging path.
NodeUnstageVolume
Main process:
(1) Check request parameters;
(2) Judge whether stagingTargetPath exists;
(3) Unmount the rbd device from the stagingTargetPath;
(4) Delete the stagingTargetPath;
(5) Read the image metadata from the image-meta.json file;
(6) Unmap the rbd device;
(7) Delete the image metadata, i.e. the image-meta.json file.
```go
//ceph-csi/internal/rbd/nodeserver.go
func (ns *NodeServer) NodeUnstageVolume(ctx context.Context, req *csi.NodeUnstageVolumeRequest) (*csi.NodeUnstageVolumeResponse, error) {
	// (1) Check request parameters
	var err error
	if err = util.ValidateNodeUnstageVolumeRequest(req); err != nil {
		return nil, err
	}

	volID := req.GetVolumeId()

	if acquired := ns.VolumeLocks.TryAcquire(volID); !acquired {
		klog.Errorf(util.Log(ctx, util.VolumeOperationAlreadyExistsFmt), volID)
		return nil, status.Errorf(codes.Aborted, util.VolumeOperationAlreadyExistsFmt, volID)
	}
	defer ns.VolumeLocks.Release(volID)

	stagingParentPath := req.GetStagingTargetPath()
	stagingTargetPath := getStagingTargetPath(req)

	// (2) Judge whether stagingTargetPath exists
	notMnt, err := mount.IsNotMountPoint(ns.mounter, stagingTargetPath)
	if err != nil {
		if !os.IsNotExist(err) {
			return nil, status.Error(codes.NotFound, err.Error())
		}
		// Continue on ENOENT errors as we may still have the image mapped
		notMnt = true
	}
	if !notMnt {
		// (3) Unmount the rbd device from stagingTargetPath
		// Unmounting the image
		err = ns.mounter.Unmount(stagingTargetPath)
		if err != nil {
			util.ExtendedLog(ctx, "failed to unmount targetPath: %s with error: %v", stagingTargetPath, err)
			return nil, status.Error(codes.Internal, err.Error())
		}
	}

	// (4) Delete stagingTargetPath
	// Example: /home/cld/kubernetes/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893/globalmount/0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74
	if err = os.Remove(stagingTargetPath); err != nil {
		// Any error is critical as Staging path is expected to be empty by Kubernetes, it otherwise
		// keeps invoking Unstage. Hence any errors removing files within this path is a critical
		// error
		if !os.IsNotExist(err) {
			klog.Errorf(util.Log(ctx, "failed to remove staging target path (%s): (%v)"), stagingTargetPath, err)
			return nil, status.Error(codes.Internal, err.Error())
		}
	}

	// (5) Read the image metadata from the image-meta.json file
	imgInfo, err := lookupRBDImageMetadataStash(stagingParentPath)
	if err != nil {
		util.UsefulLog(ctx, "failed to find image metadata: %v", err)
		// It is an error if it was mounted, as we should have found the image metadata file with
		// no errors
		if !notMnt {
			return nil, status.Error(codes.Internal, err.Error())
		}

		// If not mounted, and error is anything other than metadata file missing, it is an error
		if !errors.Is(err, ErrMissingStash) {
			return nil, status.Error(codes.Internal, err.Error())
		}

		// It was not mounted and image metadata is also missing, we are done as the last step in
		// in the staging transaction is complete
		return &csi.NodeUnstageVolumeResponse{}, nil
	}

	// (6) unmap rbd device
	// Unmapping rbd device
	imageSpec := imgInfo.String()
	if err = detachRBDImageOrDeviceSpec(ctx, imageSpec, true, imgInfo.NbdAccess, imgInfo.Encrypted, req.GetVolumeId()); err != nil {
		klog.Errorf(util.Log(ctx, "error unmapping volume (%s) from staging path (%s): (%v)"), req.GetVolumeId(), stagingTargetPath, err)
		return nil, status.Error(codes.Internal, err.Error())
	}

	util.DebugLog(ctx, "successfully unmounted volume (%s) from staging path (%s)", req.GetVolumeId(), stagingTargetPath)

	// (7) Delete the image metadata, i.e. the image-meta.json file
	if err = cleanupRBDImageMetadataStash(stagingParentPath); err != nil {
		klog.Errorf(util.Log(ctx, "failed to cleanup image metadata stash (%v)"), err)
		return nil, status.Error(codes.Internal, err.Error())
	}

	return &csi.NodeUnstageVolumeResponse{}, nil
}
```
```
root@cld-dnode3-1091:/home/zhongjialiang# ls /home/cld/kubernetes/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893/globalmount/
image-meta.json
0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74/
```
1.lookupRBDImageMetadataStash
Reads the image metadata from the image-meta.json file.
```go
//ceph-csi/internal/rbd/rbd_util.go
// file name in which image metadata is stashed.
const stashFileName = "image-meta.json"

func lookupRBDImageMetadataStash(path string) (rbdImageMetadataStash, error) {
	var imgMeta rbdImageMetadataStash

	fPath := filepath.Join(path, stashFileName)
	encodedBytes, err := ioutil.ReadFile(fPath) // #nosec - intended reading from fPath
	if err != nil {
		if !os.IsNotExist(err) {
			return imgMeta, fmt.Errorf("failed to read stashed JSON image metadata from path (%s): (%v)", fPath, err)
		}
		return imgMeta, util.JoinErrors(ErrMissingStash, err)
	}

	err = json.Unmarshal(encodedBytes, &imgMeta)
	if err != nil {
		return imgMeta, fmt.Errorf("failed to unmarshall stashed JSON image metadata from path (%s): (%v)", fPath, err)
	}

	return imgMeta, nil
}
```
```
root@cld-dnode3-1091:/home/zhongjialiang# cat /home/cld/kubernetes/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893/globalmount/image-meta.json
{"Version":2,"pool":"kubernetes","image":"csi-vol-1699e662-e83f-11ea-8e79-246e96907f74","accessType":false,"encrypted":false}
```
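NodeUnstageVolume rebuilds the <pool>/<image> spec from this stash via imgInfo.String(); a sketch of that method, consistent with the unmap log below (where the spec appears as kubernetes/csi-vol-...), is:

```go
// String rebuilds the <pool>/<image> spec passed to rbd unmap
// (a sketch; the real method lives in ceph-csi/internal/rbd/rbd_util.go).
func (ri *rbdImageMetadataStash) String() string {
	return fmt.Sprintf("%s/%s", ri.Pool, ri.ImageName)
}
```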
2.detachRBDImageOrDeviceSpec
Assembles the unmap command and executes it to unmap the rbd/nbd device.
```go
//ceph-csi/internal/rbd/rbd_attach.go
func detachRBDImageOrDeviceSpec(ctx context.Context, imageOrDeviceSpec string, isImageSpec, ndbType, encrypted bool, volumeID string) error {
	if encrypted {
		mapperFile, mapperPath := util.VolumeMapper(volumeID)
		mappedDevice, mapper, err := util.DeviceEncryptionStatus(ctx, mapperPath)
		if err != nil {
			klog.Errorf(util.Log(ctx, "error determining LUKS device on %s, %s: %s"),
				mapperPath, imageOrDeviceSpec, err)
			return err
		}
		if len(mapper) > 0 {
			// mapper found, so it is open Luks device
			err = util.CloseEncryptedVolume(ctx, mapperFile)
			if err != nil {
				klog.Errorf(util.Log(ctx, "error closing LUKS device on %s, %s: %s"),
					mapperPath, imageOrDeviceSpec, err)
				return err
			}
			imageOrDeviceSpec = mappedDevice
		}
	}

	accessType := accessTypeKRbd
	if ndbType {
		accessType = accessTypeNbd
	}
	options := []string{"unmap", "--device-type", accessType, imageOrDeviceSpec}

	_, stderr, err := util.ExecCommand(ctx, rbd, options...)
	if err != nil {
		// Messages for krbd and nbd differ, hence checking either of them for missing mapping
		// This is not applicable when a device path is passed in
		if isImageSpec &&
			(strings.Contains(stderr, fmt.Sprintf(rbdUnmapCmdkRbdMissingMap, imageOrDeviceSpec)) ||
				strings.Contains(stderr, fmt.Sprintf(rbdUnmapCmdNbdMissingMap, imageOrDeviceSpec))) {
			// Devices found not to be mapped are treated as a successful detach
			util.TraceLog(ctx, "image or device spec (%s) not mapped", imageOrDeviceSpec)
			return nil
		}
		return fmt.Errorf("rbd: unmap for spec (%s) failed (%v): (%s)", imageOrDeviceSpec, err, stderr)
	}

	return nil
}
```
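For a krbd mapping the executed command has the form `rbd unmap --device-type krbd <pool>/<image>`, which matches the `rbd [unmap --device-type krbd kubernetes/csi-vol-...]` entry in the component log below.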
3.cleanupRBDImageMetadataStash
Deletes the image metadata, i.e. the image-meta.json file.
```go
//ceph-csi/internal/rbd/rbd_util.go
func cleanupRBDImageMetadataStash(path string) error {
	fPath := filepath.Join(path, stashFileName)
	if err := os.Remove(fPath); err != nil {
		return fmt.Errorf("failed to cleanup stashed JSON data (%s): (%v)", fPath, err)
	}

	return nil
}
```
Example of ceph-csi component log
Operation: NodeUnstageVolume
Source: daemonset csi-rbdplugin, container csi-rbdplugin
```
I0828 07:14:40.972279 3316053 utils.go:159] ID: 12126 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 GRPC call: /csi.v1.Node/NodeUnstageVolume
I0828 07:14:40.973139 3316053 utils.go:160] ID: 12126 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 GRPC request: {"staging_target_path":"/home/cld/kubernetes/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893/globalmount","volume_id":"0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74"}
I0828 07:14:41.186119 3316053 cephcmds.go:60] ID: 12126 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 command succeeded: rbd [unmap --device-type krbd kubernetes/csi-vol-1699e662-e83f-11ea-8e79-246e96907f74]
I0828 07:14:41.186171 3316053 nodeserver.go:690] ID: 12126 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 successfully unmounted volume (0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74) from staging path (/home/cld/kubernetes/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-14ee5002-9d60-4ba3-a1d2-cc3800ee0893/globalmount/0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74)
I0828 07:14:41.187119 3316053 utils.go:165] ID: 12126 Req-ID: 0001-0024-0bba3be9-0a1c-41db-a619-26ffea20161e-0000000000000004-1699e662-e83f-11ea-8e79-246e96907f74 GRPC response: {}
```
This completes the analysis of the rbd driver nodeserver. Here is a summary.
RBD driver nodeserver analysis summary
(1) nodeserver mainly includes the NodeGetCapabilities, NodeGetVolumeStats, NodeStageVolume, NodePublishVolume, NodeUnpublishVolume, NodeUnstageVolume and NodeExpandVolume methods. Their functions are as follows:
NodeGetCapabilities: obtain the capability of CEPH CSI driver.
NodeGetVolumeStats: probe the state of the mounted storage and return the relevant metrics of the storage to kubelet.
NodeExpandVolume: perform corresponding operations on the node to synchronize the stored capacity expansion information to the node.
NodeStageVolume: Map RBD image to rbd/nbd device on node, and mount it to staging path after formatting.
NodePublishVolume: mount the staging path (prepared in the NodeStageVolume method) to the target path.
NodeUnpublishVolume: remove the mount of the stagingPath onto the targetPath.
NodeUnstageVolume: first remove the mount between the stagingPath and the rbd/nbd device, then unmap the rbd/nbd device (that is, detach the rbd/nbd device on the node from the ceph rbd image).
(2) Before kubelet calls NodeExpandVolume, NodeStageVolume, NodeUnstageVolume and other methods, it first calls NodeGetCapabilities to obtain the capabilities of the ceph-csi driver and check whether these methods are supported.
(3) kubelet periodically calls NodeGetVolumeStats to obtain volume-related metrics.
(4) Storage expansion is divided into two steps. The first is CSI's ControllerExpandVolume, which expands the underlying storage; the second is CSI's NodeExpandVolume. When the volumeMode is Filesystem, NodeExpandVolume propagates the expansion of the underlying rbd image to the rbd/nbd device and grows the xfs/ext filesystem; when the volumeMode is Block, no node-side expansion is required.
(5) When an rbd image is attached to multiple pods on a node, NodeStageVolume is called only once while NodePublishVolume is called once per pod, so there is one staging path and multiple target paths; the staging path corresponds to the rbd image, and each target path corresponds to a pod. The same holds for unmounting: NodeUnstageVolume is called only after all pods using the rbd image on that node have been deleted.