-
Notifications
You must be signed in to change notification settings - Fork 182
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mark volume detach as success when Node vm is deleted from vcenter #1879
mark volume detach as success when Node vm is deleted from vcenter #1879
Conversation
5d48733
to
5571cf5
Compare
This seems to be a reasonable fix. |
/approve |
The code change looks good to me. Can you run the CI jobs? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: chethanv28, divyenpatel, xing-yang The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@@ -1247,8 +1247,14 @@ func (c *controller) ControllerUnpublishVolume(ctx context.Context, req *csi.Con | |||
node, err = c.nodeMgr.GetNodeByName(ctx, req.NodeId) | |||
} | |||
if err != nil { | |||
return nil, csifault.CSIInternalFault, logger.LogNewErrorCodef(log, codes.Internal, | |||
"failed to find VirtualMachine for node:%q. Error: %v", req.NodeId, err) | |||
if err == cnsvsphere.ErrVMNotFound { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we can infer that the disk is detached when we get ErrVmNotFound. A better way to be sure is to check if the volume is attached to any VM. FCD has an API for that but that API is currently not exposed by CNS. Can CNS expose this API? Once available, CSI can invoke that to be sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we can infer that the disk is detached when we get ErrVmNotFound.
@SandeepPissay so are you saying the searchIndex.FindByUuid
API call can not determine if VM is deleted from VC Inventory.
if VM is not found in the VC inventory then what is holding us from letting k8s know that this volume can be marked as detached from the requested node. Why do we need to check if FCD is attached to any other VM? Here we are making a change in the ControllerUnpublishVolume
call.
svm, err := searchIndex.FindByUuid(ctx, dc.Datacenter, uuid, true, &instanceUUID)
if err != nil {
log.Errorf("failed to find VM given uuid %s with err: %v", uuid, err)
return nil, err
} else if svm == nil {
log.Errorf("Couldn't find VM given uuid %s", uuid)
return nil, ErrVMNotFound
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also for Supervisor Pod VM, we have relied on this API.
Refer to https://github.com/kubernetes-sigs/vsphere-csi-driver/pull/1702/files
then why we can not use this API for Vanilla Node VM?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
searchIndex.FindByUuid
returns from the VC inventory and it could return an incorrect result when it is stale. IIRC the VC inventory can be stale in a few cases (VC restored to backup, host sync issues, etc). We should confirm this with the VC team. Instead of VM existence in VC inventory, its better to have a reliable way to know whether the disk is really attached to any VM or not. And this operation should actually check at the ESX host layer. If such an API does not exist today, we should file a request. As far as this PR goes, we can commit with the above caveats.
Started vanilla Block pipeline... Build Number: 1337 |
|
Started vanilla Block pipeline... Build Number: 1350 |
|
@SandeepPissay I have executed e2e pipeline - #1879 (comment) and also updated manual test result on the PR for deletion of Node VM and automatic removal of Volumeattachment without manual intervention. |
/lgtm |
Maybe we can learn from this: https://github.com/kubernetes-sigs/aws-ebs-csi-driver/blob/release-1.29/cmd/hooks/prestop.go |
What this PR does / why we need it:
Handle case to detach volume when Node VM is deleted from VC inventory.
Which issue this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close that issue when PR gets merged): fixes #Fixes the known issue present in the vSphere CSI Driver
Refer to the Issue
Persistent volume fails to be detached from a node
in the release note - https://docs.vmware.com/en/VMware-vSphere-Container-Storage-Plug-in/2.6/rn/vmware-vsphere-container-storage-plugin-26-release-notes/index.htmlTesting done:
Log
Special notes for your reviewer:
Release note: