Implement ListVolumes #464
Comments
cc @jsafrane |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
I am thinking about this case: an EBS volume has been attached to an AWS instance via @msau42 Do you think the latest csi-attacher can help resolve the above problem? |
The latest csi-attacher does not help with that case. It only helps when the disk was actually detached out of band and needs to be reattached to the same node. For your case, assuming the instance goes down and the Kubernetes Node is deleted as a result, the volume will be force detached from the node after 5 minutes, and then the new Pod will be able to start up. Some discussion on whether we can recover better: kubernetes-csi/external-attacher#215
Thanks @msau42 for your reply!
Could you please let me know who is responsible for forcibly detaching the volume after 5 mins? Is it kube-controller-manager to create a |
It's the attach detach controller in kube-controller-manager. This is where it waits for 5 minutes if it thinks the volume may still be mounted on the node: https://github.com/kubernetes/kubernetes/blob/1faf097f3f7294322a574d2c813d21657ab61a81/pkg/controller/volume/attachdetach/reconciler/reconciler.go#L173 After the timeout, we proceed to detach. However, there's an interesting case here where we skip calling Detach if we failed to update the Node status (maybe because the Node is gone): I need to verify whether Detach is properly called in this case or whether we end up leaking the VolumeAttachment.
@msau42 I think the timeout you are talking about is
So the reconciler will wait for |
/remove-lifecycle stale |
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
@fejta-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Is your feature request related to a problem?/Why is this needed
When a Node is deleted and its Pod is rescheduled, Kubernetes thinks that the volume is still attached to the original node and fails to attach it to the new node with a Multi-Attach error.
/feature
Describe the solution you'd like in detail
The latest csi-attacher v2.1 adds support for the ListVolumes capability: https://github.com/kubernetes-csi/external-attacher/blob/v2.1.1/CHANGELOG-2.1.md
It addresses problems when volumes may get detached out of band, and need to be reattached to a node. It could potentially help the other way around too, but that hasn't been tested.
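As a rough illustration of the idea, the sketch below compares what Kubernetes believes is attached with what a driver reports through ListVolumes (in the CSI spec, each entry can carry the node IDs the volume is published to). The `attachment` type and `diffAttachments` function are hypothetical names for this example and are not part of csi-attacher or this driver.

```go
package main

import "sort"

// attachment is a hypothetical (volume, node) pair.
type attachment struct {
	VolumeID string
	NodeID   string
}

// diffAttachments compares the attachments Kubernetes believes exist with
// the published node IDs reported per volume by the driver.
// missing:    believed attached, but not reported by the driver
//             (detached out of band, needs to be reattached).
// unexpected: reported by the driver, but not believed by Kubernetes
//             (the "other way around" case, untested per the text above).
func diffAttachments(believed []attachment, published map[string][]string) (missing, unexpected []attachment) {
	want := make(map[attachment]bool, len(believed))
	for _, a := range believed {
		want[a] = true
	}
	have := make(map[attachment]bool)
	for vol, nodes := range published {
		for _, n := range nodes {
			a := attachment{VolumeID: vol, NodeID: n}
			if have[a] {
				continue // ignore duplicate node IDs in the report
			}
			have[a] = true
			if !want[a] {
				unexpected = append(unexpected, a)
			}
		}
	}
	for _, a := range believed {
		if !have[a] {
			missing = append(missing, a)
		}
	}
	sortAttachments(missing)
	sortAttachments(unexpected)
	return missing, unexpected
}

// sortAttachments orders results by volume, then node, so output is stable.
func sortAttachments(as []attachment) {
	sort.Slice(as, func(i, j int) bool {
		if as[i].VolumeID != as[j].VolumeID {
			return as[i].VolumeID < as[j].VolumeID
		}
		return as[i].NodeID < as[j].NodeID
	})
}
```

In this model, the Multi-Attach scenario from the problem statement would show up as a `missing` entry for the deleted node once the driver stops reporting the volume as published there.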
Describe alternatives you've considered
n/a
Additional context
n/a