
Conversation

@alexnovak
Contributor

Terminates the FUSE mount on server termination. Without this, pods can fail to get cleaned up by the kubelet due to the remaining FUSE mount.
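
For context, a minimal sketch of the shape of the change (assuming go-fuse v2's `fuse.Server`; the function name and the wiring around it are illustrative, not the actual bb_worker code):

```go
package sketch

import (
	"context"
	"log"

	"github.com/hanwen/go-fuse/v2/fuse"
)

// unmountOnShutdown detaches the FUSE mount once ctx is cancelled, so that
// the kernel mount does not outlive the worker process. How ctx and server
// are obtained is left out; this is only a sketch of the idea.
func unmountOnShutdown(ctx context.Context, server *fuse.Server) {
	go func() {
		<-ctx.Done()
		// Unmount() asks the kernel to detach the mount, which also causes
		// the Serve() loop to return.
		if err := server.Unmount(); err != nil {
			log.Printf("Failed to unmount FUSE file system: %v", err)
		}
	}()
}
```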

@EdSchouten
Member

This issue has already been discussed extensively in multiple places (e.g., buildbarn/bb-clientd#2, Slack threads). Let me get straight to the point: I am not planning on merging a change like this. The reasons are as follows:

  • In general, leaving the stale FUSE mount behind is in fact desirable, as it effectively causes the mount path to be marked as dead. If an unmount were to take place and the underlying directory happened to be writable, any lingering build actions would unintentionally create new output files in the directory underneath the mount. These files would subsequently become hidden when bb_worker is restarted and a new mount is created on top, which might make a systems administrator wonder where disk space has gone.

  • The change you've provided is incomplete, as there is no absolute guarantee that this code path gets run. If Kubernetes SIGKILLs bb_worker and then destroys the pod, you still end up locking up Kubelet. Coming up with a 'best effort' workaround for something for which a reliable fix can be designed is, in my opinion, in poor taste.

The effort spent writing this fix could also have been spent root-causing the actual source of the lockups in Kubelet. That would also make you a rockstar, as it would fix many other use cases people have for FUSE on Kubernetes. See kubernetes/kubernetes#7890 for more details.

@EdSchouten EdSchouten closed this Jan 8, 2025
@EdSchouten
Member

EdSchouten commented Jan 8, 2025

Sidenote: the only reason we do perform unmounts for NFSv4 mounts is that stale NFS mounts are far worse than the ones left behind by FUSE. Because NFS is a network based protocol, the kernel will end up performing timed retries for a prolonged period of time, causing processes in user space to hang. This differs from FUSE, where any attempt to access a stale mount causes system calls to fail immediately with ENXIO.

I wouldn't be amazed if these lockups in Kubelet are merely caused by some retry loop in Kubelet itself that fails to account for stale FUSE mounts being present. All that likely needs to be done is to inject a umount() system call in between those retries to guarantee forward progress.
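
To illustrate the kind of change I have in mind (a sketch only, not actual Kubelet code; the path handling and retry policy are made up):

```go
package sketch

import (
	"log"
	"os"
	"time"

	"golang.org/x/sys/unix"
)

// cleanupStaleMount retries removing a directory and, between attempts,
// lazily detaches any stale FUSE mount that may be blocking the removal.
// This is a sketch of the "umount() between retries" idea, not Kubelet code.
func cleanupStaleMount(path string) error {
	var err error
	for attempt := 0; attempt < 5; attempt++ {
		if err = os.Remove(path); err == nil {
			return nil
		}
		// A stale FUSE mount makes the directory undeletable. Detach it
		// lazily so that the next attempt can make forward progress.
		if unmountErr := unix.Unmount(path, unix.MNT_DETACH); unmountErr != nil {
			log.Printf("Failed to unmount %s: %v", path, unmountErr)
		}
		time.Sleep(time.Second)
	}
	return err
}
```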

@alexnovak
Contributor Author

Thanks for your response! Sorry if I touched a nerve. I made the tweak pretty quickly after hitting the issue while testing FUSE in my cluster and noticing the TODO note. I appreciate the added context here. Would it be more helpful if I updated the TODO to mark this as deliberate?

@EdSchouten
Member

Sure thing! Go for it!

@EdSchouten EdSchouten reopened this Jan 8, 2025
@EdSchouten
Member

Giving it some more thought: I think the TODO in the code is still valid. It is about shutting down the FUSE server goroutine gracefully, which is distinct from unmounting, so the TODO should remain intact.

Documenting explicitly why we don't unmount seems fine though.

@alexnovak
Contributor Author

I've submitted kubernetes/kubernetes#129550 after combing through some of the Kubernetes source and (I believe) identifying the offending lines.

To make sure I understand the intent of the original TODO: it doesn't look like Serve offers any particularly graceful way of handling a context delivered to the server. In its core loop, it looks like the only ways to convince it to stop are an unmount or an unhandled error:
https://github.com/hanwen/go-fuse/blob/v2.5.1/fuse/server.go#L474-L503

The exit towards the top of the function seems to be a carve-out for limiting concurrency, and doesn't appear to be reachable through any public function. I'm guessing that in your ideal world we'd have another way to trigger exit behavior here?
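
In other words, the only shutdown lever available today appears to be the unmount itself. Roughly this pattern is what I see (a sketch assuming go-fuse v2's `fuse.Server`; the wiring is illustrative, not bb_worker's actual code): the serve loop only returns once the mount goes away.

```go
package sketch

import (
	"log"

	"github.com/hanwen/go-fuse/v2/fuse"
)

// runUntilUnmount runs the FUSE serve loop and blocks until it exits. In
// practice the loop only returns once the file system is unmounted (via
// server.Unmount() or externally) or hits a fatal error; there is no
// context- or channel-based way to stop it. Sketch only, not bb_worker code.
func runUntilUnmount(server *fuse.Server) {
	go server.Serve()
	// Wait() blocks until the serve loop has terminated.
	server.Wait()
	log.Print("FUSE serve loop has terminated")
}
```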
