Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cluster_yaml is not cleaned up #5011

Open
cg505 opened this issue Mar 21, 2025 · 0 comments
Open

cluster_yaml is not cleaned up #5011

cg505 opened this issue Mar 21, 2025 · 0 comments
Labels

Comments

@cg505
Copy link
Collaborator

cg505 commented Mar 21, 2025

During review for #4980, it was noted that CloudVmRayBackend.remove_cluster_config has a bug that prevents it from actually deleting the config:

handle.cluster_yaml = None
global_user_state.update_cluster_handle(handle.cluster_name, handle)
common_utils.remove_file_if_exists(handle.cluster_yaml) # but cluster_yaml was already set to None...

The fix is obvious, but we should be careful since it could introduce a leak if other code was unintentionally relying on this.

While looking at this, I noticed in post_teardown_cleanup that we call remove_cluster_config in the terminate case

self.remove_cluster_config(handle)
and then later there is an if statement that depends on handle.cluster_yaml...
if handle.cluster_yaml is not None:
_detect_abnormal_non_terminated_nodes(handle)
So we will never hit this.

We need to look carefully at this function and actually specify the expected invariant for when handle.cluster_yaml will/won't be set and when the file will/won't exist.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants