
RemoteMachine Lifecycle #434

Open
infinitydon opened this issue Jan 30, 2024 · 2 comments
Comments

@infinitydon commented Jan 30, 2024

Hi,

I would like to get your thoughts on the following points about RemoteMachine (not sure if some of these are already on the roadmap).

  • When a RemoteMachine is deleted at the infrastructure level, the status (shown by kubectl describe) does not reflect this. Should there be some health check to detect whether a RemoteMachine is down or unreachable, or whether the k0s service is still running and healthy?
  • Regarding OS upgrades (not k0s upgrades) for RemoteMachines: are there any plans to cater for this scenario? Some OS upgrades involve a total replacement of the VM or reformatting the bare-metal server. For now, a workaround that comes to mind is to manage the RemoteMachine manifest with a GitOps controller: when the VM/bare-metal host comes back up, delete the RemoteMachine resource in Kubernetes; the GitOps controller then re-creates it, re-initiating the k0s installation.
  • Remote user: will it be possible (or is it currently possible) to use a normal user plus sudo instead of root for the k0s installation?

Thanks

@jnummelin (Member) commented

> When a RemoteMachine is deleted at the infrastructure level, the status (shown by kubectl describe) does not reflect this.

That's kind of expected; k0smotron has no way to know when that happens.

> Should there be some health check to detect whether a RemoteMachine is down or unreachable, or whether the k0s service is still running and healthy?

I believe you should be able to do this with a MachineHealthCheck object. If I read the docs and the flowchart correctly, CAPI will remediate (delete) the Machine if its Node has been unreachable in the child cluster for long enough.
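
For reference, a minimal MachineHealthCheck sketch is below. The names and the 300s timeouts are placeholders, not values from this thread; the selector matches Machines via the standard CAPI cluster-name label:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: remote-machines-mhc   # hypothetical name
  namespace: default
spec:
  clusterName: my-cluster     # hypothetical cluster name
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: my-cluster
  unhealthyConditions:
    # Remediate when the child cluster reports the Node unready or unreachable
    - type: Ready
      status: Unknown
      timeout: 300s
    - type: Ready
      status: "False"
      timeout: 300s
```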

> Regarding OS upgrades (not k0s upgrades) for RemoteMachines: are there any plans to cater for this scenario? Some OS upgrades involve a total replacement of the VM or reformatting the bare-metal server.

No plans for OS upgrades. OS management is a bit beyond the scope of the k0smotron RemoteMachine controller.

> For now, a workaround that comes to mind is to manage the RemoteMachine manifest with a GitOps controller: when the VM/bare-metal host comes back up, delete the RemoteMachine resource in Kubernetes; the GitOps controller then re-creates it, re-initiating the k0s installation.

Sounds pretty good to me.
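
As a sketch of that flow, assuming Flux as the GitOps controller (any reconciler that re-applies manifests from git would do): keep the RemoteMachine manifest in git, and after the host is rebuilt, delete the RemoteMachine in-cluster; on the next sync the controller re-creates it, which re-runs the k0s installation. A hypothetical Flux Kustomization:

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: remote-machines       # hypothetical name
  namespace: flux-system
spec:
  interval: 1m                # sync cadence; re-creates the deleted resource
  prune: false                # we delete the RemoteMachine manually, not Flux
  sourceRef:
    kind: GitRepository
    name: infra-repo          # hypothetical repo holding the manifests
  path: ./remote-machines
```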

> Remote user: will it be possible (or is it currently possible) to use a normal user plus sudo instead of root for the k0s installation?

Hmm, that is an interesting point. I think the challenge is that the bootstrap controllers, the ones that create the cloud-init, produce plain cloud-init. And in general, cloud-init runs as root, so the commands in the generated cloud-init are NOT run using sudo.

Maybe it would be possible for the RM controller to prepend sudo to each command it runs. The RemoteMachine object could have something like a useSudo: true option to trigger this. WDYT?
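
Purely as a sketch of that proposal (useSudo does not exist at this point; the other fields follow the documented RemoteMachine spec, with placeholder values):

```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: RemoteMachine
metadata:
  name: remote-test-0         # hypothetical name
  namespace: default
spec:
  address: 10.0.0.12          # hypothetical host address
  port: 22
  user: ubuntu                # non-root user with sudo rights
  useSudo: true               # proposed flag: prepend sudo to each command
  sshKeyRef:
    name: remote-machine-ssh-key
```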

@infinitydon (Author) commented

Thanks @jnummelin for the response. Regarding the remote user, that is exactly what I have in mind as well: the ability to set a sudo flag.
