-
Notifications
You must be signed in to change notification settings - Fork 82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
devices/console: Fix a bug which would cause libkrun to hang on exit #261
Conversation
fn wait_until_readable(&self, _stopfd: Option<&EventFd>) { | ||
let mut poll_fds = [PollFd::new(self.sigint_evt.as_raw_fd(), PollFlags::POLLIN)]; | ||
fn wait_until_readable(&self, stopfd: Option<&EventFd>) { | ||
let mut poll_fds = Vec::new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let mut poll_fds = Vec::new(); | |
let mut poll_fds = Vec::with_capacity(2); |
maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
(Side note/rant: The performance doesn't really matter here, but it's a shame Rust has no easy way to just stack allocate a fixed capacity vector using only the standard library, which would be the best choice in situations like this 😞 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it need to be a vector? If the size is fixed, is [PollFd; 2]
stack allocated? I believe so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it need to be a vector? If the size is fixed, is [PollFd; 2] stack allocated? I believe so.
Well, the size is not fixed, only the capacity is. I could use [PollFd; 2]
but then if you want to poll only 1 descriptor you have to initialize the other one to some invalid PollFd
and conditionally pass a slice of either 1 or 2 values into the poll(). Or you could just have 2 different calls to the poll
function in an if-else.
There is also another place where we do the same - impl PortInput for PortInputFd
:
libkrun/src/devices/src/virtio/console/port_io.rs
Lines 97 to 104 in 57a5a6b
fn wait_until_readable(&self, stopfd: Option<&EventFd>) { | |
let mut poll_fds = Vec::new(); | |
poll_fds.push(PollFd::new(self.as_raw_fd(), PollFlags::POLLIN)); | |
if let Some(stopfd) = stopfd { | |
poll_fds.push(PollFd::new(stopfd.as_raw_fd(), PollFlags::POLLIN)); | |
} | |
poll(&mut poll_fds, -1).expect("Failed to poll"); | |
} |
I looked into this more, and it seems the argument in the trait stopfd: Option<&EventFd>
, doesn't really have to be an Option and we can always pass a StopFd
. I'd rather change it in a separate refactoring/cleanup PR along with some other changes to the console device though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair, I'm mainly just commenting on the heap allocation of Vec
. Sure, there'd be some extra logic involved (perhaps you could also try [Option<PollFd>; 2]
but that also requires some logic in the handlers) but it may give you the stack allocation you're looking for.
Make PortInputEmpty and PortInputSigInt `wait_until_readable` implementations poll the stopfd EventFd to ensure the thread eventually exits. (The thread is joined in Port::shutdown method causing the hang). To replicate the hang, the stdin must not be a terminal. For example you can use a pipe - `echo hello | ./chroot_vm rootfs_fedora /bin/cat` would hang without this fix. This seems to be a regresion introduced by the commit: 4076b7: devices/console: implement reset method Signed-off-by: Matej Hrica <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed to fix the issue here. Thanks!
This fixes a hang observed when the VM is about to exit. The problem was the console device was joining a thread on vm shutdown (or device reset) which would never exit.
To replicate the hang, the stdin must not be a terminal (for example you can use a pipe).
Before this fix, this would hang:
echo hello | ./chroot_vm rootfs_fedora /bin/cat
(Btw libkrun used 100% (1 core) of CPU when it was stuck in this state - I don't know why exactly.)
This seems to be a regression introduced by the commit:
4076b7
: devices/console: implement reset method