Wait for network config before activating libvirt socket #75

Open
wants to merge 1 commit into base: master
DavidFair

Wait for the network configuration to complete before trying to activate the libvirt socket. This prevents systemd from binding to the specified IP and the interface then coming up or being reconfigured underneath it.

Because the network configuration (at least with NetworkManager) races with the socket activation, fewer than 50% of rebooting machines are affected. Once the race is lost, the socket is still "up" from systemd's point of view, but anything (including telnet) trying to open the port finds it closed until an administrator restarts the socket unit.
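
The symptom described above (systemd reports the socket as active while connections to the port fail) can be checked from another machine with a plain TCP probe, which is roughly what the telnet test does. The following is a minimal sketch, not part of this PR: the helper name `port_is_open` is hypothetical, and the port to probe depends on how the socket unit is configured (16509 is libvirt's default plain TCP port, assumed here only for illustration).

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established.

    Hypothetical helper for illustration; it only checks that something
    accepts the connection, not that libvirt answers correctly.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

For example, `port_is_open("hv.example.com", 16509)` returning `False` while the socket unit shows as active on the hypervisor would match the failure mode described here. The host name is taken from the logs in this thread; the port is an assumption.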

@DavidFair DavidFair requested a review from a team as a code owner December 20, 2024 14:51
@DavidFair
Author

Some more logs for context. I suspect D-Bus restarting NM takes the interface down and back up, which resets the open socket without systemd noticing or rebinding:

Dec 20 08:57:10 hv.example.com NetworkManager[3565]: <info>  [1734685030.6777] manager: (p-br0-o): new Veth device (/org/freedesktop/NetworkManager/Devices/9)
Dec 20 08:57:10 hv.example.com NetworkManager[3565]: <info>  [1734685030.6782] device (p-br0-o): state change: unmanaged -> unavailable (reason 'managed', sys-iface-state: 'external')
Dec 20 08:57:10 hv.example.com NetworkManager[3565]: <info>  [1734685030.6797] bus-manager: acquired D-Bus service "org.freedesktop.NetworkManager"
Dec 20 08:57:10 hv.example.com NetworkManager[3565]: <info>  [1734685030.6798] caught SIGTERM, shutting down normally.
Dec 20 08:57:10 hv.example.com NetworkManager[3565]: <info>  [1734685030.6879] exiting (success)
Dec 20 08:57:10 hv.example.com systemd[1]: NetworkManager.service: Deactivated successfully.
Dec 20 08:57:10 hv.example.com systemd[1]: Stopped Network Manager.
Dec 20 08:57:10 hv.example.com systemd[1]: Starting Network Manager...
Dec 20 08:57:10 hv.example.com NetworkManager[4343]: <info>  [1734685030.7462] NetworkManager (version 1.48.10-2.el9_5) is starting... (after a restart, boot:67fa3089-ef37-42ab-a798-c4fa>
Dec 20 08:57:10 hv.example.com NetworkManager[4343]: <info>  [1734685030.7464] Read config: /etc/NetworkManager/NetworkManager.conf 
...
Dec 20 08:57:10 hv.example.com NetworkManager[4343]: <info>  [1734685030.7753] manager: (br0): new Bridge device (/org/freedesktop/NetworkManager/Devices/6)
Dec 20 08:57:10 hv.example.com NetworkManager[4343]: <info>  [1734685030.7760] manager: (br0.1602): new VLAN device (/org/freedesktop/NetworkManager/Devices/7)
...
Dec 20 08:57:10 hv.example.com NetworkManager[4343]: <info>  [1734685030.8790] device (br0): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed')
Dec 20 08:57:10 hv.example.com NetworkManager[4343]: <info>  [1734685030.8792] device (br0): Activation: successful, device activated.
