@hanwen-cluster hanwen-cluster commented Sep 19, 2025

Description of changes

Ubuntu Server uses systemd-networkd by default. Installing ubuntu-desktop (as part of the DCV installation) also installs NetworkManager. NetworkManager is more complex (it includes WiFi capabilities) and takes control of the network interfaces, which confuses systemd-networkd. When systemd-networkd is confused, it delays system boot by 2 minutes.

This commit instructs netplan to use systemd-networkd to manage network interfaces. The code is added at the end of the DCV installation because the mitigation is strictly related to the installation of ubuntu-desktop. Always using systemd-networkd also makes ParallelCluster handle single-NIC and multi-NIC instances consistently: for multi-NIC instances, ParallelCluster already instructs netplan to use systemd-networkd (code: https://github.com/aws/aws-parallelcluster-cookbook/blob/develop/cookbooks/aws-parallelcluster-environment/files/ubuntu/network_interfaces/configure_nw_interface.sh#L62).
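
As a rough illustration of what "instructing netplan to use systemd-networkd" can look like, here is a minimal sketch. It is not the code from this commit: the function name and the drop-in file name `99-use-networkd.yaml` are hypothetical, and whether a single drop-in like this is sufficient depends on the other files already present in `/etc/netplan`. The only netplan feature it relies on is the top-level `renderer: networkd` key.

```shell
#!/bin/sh
# Sketch only: function name and file name are assumptions,
# not taken from the actual commit.

# write_networkd_renderer DIR
# Writes a netplan drop-in into DIR that selects systemd-networkd as the
# renderer, then prints the path of the file it wrote.
write_networkd_renderer() {
    target_dir="$1"
    target_file="${target_dir}/99-use-networkd.yaml"   # hypothetical file name
    mkdir -p "${target_dir}"
    cat > "${target_file}" <<'EOF'
network:
  version: 2
  renderer: networkd
EOF
    # Keep the file readable and writable by its owner only.
    chmod 600 "${target_file}"
    printf '%s\n' "${target_file}"
}

# On a real instance one might run, as root:
#   write_networkd_renderer /etc/netplan && netplan apply
```

Taking the target directory as an argument keeps the sketch safe to try against a scratch directory before touching `/etc/netplan`.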

Technical details:

Output of networkctl list

Prior to this commit

Base Ubuntu:

IDX LINK TYPE     OPERATIONAL SETUP
  1 lo   loopback carrier     unmanaged
  2 ens5 ether    routable    configured

2 links listed.

Ubuntu with ubuntu-desktop:

IDX LINK TYPE     OPERATIONAL SETUP
  1 lo   loopback carrier     unmanaged
  2 ens5 ether    routable    unmanaged

2 links listed.

systemd-networkd got confused because it saw that no network interface was set up (NetworkManager had taken control of all network interfaces), and it waited for the 2-minute timeout at the beginning of system boot:

$ journalctl -b | grep -i "ipv6\|timeout\|waiting"
...
Sep 18 14:51:23 systemd-networkd-wait-online[1602]: Timeout occurred while waiting for network connectivity.
...
Sep 18 14:53:31 systemd-networkd-wait-online[1891]: Timeout occurred while waiting for network connectivity.
...
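
The symptom shown in the `networkctl list` output above can be checked mechanically. The following is a small sketch, not part of this PR; the function name `unmanaged_links` is hypothetical. It reads `networkctl list` output on standard input and prints every non-loopback link whose SETUP column is `unmanaged`, assuming the five-column layout shown above.

```shell
#!/bin/sh
# unmanaged_links: reads `networkctl list` output on stdin and prints the
# names of non-loopback links whose SETUP column is "unmanaged".
# Header, blank, and "N links listed." lines are skipped because they do
# not have a numeric first column together with a fifth column.
unmanaged_links() {
    awk '$1 ~ /^[0-9]+$/ && $3 != "loopback" && $5 == "unmanaged" { print $2 }'
}

# Usage on a live system:
#   networkctl list | unmanaged_links
```

With the "Ubuntu with ubuntu-desktop" output above as input, the function prints `ens5`; with the "Base Ubuntu" output it prints nothing.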

After this commit

Ubuntu with ubuntu-desktop has the same output as Base Ubuntu, and the delay is gone.

Tests

  • Tests with DCV passed on both Ubuntu 22 and 24.
  • Tests with multi-NIC instances passed on both Ubuntu 22 and 24.

Checklist

  • Make sure you are pointing to the right branch.
  • If you're creating a patch for a branch other than develop add the branch name as prefix in the PR title (e.g. [release-3.6]).
  • Check all commits' messages are clear, describing what and why vs how.
  • Make sure to have added unit tests or integration tests to cover the new/modified code.
  • Check if documentation is impacted by this change.

Please review the guidelines for contributing and Pull Request Instructions.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Signed-off-by: Hanwen <[email protected]>
@hanwen-cluster hanwen-cluster requested review from a team as code owners September 19, 2025 18:54
@hanwen-cluster hanwen-cluster changed the base branch from develop to release-3.14 September 19, 2025 18:56

hgreebe commented Sep 19, 2025

Are there any other side effects of using systemd-networkd instead of NetworkManager, or should NetworkManager never have been used for this purpose?


hanwen-cluster commented Sep 19, 2025

NetworkManager should never have been used for cloud instances, mainly because cloud instances don't have WiFi.

I don't think we should be concerned about any side effects, because we already configure multi-NIC instances with systemd-networkd.

@hanwen-cluster hanwen-cluster enabled auto-merge (rebase) September 19, 2025 19:11
@hanwen-cluster hanwen-cluster merged commit e81104a into aws:release-3.14 Sep 19, 2025
28 of 30 checks passed