cloud-init intermittent to fail rename network interface #5916
Labels
bug
Something isn't working correctly
incomplete
Action required by submitter
new
An issue that still needs triage
Bug report
On Azurelinux with cloud-init 24.3 we are running test and seeing intermittent issue where cloud-init failed to rename the network interface. I found it is may due to a racing condition between cloud-init and systemd-networkd. Cloud-init renaming command: cloud-init failed to run cmd ['ip', 'link', 'set', 'eth7', 'name', 'eth1'] due to there is still an eth1 exists after cloud-init already renaming eth1 to cirename0. Cloud-init gets eventually errors out due to unexpected two eth1 exist. From the systemd logs I am seeing systemd and cloud-init trying to apply the network config and renaming at same time so I suspect this is how the race condition came along:
Steps to reproduce the problem
Intermittent test failed where multiple nics were populated
Environment details
cloud-init logs
2024-11-18 08:15:04,840 - subp.py[DEBUG]: Running command ['ip', '-6', 'addr', 'show', 'permanent', 'scope', 'global'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:04,842 - subp.py[DEBUG]: Running command ['ip', '-4', 'addr', 'show'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:04,843 - net[DEBUG]: Detected interfaces {'eth6': {'downable': True, 'device_id': '0x3', 'driver': 'hv_netvsc', 'mac': '60:45:bd:ae:33:e3', 'name': 'eth6', 'up': False}, 'eth4': {'downable': True, 'device_id': '0x3', 'driver': 'hv_netvsc', 'mac': '60:45:bd:ae:38:a6', 'name': 'eth4', 'up': False}, 'lo': {'downable': False, 'device_id': None, 'driver': None, 'mac': '00:00:00:00:00:00', 'name': 'lo', 'up': True}, 'eth2': {'downable': True, 'device_id': '0x3', 'driver': 'hv_netvsc', 'mac': '60:45:bd:ae:3d:44', 'name': 'eth2', 'up': False}, 'eth0': {'downable': True, 'device_id': '0x3', 'driver': 'hv_netvsc', 'mac': '60:45:bd:ae:38:38', 'name': 'eth0', 'up': False}, 'eth7': {'downable': True, 'device_id': '0x3', 'driver': 'hv_netvsc', 'mac': '60:45:bd:ae:3d:59', 'name': 'eth7', 'up': False}, 'eth5': {'downable': True, 'device_id': '0x3', 'driver': 'hv_netvsc', 'mac': '60:45:bd:ae:3b:fe', 'name': 'eth5', 'up': False}, 'eth3': {'downable': True, 'device_id': '0x3', 'driver': 'hv_netvsc', 'mac': '60:45:bd:ae:3a:c8', 'name': 'eth3', 'up': False}, 'eth1': {'downable': True, 'device_id': '0x3', 'driver': 'hv_netvsc', 'mac': '60:45:bd:ae:3b:81', 'name': 'eth1', 'up': False}}
2024-11-18 08:15:04,844 - net[DEBUG]: Renamed [['60:45:bd:ae:38:38', 'eth0', 'hv_netvsc', '0x3'], ['60:45:bd:ae:3d:59', 'eth1', 'hv_netvsc', '0x3'], ['60:45:bd:ae:3b:fe', 'eth2', 'hv_netvsc', '0x3'], ['60:45:bd:ae:3d:44', 'eth3', 'hv_netvsc', '0x3'], ['60:45:bd:ae:3a:c8', 'eth4', 'hv_netvsc', '0x3'], ['60:45:bd:ae:38:a6', 'eth5', 'hv_netvsc', '0x3'], ['60:45:bd:ae:33:e3', 'eth6', 'hv_netvsc', '0x3'], ['60:45:bd:ae:3b:81', 'eth7', 'hv_netvsc', '0x3']] with ops [('rename', '60:45:bd:ae:3d:59', 'eth1', ('eth1', 'cirename0')), ('rename', '60:45:bd:ae:3d:59', 'eth1', ('eth7', 'eth1')), ('rename', '60:45:bd:ae:3b:fe', 'eth2', ('eth2', 'cirename1')), ('rename', '60:45:bd:ae:3b:fe', 'eth2', ('eth5', 'eth2')), ('rename', '60:45:bd:ae:3d:44', 'eth3', ('eth3', 'cirename2')), ('rename', '60:45:bd:ae:3d:44', 'eth3', ('cirename1', 'eth3')), ('rename', '60:45:bd:ae:3a:c8', 'eth4', ('eth4', 'cirename3')), ('rename', '60:45:bd:ae:3a:c8', 'eth4', ('cirename2', 'eth4')), ('rename', '60:45:bd:ae:38:a6', 'eth5', ('cirename3', 'eth5')), ('rename', '60:45:bd:ae:3b:81', 'eth7', ('cirename0', 'eth7'))]
2024-11-18 08:15:04,844 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'eth1', 'name', 'cirename0'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:04,876 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'eth7', 'name', 'eth1'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:04,880 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'eth2', 'name', 'cirename1'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:04,912 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'eth5', 'name', 'eth2'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:04,944 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'eth3', 'name', 'cirename2'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:04,972 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'cirename1', 'name', 'eth3'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:05,008 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'eth4', 'name', 'cirename3'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:05,032 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'cirename2', 'name', 'eth4'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:05,060 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'cirename3', 'name', 'eth5'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:05,096 - subp.py[DEBUG]: Running command ['ip', 'link', 'set', 'cirename0', 'name', 'eth7'] with allowed return codes [0] (shell=False, capture=True)
2024-11-18 08:15:05,098 - stages.py[WARNING]: Failed to rename devices: Failed to apply network config names: [unknown] Error performing rename('eth7', 'eth1') for 60:45:bd:ae:3d:59, eth1: Unexpected error while running command.
Command: ['ip', 'link', 'set', 'eth7', 'name', 'eth1']
Exit code: 2
Reason: -
Stdout:
Stderr: RTNETLINK answers: File exists
[unknown] Error performing rename('cirename0', 'eth7') for 60:45:bd:ae:3b:81, eth7: Unexpected error while running command.
Command: ['ip', 'link', 'set', 'cirename0', 'name', 'eth7']
Exit code: 2
Reason: -
Stdout:
Stderr: RTNETLINK answers: File exists
cloud-init.log
Cloud-init -output.log
The text was updated successfully, but these errors were encountered: