🐛 Fix MachinePool nodeRef UID mismatch after K8s upgrade #12392
base: main
[APPROVALNOTIFIER] This PR is NOT APPROVED.
/area machinepool
It seems reasonable to add this additional verification in general, but I wonder if this problem has been seen in providers other than CAPZ.
    // Validate that the UIDs in NodeRefs are still valid
    if s.nodeRefMap != nil {
        // Create a name-to-node mapping for efficient lookup
        nodeNameMap := make(map[string]*corev1.Node)
Suggested change:
    - nodeNameMap := make(map[string]*corev1.Node)
    + nodeNameMap := make(map[string]*corev1.Node, len(s.nodeRefMap))
What this PR does / why we need it:
When a K8s upgrade is performed on a managed cluster, the new nodes come up with new UIDs. However, the MachinePool controller has an early-return condition that validates only the count of NodeRefs and does not check whether their UIDs are still valid. As a result, MachinePools retain stale NodeRef UIDs after upgrades, and the UID mismatch persists until someone intervenes manually.
This PR adds UID validation logic before the early return condition.
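As a rough illustration of the idea (not the code in this PR), the check could look like the sketch below. The `nodeRef` and `node` types and the `nodeRefsStale` helper are hypothetical, simplified stand-ins for `corev1.ObjectReference`, `corev1.Node`, and the controller logic; the sketch assumes references are matched to nodes by name.

```go
package main

import "fmt"

// nodeRef and node are hypothetical, simplified stand-ins for
// corev1.ObjectReference and corev1.Node; only the fields needed
// for the comparison are kept.
type nodeRef struct {
	Name string
	UID  string
}

type node struct {
	Name string
	UID  string
}

// nodeRefsStale reports whether any stored reference points at a node
// that no longer exists or whose UID has changed. (Illustrative helper,
// not part of the controller.)
func nodeRefsStale(refs []nodeRef, nodes []node) bool {
	// Build a name-to-node map so each reference is checked in O(1).
	byName := make(map[string]node, len(nodes))
	for _, n := range nodes {
		byName[n.Name] = n
	}
	for _, ref := range refs {
		n, ok := byName[ref.Name]
		if !ok || n.UID != ref.UID {
			return true
		}
	}
	return false
}

func main() {
	refs := []nodeRef{{Name: "pool-0", UID: "uid-old"}}
	nodes := []node{{Name: "pool-0", UID: "uid-new"}}
	// The counts match (1 == 1), but the UID differs, so a count-only
	// check would return early while this check flags the refs as stale.
	fmt.Println(nodeRefsStale(refs, nodes)) // prints: true
}
```

Running a check of this kind before the count-based early return means a post-upgrade node that reuses an old name still triggers a NodeRef refresh.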
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #12388