OCPBUGS-88738: clean up orphaned mirrored ConfigMaps on NodePool deletion#8890
OCPBUGS-88738: clean up orphaned mirrored ConfigMaps on NodePool deletion#8890vsolanki12 wants to merge 1 commit into
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
|
Skipping CI for Draft Pull Request. |
|
@vsolanki12: This pull request references Jira Issue OCPBUGS-88738, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository YAML (base), Central YAML (inherited) Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughThe change refines mirrored KubeletConfig ConfigMap cleanup in Sequence Diagram(s)sequenceDiagram
participant reconcileKubeletConfig
participant HostedClusterNamespace
participant GuestClusterNamespace
participant activeNodePools
reconcileKubeletConfig->>HostedClusterNamespace: list KubeletConfig ConfigMaps
HostedClusterNamespace-->>reconcileKubeletConfig: ConfigMaps with NodePoolLabel
reconcileKubeletConfig->>activeNodePools: record active NodePools
reconcileKubeletConfig->>GuestClusterNamespace: inspect mirrored ConfigMaps
GuestClusterNamespace-->>reconcileKubeletConfig: mirrored ConfigMap + NodePoolLabel
reconcileKubeletConfig->>activeNodePools: check owning NodePool
activeNodePools-->>reconcileKubeletConfig: active / missing
reconcileKubeletConfig->>GuestClusterNamespace: preserve or delete mirrored ConfigMap
🚥 Pre-merge checks | ✅ 11✅ Passed checks (11 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: vsolanki12 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
@vsolanki12: This pull request references Jira Issue OCPBUGS-88738, which is valid. 3 validation(s) were run on this bug
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go`:
- Around line 3022-3027: The NodePool activity detection in resources.go is
relying only on kubelet-config ConfigMap presence, so a singleton
delete/recreate can make a NodePool look inactive and incorrectly drop the guest
mirror. Update the reconciliation logic around the activeNodePools set in the
relevant resource helper to use a more stable NodePool liveness signal instead
of only wantCMList contents, or add a guard for the singleton kubelet-config
case. Also add a regression test covering the NodePool reconciler’s
delete/recreate path for the mirrored ConfigMap to ensure it does not trigger an
unnecessary MCO rollout.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 5a6b67a8-12ca-43ab-8395-6cb220e302e4
📒 Files selected for processing (2)
control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.gocontrol-plane-operator/hostedclusterconfigoperator/controllers/resources/resources_test.go
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #8890 +/- ##
==========================================
+ Coverage 43.26% 43.35% +0.08%
==========================================
Files 770 771 +1
Lines 95479 95545 +66
==========================================
+ Hits 41311 41419 +108
+ Misses 51284 51242 -42
Partials 2884 2884
... and 9 files with indirect coverage changes
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
fe8b62a to
8adaf28
Compare
|
/test ci/prow/images |
|
@vsolanki12: The specified target(s) for The following commands are available to trigger optional jobs: Use DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
Confirmed: all 5 Go build targets completed without errors (lines 107-111), and the build moved to This is a CI infrastructure issue — a transient 502 Bad Gateway error when pulling the Test Failure Analysis CompleteJob Information
Test Failure AnalysisErrorSummaryThe Root CauseThis is a CI infrastructure flake, not a product or code issue. The
The PR's code changes (cleaning up orphaned mirrored ConfigMaps on NodePool deletion) are not related to the failure. The three other image builds in the same job ( Recommendations
Evidence
|
|
I have tested in my test cluster Before fix: After Fix:
Both CMs exist before HCCO reconcile After HCCO reconcile, orphan deleted, valid CM preserved |
| }, | ||
| expectedHostedClusterObjects: []client.Object{}, | ||
| }, | ||
| { |
There was a problem hiding this comment.
nit: Consider adding a multi-NodePool test case that exercises the selectivity of activeNodePools across two NodePools in a single reconcile pass — e.g. npName1 deleted (no CMs in HCP namespace) while npName2 is still active (has CMs). Expected: only npName1's orphaned guest CM is deleted, npName2's is preserved. npName2 is already declared at line 1598 and available for this.
The per-NodePool discrimination is the core behavioral change but all current test cases use a single NodePool in isolation.
There was a problem hiding this comment.
thank you, I have updated as per the suggestion. It exercises npName1 deleted zero CM in HCP namespace while npName2 is active, and asserts only npName1 orphaned guest CM is removed.
…eletion The guard added in PR openshift#8672 unconditionally skips deletion of guest-side ConfigMaps with NTOMirroredConfigLabel, preventing spurious MCO rollouts when the source CM is transiently absent. However, this also preserves CMs whose owning NodePool has been permanently deleted. Derive NodePool existence from the wantCMList already fetched from the HCP namespace: when a NodePool is deleted, its finalizer removes all its CMs, so zero CMs for a given NodePool means it has been deleted. Build an activeNodePools set and only skip deletion when the owning NodePool is still active. Signed-off-by: Vimal Solanki <vsolanki@redhat.com>
8adaf28 to
f491611
Compare
|
@vsolanki12: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
| log.Info("deleting orphaned mirrored ConfigMap; owning NodePool no longer exists", | ||
| "configMap", client.ObjectKeyFromObject(cm).String(), "nodePool", npName) | ||
| } | ||
| log.Info("delete mirror config ConfigMap", "config", client.ObjectKeyFromObject(cm).String()) |
There was a problem hiding this comment.
Nit: this falls through from the orphaned-mirrored path above, so orphaned-CM deletions emit two log lines while the other paths each emit one. Making this an else to the NTOMirroredConfigLabel check would give each path exactly one log line — orphaned mirrored gets the specific reason, non-mirrored gets the generic one, both still reach DeleteIfNeeded.
What this PR does / why we need it:
PR #8672 (OCPBUGS-86949) added a guard in HCCO's
reconcileKubeletConfigthat unconditionally skips deletion of guest-side ConfigMaps withNTOMirroredConfigLabel. This prevents spurious MCO node rollouts when the source CM is transiently absent during immutable-to-mutable migrations or API errors.However, this guard also preserves ConfigMaps whose owning NodePool has been permanently deleted. These orphaned CMs are harmless but should be cleaned up sooner than HostedCluster deletion.
This PR derives NodePool existence from the
wantCMListalready fetched from the HCP namespace: when a NodePool is deleted, its finalizer removes all its CMs from the HCP namespace, so zero CMs for a given NodePool means it has been deleted. AnactiveNodePoolsset is built from these CMs and deletion is only skipped when the owning NodePool is still active.Behavior matrix:
Which issue(s) this PR fixes:
Fixes OCPBUGS-88738
Special notes for your reviewer:
DeleteAllOfto remove all its CMs from the HCP namespace before the NodePool object is deletedChecklist:
Summary by CodeRabbit