-
Couldn't load subscription status.
- Fork 48
OCPCLOUD-2636: Warn users when attempting to change Authoritative API but Synchronized condition is false #398
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCPCLOUD-2636: Warn users when attempting to change Authoritative API but Synchronized condition is false #398
Conversation
working on tests
|
Skipping CI for Draft Pull Request. |
WalkthroughAdds a ValidatingAdmissionPolicy and Binding that warn when updating a machine's Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Test as Test Client
participant API as Kubernetes API Server
participant VAP as ValidatingAdmissionPolicy
participant WC as WarningCollector
Test->>API: PATCH Machine (.spec.authoritativeAPI)
API->>VAP: Evaluate policy for machine.openshift.io update
VAP->>VAP: Check Synchronized condition and authoritativeAPI change
alt Synchronized != true
VAP-->>API: Return Warn (message)
API-->>Test: Include warning header
API->>WC: Invoke configured WarningHandler
WC->>WC: Store message (thread-safe)
end
API-->>Test: Admit request (with warnings)
Test->>WC: Retrieve collected messages
WC-->>Test: Return messages for assertions
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.5.0)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@theobarberbany: This pull request references OCPCLOUD-2636 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. In response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
c775bf3 to
0f4b835
Compare
|
@theobarberbany: This pull request references OCPCLOUD-2636 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
0f4b835 to
8dd295e
Compare
|
@theobarberbany: This pull request references OCPCLOUD-2636 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (6)
manifests/0000_30_cluster-api_09_admission-policies.yaml (3)
286-301: Consider simplifying the variable logic.The
syncBadvariable has redundant logic sincesyncCondalready checks for the existence of conditions. The expression can be simplified:variables: - name: syncCond expression: > has(object.status.conditions) ? object.status.conditions.exists(c, c.type == "Synchronized") : false - name: syncBad expression: > - has(object.status.conditions) ? - object.status.conditions.exists(c, + variables.syncCond && + object.status.conditions.exists(c, c.type == "Synchronized" && (c.status == "False" || c.status == "Unknown") - ) : true + ) - name: authAPIChanged expression: > oldObject.spec.authoritativeAPI != object.spec.authoritativeAPIWith this change, the validation expression on line 305 would become:
expression: '!((!variables.syncCond || variables.syncBad) && variables.authAPIChanged)'This is clearer because
syncBadnow only represents "condition exists AND is bad", while the absence case is handled by!syncCond.
305-305: Consider inverting the validation logic for clarity.The double negation pattern
!((...) && ...)makes the expression harder to understand. While functionally correct, consider restructuring for readability:Alternative approach - change the validation to use a positive assertion:
validations: - - expression: '!(( !variables.syncCond || variables.syncBad ) && variables.authAPIChanged)' - message: 'Warning: updating .spec.authoritativeAPI when Synchronized=false on resource means changes may not take effect' + - expression: '!variables.authAPIChanged || (variables.syncCond && !variables.syncBad)' + message: 'Warning: updating .spec.authoritativeAPI when Synchronized=false on resource means changes may not take effect'This reads as: "pass if authAPI didn't change OR (sync condition exists AND is not bad)", which is more intuitive than the double-negative form.
307-307: Remove redundantvalidationActionsfrom policy.The
validationActionsis specified in both the policy (line 307) and the binding (line 320). Per Kubernetes documentation, when specified in the binding, it overrides the policy-level setting. The policy-level declaration is therefore redundant and should be removed to avoid confusion.validations: - expression: '!(( !variables.syncCond || variables.syncBad ) && variables.authAPIChanged)' message: 'Warning: updating .spec.authoritativeAPI when Synchronized=false on resource means changes may not take effect' - validationActions: [Warn] ---pkg/admissionpolicy/testutils/util.go (2)
276-281: Document why parameters are ignored, or consider capturing them.The
HandleWarningHeaderWithContextmethod ignores thecodeandagent(header) parameters. While this may be intentional for simplicity in testing, consider whether these could be useful:
- The
codeparameter could help categorize warnings- The
agentparameter could help identify the warning sourceIf intentionally simplified for testing purposes, add a comment explaining this decision.
Example enhancement if the additional context would be valuable:
+// WarningMessage represents a captured warning with its metadata. +type WarningMessage struct { + Code int + Agent string + Message string +} + type WarningCollector struct { sync.Mutex - messages []string + messages []WarningMessage } -func (w *WarningCollector) HandleWarningHeaderWithContext(_ context.Context, code int, _ string, message string) { +func (w *WarningCollector) HandleWarningHeaderWithContext(_ context.Context, code int, agent string, message string) { w.Lock() + w.messages = append(w.messages, WarningMessage{Code: code, Agent: agent, Message: message}) - w.messages = append(w.messages, message) w.Unlock() }Or if simplified intentionally:
+// HandleWarningHeaderWithContext implements rest.WarningHandlerWithContext. +// For test simplicity, only the message is captured; code and agent are ignored. func (w *WarningCollector) HandleWarningHeaderWithContext(_ context.Context, code int, _ string, message string) {
302-317: Consider adding input validation.The function doesn't validate that
cfgandschemeare non-nil. While the client creation will fail anyway, explicit validation would provide clearer error messages and follow defensive programming practices.func SetupClientWithWarningCollector(cfg *rest.Config, scheme *runtime.Scheme) (client.Client, *WarningCollector, error) { + if cfg == nil { + return nil, nil, fmt.Errorf("rest.Config cannot be nil") + } + if scheme == nil { + return nil, nil, fmt.Errorf("runtime.Scheme cannot be nil") + } + warnSink := &WarningCollector{} // copy to avoid mutating the passed-in config newcfg := rest.CopyConfig(cfg)pkg/controllers/machinesync/machine_sync_controller_test.go (1)
1432-1502: Consider using table-driven tests to reduce duplication.The three test cases (Synchronized=False, Unknown, and absent) have nearly identical structure with only the condition setup differing. This duplication makes the tests harder to maintain. Consider refactoring to a table-driven approach.
Example structure:
DescribeTable("warns the user when updating authoritativeAPI with bad synchronization state", func(setupCondition func(*mapiv1beta1.Machine)) { By("Setting up the machine condition state") Eventually(k.UpdateStatus(mapiMachine, func() { setupCondition(mapiMachine) })).Should(Succeed()) By("Attempting to update the authoritativeAPI should emit a warning") Eventually(func(g Gomega) { warnSink.Reset() err := warnKomega.Update(mapiMachine, func() { mapiMachine.Spec.AuthoritativeAPI = mapiv1beta1.MachineAuthorityClusterAPI })() g.Expect(err).NotTo(HaveOccurred()) g.Expect(warnSink.Messages()).To(ContainElement(ContainSubstring("updating .spec.authoritativeAPI when Synchronized=false on resource means changes may not take effect"))) }, timeout).Should(Succeed()) }, Entry("when Synchronized condition is False", func(m *mapiv1beta1.Machine) { m.Status.Conditions = []mapiv1beta1.Condition{{ Type: consts.SynchronizedCondition, Status: corev1.ConditionFalse, Reason: "ErrorReason", Message: "Error message", LastTransitionTime: metav1.Now(), }} }), Entry("when Synchronized condition is Unknown", func(m *mapiv1beta1.Machine) { m.Status.Conditions = []mapiv1beta1.Condition{{ Type: consts.SynchronizedCondition, Status: corev1.ConditionUnknown, Reason: "ErrorReason", Message: "Error message", LastTransitionTime: metav1.Now(), }} }), Entry("when machine has no Synchronized condition", func(m *mapiv1beta1.Machine) { m.Status.Conditions = []mapiv1beta1.Condition{} }), )
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (3)
manifests/0000_30_cluster-api_09_admission-policies.yaml(2 hunks)pkg/admissionpolicy/testutils/util.go(3 hunks)pkg/controllers/machinesync/machine_sync_controller_test.go(1 hunks)
🔇 Additional comments (1)
pkg/admissionpolicy/testutils/util.go (1)
269-300: LGTM! Thread-safe implementation is well done.The
WarningCollectorimplementation correctly usessync.Mutexfor thread safety, and theMessages()method properly returns a copy of the internal slice to prevent external mutation. This is a good defensive programming practice.
8dd295e to
2257890
Compare
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
|
@theobarberbany: This pull request references OCPCLOUD-2636 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
pkg/controllers/machinesync/machine_sync_controller_test.go (1)
1432-1502: LGTM! Thorough coverage of synchronization warning scenarios.The three test cases properly validate that warnings are emitted when updating
.spec.authoritativeAPIunder different non-synchronized states (False, Unknown, and absent condition). The use ofwarnSink.Reset()within eachEventuallyblock ensures test isolation.Note: The typo "machhine" at Line 1485 has already been flagged in previous review comments.
🧹 Nitpick comments (1)
manifests/0000_30_cluster-api_09_admission-policies.yaml (1)
307-307: Consider removing redundantvalidationActionsspecification.The
validationActions: [Warn]is specified in both the ValidatingAdmissionPolicy (Line 307) and the ValidatingAdmissionPolicyBinding (Lines 319-320). When specified in both places, the binding's value takes precedence. While not incorrect, removing one instance would simplify the configuration.Suggestion: Keep
validationActionsonly in the binding (Lines 319-320) since binding-level configuration overrides policy-level defaults and is more explicit about the intent.Also applies to: 319-320
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (3)
manifests/0000_30_cluster-api_09_admission-policies.yaml(2 hunks)pkg/admissionpolicy/testutils/util.go(3 hunks)pkg/controllers/machinesync/machine_sync_controller_test.go(1 hunks)
🔇 Additional comments (4)
pkg/admissionpolicy/testutils/util.go (2)
269-300: LGTM! Well-designed thread-safe warning collector.The implementation correctly handles concurrent access through proper mutex locking, and the defensive copy in
Messages()prevents external mutation of internal state. The unused parameters inHandleWarningHeaderWithContextare expected for interface compliance.
302-317: LGTM! Clean client setup with proper isolation.Copying the config before mutation (Line 306) prevents unexpected side effects, and the function signature provides everything needed for test scenarios to capture and assert on warnings.
pkg/controllers/machinesync/machine_sync_controller_test.go (1)
1372-1430: LGTM! Comprehensive test setup for warning validation.The BeforeEach block properly establishes the VAP infrastructure and confirms it's operational by creating a sentinel machine and verifying warning collection works before running the actual test cases.
manifests/0000_30_cluster-api_09_admission-policies.yaml (1)
273-307: LGTM! Validation logic correctly implements warning behavior.The policy properly warns users when they update
.spec.authoritativeAPIwhile the machine is not synchronized. The validation expression correctly triggers warnings in the three intended scenarios:
- No Synchronized condition exists
- Synchronized condition is False
- Synchronized condition is Unknown
The
failurePolicy: Ignore(Line 278) is appropriate for a warning-only policy.
|
@theobarberbany: This pull request references OCPCLOUD-2636 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (5)
pkg/controllers/machinesync/machine_sync_controller_test.go (2)
1455-1456: Align assertions with clearer warning text ("not True") and remove hardcoded "=false".Recommend adjusting expectations to a message that covers False, Unknown, or unset, matching the policy intent.
- g.Expect(warnSink.Messages()).To(ContainElement(ContainSubstring("updating .spec.authoritativeAPI when Synchronized=false on resource means changes may not take effect"))) + g.Expect(warnSink.Messages()).To(ContainElement(ContainSubstring("updating .spec.authoritativeAPI when Synchronized is not True on resource means changes may not take effect")))Also applies to: 1481-1482, 1500-1501
1446-1456: Optional: refresh object via warnClient before Update to avoid stale resourceVersion.Reduces flake risk when switching clients between status and spec updates.
By("Attempting to update the authoritativeAPI should emit a warning") Eventually(func(g Gomega) { warnSink.Reset() + // Ensure latest resourceVersion with the same client used for Update + g.Expect(warnClient.Get(ctx, client.ObjectKeyFromObject(mapiMachine), mapiMachine)).To(Succeed()) err := warnKomega.Update(mapiMachine, func() { mapiMachine.Spec.AuthoritativeAPI = mapiv1beta1.MachineAuthorityClusterAPI })() g.Expect(err).NotTo(HaveOccurred())Also applies to: 1472-1482, 1491-1501
pkg/admissionpolicy/testutils/util.go (1)
311-316: Avoid shadowing the importedclientpackage name.Rename local variable for readability.
- client, err := client.New(newcfg, client.Options{Scheme: scheme}) + cl, err := client.New(newcfg, client.Options{Scheme: scheme}) if err != nil { - return nil, nil, fmt.Errorf("error creating new client: %w", err) + return nil, nil, fmt.Errorf("error creating new client: %w", err) } - return client, warnSink, nil + return cl, warnSink, nilmanifests/0000_30_cluster-api_09_admission-policies.yaml (2)
303-308: Clarify warning message and drop redundant "Warning:" prefix.Covers False, Unknown, or unset; avoids implying strictly "=false".
- validations: - - expression: '!(( !variables.syncCond || variables.syncBad ) && variables.authAPIChanged)' - message: 'Warning: updating .spec.authoritativeAPI when Synchronized=false on resource means changes may not take effect' - validationActions: [Warn] + validations: + - expression: '!(( !variables.syncCond || variables.syncBad ) && variables.authAPIChanged)' + message: 'updating .spec.authoritativeAPI when Synchronized is not True on resource means changes may not take effect' + validationActions: [Warn]Note: update test expectations accordingly (see test file comments).
280-287: Optionally exclude service accounts to avoid noisy warnings from controllers.Align with
machine-api-machine-vapwhich filters SAs; keeps user-facing warnings focused.spec: failurePolicy: Ignore matchConstraints: resourceRules: - apiGroups: ["machine.openshift.io"] apiVersions: ["*"] operations: ["UPDATE"] resources: ["machines"] + matchConditions: + - name: check-only-non-service-account-requests + expression: >- + !(request.userInfo.username in [ + "system:serviceaccount:openshift-machine-api:machine-api-controllers", + "system:serviceaccount:openshift-cluster-api:cluster-capi-operator" + ])
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (3)
manifests/0000_30_cluster-api_09_admission-policies.yaml(2 hunks)pkg/admissionpolicy/testutils/util.go(3 hunks)pkg/controllers/machinesync/machine_sync_controller_test.go(1 hunks)
|
TODO: Invert logic to remove double negative, look at nits from coderabbit as some are good. |
2257890 to
2c59cbe
Compare
|
@theobarberbany: This pull request references OCPCLOUD-2636 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/override ci/prow/okd-scos-e2e-aws-ovn |
|
@theobarberbany: Overrode contexts on behalf of theobarberbany: ci/prow/okd-scos-e2e-aws-ovn In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
- Updates the VAP to emit a warning if the SynchronizedCondition is Unknown or unset. - Updates the test suite to capture warnings using a custom client.
2c59cbe to
b065cc9
Compare
|
@theobarberbany: This pull request references OCPCLOUD-2636 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
manifests/0000_30_cluster-api_09_admission-policies.yaml (1)
273-307: Consider simplifying the validation logic for readability.The validation logic is functionally correct but uses a double negative pattern (
!variables.syncBad) that could be clearer. Consider defining a positive variable instead:variables: - - name: syncCond - expression: > - has(object.status.conditions) ? - object.status.conditions.exists(c, c.type == "Synchronized") : - false - - name: syncBad + - name: syncTrue expression: > - variables.syncCond && - object.status.conditions.exists(c, - c.type == "Synchronized" && - (c.status == "False" || c.status == "Unknown") - ) + has(object.status.conditions) && + object.status.conditions.exists(c, + c.type == "Synchronized" && c.status == "True" + ) - name: authAPIChanged expression: > oldObject.spec.authoritativeAPI != object.spec.authoritativeAPI # All validations must evaluate to true validations: - - expression: '!variables.authAPIChanged || (variables.syncCond && !variables.syncBad)' + - expression: '!variables.authAPIChanged || variables.syncTrue'This removes the double negative and makes the intent clearer: "Warn unless authoritativeAPI is unchanged OR the Synchronized condition is True."
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (3)
manifests/0000_30_cluster-api_09_admission-policies.yaml(2 hunks)pkg/admissionpolicy/testutils/util.go(3 hunks)pkg/controllers/machinesync/machine_sync_controller_test.go(1 hunks)
🔇 Additional comments (8)
pkg/admissionpolicy/testutils/util.go (2)
269-301: LGTM! Clean and thread-safe implementation.The
WarningCollectoris well-designed for testing VAPs that emit warnings. The mutex protection ensures thread-safety, and returning a copy inMessages()prevents data races. The choice to capture only the message for test simplicity is pragmatic and well-documented.
303-318: LGTM! Defensive configuration handling.Good practice copying the config with
rest.CopyConfigto avoid mutating the caller's configuration. Error wrapping with%wpreserves the error chain for better debugging.pkg/controllers/machinesync/machine_sync_controller_test.go (4)
1372-1430: LGTM! Comprehensive test setup with VAP readiness verification.The sentinel machine verification (lines 1409-1428) is excellent practice—it confirms the VAP is active before running the actual tests, reducing false negatives from eventual consistency delays. The
warnSink.Reset()pattern keeps each probe self-contained.
1432-1457: LGTM! Test correctly verifies warning for Synchronized=False.The test properly sets the condition to False and confirms the expected warning is emitted. The self-contained probe pattern with
warnSink.Reset()ensures test isolation.
1458-1483: LGTM! Test correctly verifies warning for Synchronized=Unknown.Good coverage of the Unknown state. The test follows the same solid pattern as the False case, ensuring consistent behavior across different non-True condition states.
1485-1502: LGTM! Comprehensive coverage with missing condition test.Excellent edge case coverage for when the Synchronized condition is absent entirely. Together with the False and Unknown tests, this ensures the warning is emitted correctly across all states where changes might not take effect.
manifests/0000_30_cluster-api_09_admission-policies.yaml (2)
252-252: Minor whitespace adjustment.Blank line added for consistency.
308-319: LGTM! Binding correctly configured for warning behavior.The binding properly sets
validationActions: [Warn]to emit warnings rather than deny requests, and targets the correct namespace. The policy doesn't require parameters, so noparamRefis needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
|
Scheduling tests matching the |
|
@theobarberbany: Overrode contexts on behalf of theobarberbany: ci/prow/okd-scos-e2e-aws-ovn In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/retest |
|
/override ci/prow/e2e-openstack-ovn-techpreview enough signal from 3 runs, the job is just very flakey :( |
|
@theobarberbany: Overrode contexts on behalf of theobarberbany: ci/prow/e2e-openstack-ovn-techpreview In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
@theobarberbany: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/approve |
|
@theobarberbany: This PR has been marked as verified by In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: theobarberbany The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Summary by CodeRabbit
New Features
Tests