-
Notifications
You must be signed in to change notification settings - Fork 49
OCPBUGS-63411: machinesync: Create Cluster API InfraMachine if not exists while Cluster API Machine exists #402
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@chrischdi: This pull request references Jira Issue OCPBUGS-63411, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
WalkthroughReconcile now checks whether a Cluster API infrastructure Machine exists before choosing sync direction, adds private helpers to fetch and detect the infra Machine, refactors infra resource fetching to separate infraCluster and infraMachine retrieval, and updates tests to expect a successful CAPI→MAPI synchronization path when the infra Machine is present. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant R as Reconciler
participant Exists as doesCAPIInfraMachineExist()
participant Fetch as fetchCAPIInfraMachine()
participant CAPI as CAPIMachine
participant MAPI as MAPIMachine
participant Infra as InfraMachine
R->>Exists: check infra existence (capiMachine, mapiMachine)
Exists->>CAPI: read infrastructureRef from CAPIMachine
alt infraRef present
Exists->>Infra: get InfraMachine by ref (ns/name)
else fallback to MAPI ref
Exists->>Infra: get InfraMachine by MAPI name/namespace
end
Infra-->>Exists: found / not found
Exists-->>R: return (bool, error)
alt infra NOT found and tie-breaker favors MAPI
R->>Fetch: fetchCAPIInfraMachine(...) to init & retrieve infra
Fetch->>Infra: init & get InfraMachine
Infra-->>Fetch: infra object / error
Fetch-->>R: infra machine returned
R->>R: proceed MAPI → CAPIMachine reconciliation (create/sync)
else
R->>R: follow CAPI→MAPI authoritative path or no-op
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.5.0)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
pkg/controllers/machinesync/machine_sync_controller.go (1)
246-246: Fix typo in comment."existense" should be "existence".
Apply this diff:
- // Check for existense of the Cluster API Infrastructure Machine or if it needs to get created from MAPI first. + // Check for existence of the Cluster API Infrastructure Machine or if it needs to get created from MAPI first.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (2)
pkg/controllers/machinesync/machine_sync_controller.go(4 hunks)pkg/controllers/machinesync/machine_sync_controller_test.go(1 hunks)
🔇 Additional comments (5)
pkg/controllers/machinesync/machine_sync_controller_test.go (1)
782-795: LGTM! Test correctly validates the new InfraMachine creation behavior.The test expectations have been properly updated to reflect the new functionality where the Cluster API InfraMachine is now created successfully when it doesn't exist, rather than failing. The test now:
- Expects successful synchronization (Status=True, Reason=ResourceSynchronized)
- Verifies the InfraMachine is actually created
This aligns well with the implementation changes.
pkg/controllers/machinesync/machine_sync_controller.go (4)
258-262: Good timestamp-based precedence logic.The condition correctly creates the InfraMachine from MAPI only when:
- Cluster API is authoritative
- CAPI Machine exists but InfraMachine doesn't
- MAPI machine was created first (or simultaneously)
The timestamp comparison (
mapiMachine.CreationTimestamp.Equal(&capiMachine.CreationTimestamp) || mapiMachine.CreationTimestamp.Before(&capiMachine.CreationTimestamp)) ensures MAPI-originated configuration takes precedence only when appropriate, preventing overwriting of potentially newer CAPI-originated state.
950-974: Clean refactoring that improves modularity.The refactored
fetchCAPIInfraResourcesnow properly separates InfraCluster and InfraMachine retrieval, delegating the latter to the dedicatedfetchCAPIInfraMachinehelper. This improves code reusability and maintains clear separation of concerns.
976-994: Well-implemented helper function.The
fetchCAPIInfraMachinehelper is cleanly designed with:
- Focused responsibility (fetch infrastructure machine by name/namespace)
- Proper NotFound handling (returns nil rather than error)
- Appropriate error wrapping for clarity
- Platform-agnostic initialization
996-1012: Solid existence check implementation with proper nil handling.The
doesCAPIInfraMachineExisthelper correctly:
- Handles nil
capiMachineby falling back tomapiMachine.Name- Uses
capiMachine.Spec.InfrastructureRefwhen available for accurate lookup- Returns a clear boolean result rather than requiring nil checks
- Wraps errors appropriately for context
|
/jira refresh |
|
@chrischdi: This pull request references Jira Issue OCPBUGS-63411, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/lgtm |
|
Scheduling tests matching the |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: nrb The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
Hi @chrischdi I tried to test this, follow below steps:
result: FailedToConvertCAPIMachineToMAPI, awsmachine was not created. |
|
/hold For checking issue |
|
I think this is the expected behavior for the code: I did it this way assuming the scenarios are:
Not sure if it is right to "just create" the InfraMachine from the MAPI Machine in case the CAPI Machine was created first. WDYT @damdo? |
|
/retest |
|
Looks like Have we impacted the deletion logic with this change? |
Fun... yeah that could actually be true. Have to check on this. |
Happy to pair some time this week if you fancy :) |
ebbcf2d to
debacfc
Compare
|
New changes are detected. LGTM label has been removed. |
|
@chrischdi: This pull request references Jira Issue OCPBUGS-63411, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
debacfc to
566c680
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
pkg/controllers/machinesync/machine_sync_controller.go (1)
269-273: Consider simplifying the timestamp comparison.The logic correctly handles the scenario where the MAPI Machine is created with
authoritativeAPI: ClusterAPIand needs to create the missing InfraMachine.However, the timestamp comparison could be simplified:
- if !capiInfraMachineExists && (mapiMachine.CreationTimestamp.Equal(&capiMachine.CreationTimestamp) || mapiMachine.CreationTimestamp.Before(&capiMachine.CreationTimestamp)) { + if !capiInfraMachineExists && !mapiMachine.CreationTimestamp.After(&capiMachine.CreationTimestamp) { return r.reconcileMAPIMachinetoCAPIMachine(ctx, mapiMachine, capiMachine) }The negative form (
!After) is logically equivalent to (Equal || Before) and is more concise.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (2)
pkg/controllers/machinesync/machine_sync_controller.go(5 hunks)pkg/controllers/machinesync/machine_sync_controller_test.go(1 hunks)
🔇 Additional comments (6)
pkg/controllers/machinesync/machine_sync_controller.go (5)
194-194: LGTM! Appropriate linting suppression.The addition of
cyclopto the nolint directive is reasonable given the inherent complexity of the reconciliation logic with multiple conditional paths.
201-201: Good defensive check for InfraMachine existence.The early existence check for the CAPI InfraMachine allows the controller to make informed decisions about reconciliation direction in the subsequent switch statement. Error handling is appropriate.
Also applies to: 257-262
996-1037: Good refactoring for improved code organization.The refactored
fetchCAPIInfraResourcesfunction now properly delegates the InfraMachine fetching to a dedicated helper (fetchCAPIInfraMachine), improving code organization and reusability. Error handling and validation logic are preserved correctly.
1039-1057: Well-designed helper function for fetching InfraMachine.The
fetchCAPIInfraMachinehelper appropriately separates the concern of fetching the infrastructure machine. The decision to returnnilforNotFounderrors (rather than propagating the error) is correct, as the absence of an InfraMachine is a valid state that callers need to handle explicitly.
1059-1075: Robust existence check with appropriate fallback logic.The
doesCAPIInfraMachineExisthelper correctly handles both scenarios:
- When the CAPI Machine exists - uses its
infrastructureReffor the lookup- When the CAPI Machine doesn't exist - falls back to using the MAPI Machine's name
This fallback is necessary because the helper is called before the switch statement that determines reconciliation paths. Error handling is appropriate.
pkg/controllers/machinesync/machine_sync_controller_test.go (1)
952-965: Test correctly validates the new InfraMachine creation behavior.The test has been appropriately updated to verify the fix:
- Expects successful synchronization (
ResourceSynchronized = True) instead of an error condition- Explicitly verifies that the Cluster API InfraMachine is created
The test setup correctly simulates the scenario where:
- MAPI Machine is created first with
authoritativeAPI: ClusterAPI(line 936-941)- CAPI Machine exists (line 948)
- InfraMachine does NOT exist
This matches the timestamp-based condition added in the controller (line 270), where MAPI's earlier creation timestamp triggers the MAPI→CAPI reconciliation path to create the missing InfraMachine.
566c680 to
733a848
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (2)
pkg/controllers/machinesync/machine_sync_controller.go(5 hunks)pkg/controllers/machinesync/machine_sync_controller_test.go(1 hunks)
🔇 Additional comments (4)
pkg/controllers/machinesync/machine_sync_controller_test.go (1)
952-965: LGTM! Test correctly validates InfraMachine creation.The test expectations have been properly updated to validate the new behavior where the InfraMachine is created when a CAPI Machine exists. The test now:
- Expects a successful synchronization condition (True with ResourceSynchronized)
- Constructs and verifies the InfraMachine can be retrieved
This aligns well with the PR objective to ensure infra-machine creation during synchronization.
pkg/controllers/machinesync/machine_sync_controller.go (3)
997-1038: Well-structured refactoring of infra resource fetching.The refactoring improves code organization by:
- Separating infraCluster fetching from infraMachine fetching
- Delegating infraMachine retrieval to the new
fetchCAPIInfraMachinehelper- Maintaining proper error handling and nil checks
This makes the code more maintainable and enables reuse of the infraMachine fetching logic.
1040-1076: New helper functions are well-designed.The two new helper functions provide clean abstractions:
fetchCAPIInfraMachine(lines 1040-1058): Fetches the InfraMachine and returns nil if not found, properly handling errors.
doesCAPIInfraMachineExist(lines 1060-1076): Provides a boolean check for InfraMachine existence with appropriate fallback logic for name/namespace resolution.The functions are appropriately scoped to the reconciler and handle edge cases well.
269-273: The timestamp comparison logic is correct and intentionally prevents stale MAPI from overriding CAPI state.The condition
!mapiMachine.CreationTimestamp.After(&capiMachine.CreationTimestamp)only creates an InfraMachine from MAPI if MAPI existed before or at the same time as CAPI. When a MAPI machine is created after CAPI (the edge case raised in the review), the condition fails and reconciliation falls through toreconcileCAPIMachinetoMAPIMachineat line 275, syncing from CAPI to MAPI instead.This is the correct behavior: since
status.authoritativeAPI = ClusterAPIdesignates CAPI as the source of truth, a MAPI machine created afterward should not override CAPI's state by recreating infrastructure resources. The sync flows from the authoritative source (CAPI) to the mirror (MAPI).
733a848 to
10f80ea
Compare
10f80ea to
f10823d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
pkg/controllers/machinesync/machine_sync_controller.go (3)
1013-1022: Be tolerant of missing infraRef.Namespace; default to Machine namespace and treat empty name as “not exists” in existence checks.CAPI specs commonly omit InfrastructureRef.Namespace (same-namespace reference). Current validation turns this into a terminal error and the existence check can error on empty fields. Default the namespace, and in doesCAPIInfraMachineExist() return false (no error) when the ref is incomplete.
@@ - infraMachineRef := capiMachine.Spec.InfrastructureRef + infraMachineRef := capiMachine.Spec.InfrastructureRef @@ - if infraMachineRef.Name == "" || infraMachineRef.Namespace == "" { - return nil, nil, fmt.Errorf("machine %s/%s: %w", - capiMachine.Namespace, capiMachine.Name, errInvalidInfraMachineReference) - } + if infraMachineRef.Name == "" { + return nil, nil, fmt.Errorf("machine %s/%s: %w", + capiMachine.Namespace, capiMachine.Name, errInvalidInfraMachineReference) + } + // Default to the machine namespace when infraRef namespace is omitted. + infraNamespace := infraMachineRef.Namespace + if infraNamespace == "" { + infraNamespace = capiMachine.Namespace + } @@ - infraMachine, err := r.fetchCAPIInfraMachine(ctx, infraMachineRef.Name, infraMachineRef.Namespace) + infraMachine, err := r.fetchCAPIInfraMachine(ctx, infraMachineRef.Name, infraNamespace) @@ func (r *MachineSyncReconciler) doesCAPIInfraMachineExist(ctx context.Context, capiMachine *clusterv1.Machine, mapiMachine *mapiv1beta1.Machine) (bool, error) { - namespace := r.CAPINamespace - name := mapiMachine.Name - - if capiMachine != nil { - name = capiMachine.Spec.InfrastructureRef.Name - namespace = capiMachine.Spec.InfrastructureRef.Namespace - } + namespace := r.CAPINamespace + name := mapiMachine.Name + if capiMachine != nil { + ref := capiMachine.Spec.InfrastructureRef + if ref.Name == "" { + // Treat incomplete ref as “not exists” rather than error. + return false, nil + } + name = ref.Name + if ref.Namespace != "" { + namespace = ref.Namespace + } else { + namespace = capiMachine.Namespace + } + }Also applies to: 1033-1039, 1061-1077
1101-1108: Unconditional error returns after RemoveFinalizer; always errors even on success.Both branches return fmt.Errorf(...) regardless of err. This breaks deletion flows and can cause CI failures.
@@ - _, err := util.RemoveFinalizer(ctx, r.Client, mapiMachine, SyncFinalizer) - - return false, fmt.Errorf("failed to remove finalizer: %w", err) + changed, err := util.RemoveFinalizer(ctx, r.Client, mapiMachine, SyncFinalizer) + if err != nil { + return false, fmt.Errorf("failed to remove finalizer: %w", err) + } + return changed, nil @@ - _, err := util.RemoveFinalizer(ctx, r.Client, capiMachine, SyncFinalizer) - - return false, fmt.Errorf("failed to remove finalizer: %w", err) + changed, err := util.RemoveFinalizer(ctx, r.Client, capiMachine, SyncFinalizer) + if err != nil { + return false, fmt.Errorf("failed to remove finalizer: %w", err) + } + return changed, nilAlso applies to: 1221-1228
235-235: Minor: double colon in error message.Typo in format string.
- return ctrl.Result{}, fmt.Errorf("failed to get Cluster API machine:: %w", err) + return ctrl.Result{}, fmt.Errorf("failed to get Cluster API machine: %w", err)
🧹 Nitpick comments (1)
pkg/controllers/machinesync/machine_sync_controller_unit_test.go (1)
81-92: Add coverage for omitted infraRef.Namespace and existence check edge.Include a case where capiMachine.Spec.InfrastructureRef.Namespace is empty (default to machine namespace) and verify fetch succeeds; also add a unit for doesCAPIInfraMachineExist handling incomplete refs.
@@ Context("and Infrastructure is present", func() { var objs []client.Object BeforeEach(func() { objs = []client.Object{ @@ }) @@ It("should return infrastructure machine if it is present", func() { @@ Expect(infraMachine).ToNot(BeNil()) }) + + It("should default infraRef namespace to machine namespace when omitted", func() { + // Omit namespace in infraRef + capiMachine.Spec.InfrastructureRef.Namespace = "" + objs = append(objs, &awsv1.AWSCluster{ + ObjectMeta: metav1.ObjectMeta{ + Name: capiMachine.Spec.ClusterName, + Namespace: capiMachine.Namespace, + }, + }, &awsv1.AWSMachine{ + ObjectMeta: metav1.ObjectMeta{ + Name: capiMachine.Name, + Namespace: capiMachine.Namespace, // same as machine + }, + }) + reconciler.Client = fake.NewClientBuilder().WithScheme(testEnv.Scheme).WithObjects(objs...).Build() + infraCluster, infraMachine, err := reconciler.fetchCAPIInfraResources(ctx, capiMachine) + Expect(err).ToNot(HaveOccurred()) + Expect(infraCluster).ToNot(BeNil()) + Expect(infraMachine).ToNot(BeNil()) + })Optionally, add a focused unit for doesCAPIInfraMachineExist() where infraRef.Name is empty and expect (false, nil).
Also applies to: 94-101, 103-116
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (3)
pkg/controllers/machinesync/machine_sync_controller.go(5 hunks)pkg/controllers/machinesync/machine_sync_controller_test.go(1 hunks)pkg/controllers/machinesync/machine_sync_controller_unit_test.go(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/controllers/machinesync/machine_sync_controller_test.go
|
@chrischdi: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary by CodeRabbit
Improvements
Tests