Skip to content

Conversation

@miyadav
Copy link
Member

@miyadav miyadav commented Sep 8, 2025

@huali9 @damdo @theobarberbany , thanks for already reviewing this earlier ( #343) , it became too long , hence made a new one , changes are same.

Summary by CodeRabbit

  • New Features

    • None
  • Tests

    • Added e2e machine-migration tests verifying that non-authoritative fields and protected annotations cannot be modified, ensuring mirrored-machine presence and admission rejections for disallowed updates.
  • Chores

    • Updated vendoring and dependency list; configured e2e to prefer a local operator replacement for builds.

@openshift-ci-robot
Copy link

Pipeline controller notification
This repository is configured to use the pipeline controller. Second-stage tests will be triggered only if the required tests of the first stage are successful. The pipeline controller will automatically detect which contexts are required, or not needed and will utilize a set of /test and /override Prow commands to trigger the second stage.

@openshift-ci-robot
Copy link

openshift-ci-robot commented Sep 8, 2025

@miyadav: This pull request references OCPCLOUD-2566 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target either version "4.21." or "openshift-4.21.", but it targets "openshift-4.20" instead.

In response to this:

@huali9 @damdo @theobarberbany , thanks for already reviewing this earlier ( #343) , it became too long , hence made a new one , changes are same.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Sep 8, 2025
@openshift-ci openshift-ci bot requested review from damdo and mdbooth September 8, 2025 05:03
@openshift-ci-robot
Copy link

/test remaining-required

Scheduling tests matching the pipeline_run_if_changed parameter:
/test e2e-aws-capi-techpreview
/test e2e-aws-ovn
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-aws-ovn-techpreview
/test e2e-aws-ovn-techpreview-upgrade
/test e2e-azure-capi-techpreview
/test e2e-azure-ovn-techpreview
/test e2e-azure-ovn-techpreview-upgrade
/test e2e-gcp-capi-techpreview
/test e2e-gcp-ovn-techpreview
/test e2e-metal3-capi-techpreview
/test e2e-openstack-capi-techpreview
/test e2e-openstack-ovn-techpreview
/test e2e-vsphere-capi-techpreview
/test regression-clusterinfra-aws-ipi-techpreview-capi

@miyadav
Copy link
Member Author

miyadav commented Sep 8, 2025

/test e2e-metal3-capi-techpreview

@miyadav
Copy link
Member Author

miyadav commented Sep 8, 2025

/test e2e-vsphere-capi-techpreview

@miyadav
Copy link
Member Author

miyadav commented Sep 8, 2025

/test e2e-metal3-capi-techpreview

@damdo
Copy link
Member

damdo commented Sep 8, 2025

/assign @theobarberbany @racheljpg

@miyadav
Copy link
Member Author

miyadav commented Sep 8, 2025

/retest

@theobarberbany
Copy link
Contributor

theobarberbany commented Sep 8, 2025

Can we look at using go workspaces, rather than vendoring ccapio in our e2e module.

This pattern has proven to be pauinful, e.g with cloud-provider-gcp. e.g:

  • we can easily forget to bump e2e , so our e2es dont test up to date code
  • we can't e2e pre merge, as we can't do a go get on unmerged code.

openshift/cloud-provider-gcp#91 is an example of how we can do this sort of change.

https://go.dev/doc/tutorial/workspaces is a good tutorial to get the concepts down :)

@theobarberbany
Copy link
Contributor

@miyadav can we also make this a draft until we're ready to test :)

@miyadav miyadav marked this pull request as draft September 8, 2025 10:38
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 8, 2025
@miyadav
Copy link
Member Author

miyadav commented Sep 8, 2025

Can we look at using go workspaces, rather than vendoring ccapio in our e2e module.

This pattern has proven to be pauinful, e.g with cloud-provider-gcp. e.g:

  • we can easily forget to bump e2e , so our e2es dont test up to date code
  • we can't e2e pre merge, as we can't do a go get on unmerged code.

openshift/cloud-provider-gcp#91 is an example of how we can do this sort of change.

https://go.dev/doc/tutorial/workspaces is a good tutorial to get the concepts down :)

when i was trying this was getting weird errors related to baremetal imports , I am not sure why , will review these again and see what could have been missed.

@miyadav
Copy link
Member Author

miyadav commented Sep 8, 2025

/test e2e-metal3-capi-techpreview

1 similar comment
@miyadav
Copy link
Member Author

miyadav commented Sep 8, 2025

/test e2e-metal3-capi-techpreview

@miyadav
Copy link
Member Author

miyadav commented Sep 8, 2025

/test e2e-aws-capi-techpreview

2 similar comments
@miyadav
Copy link
Member Author

miyadav commented Sep 8, 2025

/test e2e-aws-capi-techpreview

@miyadav
Copy link
Member Author

miyadav commented Sep 8, 2025

/test e2e-aws-capi-techpreview

@miyadav
Copy link
Member Author

miyadav commented Sep 8, 2025

/test e2e-metal3-capi-techpreview

@theobarberbany
Copy link
Contributor

when i was trying this was getting weird errors related to baremetal imports , I am not sure why , will review these again and see what could have been missed.

Ah I see - if you share those errors, maybe we can help?

Also I think given this now is a 5000+ line change, we should try to have a separate PR for getting go workspaces configured for e2es, to avoid making the repository history overly complex :)

@miyadav
Copy link
Member Author

miyadav commented Sep 8, 2025

made changes to makefile now , GOWORK=off for earlier failure
@theobarberbany PTAL when time permits , if these are safe changes to makefile

@miyadav
Copy link
Member Author

miyadav commented Sep 9, 2025

when i was trying this was getting weird errors related to baremetal imports , I am not sure why , will review these again and see what could have been missed.

Ah I see - if you share those errors, maybe we can help?

Also I think given this now is a 5000+ line change, we should try to have a separate PR for getting go workspaces configured for e2es, to avoid making the repository history overly complex :)

I have made another PR , once that is reviewed and merged , will rebase this one with it and remove extra complexities we added here. Thanks.

Comment on lines 272 to 288
Eventually(func() error {
// Get the latest version again right before update
currentMachine, err := mapiframework.GetMachine(cl, testMachineName)
if err != nil {
return err
}

updatedMachine := currentMachine.DeepCopy()
updatedMachine.Spec.AuthoritativeAPI = machinev1beta1.MachineAuthorityMachineAPI

err = cl.Update(ctx, updatedMachine)
return err
}, capiframework.WaitMedium, capiframework.RetryMedium).Should(Succeed(), "Should allow modification of authoritativeAPI")

// Verify the machine still exists after the change (don't wait for specific status)
_, err = mapiframework.GetMachine(cl, testMachineName)
Expect(err).ToNot(HaveOccurred(), "Machine should still exist after authoritativeAPI change")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use Komega here?

Suggested change
Eventually(func() error {
// Get the latest version again right before update
currentMachine, err := mapiframework.GetMachine(cl, testMachineName)
if err != nil {
return err
}
updatedMachine := currentMachine.DeepCopy()
updatedMachine.Spec.AuthoritativeAPI = machinev1beta1.MachineAuthorityMachineAPI
err = cl.Update(ctx, updatedMachine)
return err
}, capiframework.WaitMedium, capiframework.RetryMedium).Should(Succeed(), "Should allow modification of authoritativeAPI")
// Verify the machine still exists after the change (don't wait for specific status)
_, err = mapiframework.GetMachine(cl, testMachineName)
Expect(err).ToNot(HaveOccurred(), "Machine should still exist after authoritativeAPI change")
Eventually(k.Update(testMachine, func(){
testMachine.Spec.AuthoritativeAPI = machinev1beta1.MachineAuthorityMachineAPI
}), capiframework.WaitMedium, capiframework.RetryMedium).Should(Succeed(), "Should allow modification of authoritativeAPI")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you Theo using this , need to updated machinev1beta1.MachineAuthorityMachineAPI to machinev1beta1.MachineAuthorityClusterAPI

return err
}, capiframework.WaitMedium, capiframework.RetryMedium).Should(Succeed(), "Should allow modification of authoritativeAPI")

// Verify the machine still exists after the change (don't wait for specific status)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we doing this? I'm not following :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is for ginkgo retry loop in Eventually and other one is after the retries succeed, just to confirm machine update succeeds.

Comment on lines 294 to 296
if testMachine != nil {
mapiframework.DeleteMachines(ctx, cl, testMachine)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason we're not using cleanupMachineResources( as we do below?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test was failing , for cleaning up , some conflict due to machine already in deleted status .. , to check that , just deleted mapi and it was cleaning up properly, we shall use cleanupMachineResources , let me check if that is some bug , or was just test problem.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the above context only using one testMachine, hence cannot use the cleanresources as is , which is used below.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I'm following here 😓 can we see some logs? My reading of cleanupMachineResources shows that we're basically running the same code here and in the function..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant the cleanupresources taking args as capi and mapi machines) - , where as in here , we just check that we can modify authoritativeAPI field and then immediately clean up the mapi machine that is updated , so not needed to explicitly try to delete capimachinename as well , which is why simply deleting the updated mapi machine only.
cleanupMachineResources( ctx, cl, []*clusterv1.Machine{testCapiMachine}, []*machinev1beta1.Machine{testMapiMachine}, )

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just example , it takes like 13s to verify and delete it .

`miyadav@miyadav-thinkpadx1carbongen8:~$ oc get machine.c -w 
NAME                                           CLUSTER                NODENAME   PROVIDERID                              PHASE      AGE   VERSION
machine-auth-change-83955                      miyadav-2209v1-gfzxc                                                      Deleting   13s   
`

Comment on lines 717 to 744
func verifyFieldModificationPrevented(cl client.Client, machineName string, fieldName string, modifyFunc func(*machinev1beta1.Machine)) {
By(fmt.Sprintf("Attempting to modify %s in MAPI Machine", fieldName))

currentMachine, err := mapiframework.GetMachine(cl, machineName)
Expect(err).NotTo(HaveOccurred(), "Failed to get current machine")
updatedMachine := currentMachine.DeepCopy()
modifyFunc(updatedMachine)

err = cl.Update(ctx, updatedMachine)
Expect(err).To(HaveOccurred())
Expect(err.Error()).To(ContainSubstring("ValidatingAdmissionPolicy 'machine-api-machine-vap' with binding 'machine-api-machine-vap' denied request"))
}
func verifyAnnotationModificationPrevented(cl client.Client, machineName string) {
By("Attempting to add annotations machine.openshift.io. to MAPI Machine")

currentMachine, err := mapiframework.GetMachine(cl, machineName)
Expect(err).NotTo(HaveOccurred(), "Failed to get current machine")
updatedMachine := currentMachine.DeepCopy()
if updatedMachine.Annotations == nil {
updatedMachine.Annotations = make(map[string]string)
}
updatedMachine.Annotations[testAnnotationKey] = testAnnotationValue

err = cl.Update(ctx, updatedMachine)
Expect(err).To(HaveOccurred())
Expect(err.Error()).To(ContainSubstring("ValidatingAdmissionPolicy 'machine-api-machine-vap' with binding 'machine-api-machine-vap' denied request"))
Expect(err.Error()).To(ContainSubstring("Cannot add, modify or delete any machine.openshift.io/* annotation"))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These could be made more generic, something like verifyUpdatePrevented that takes an updateFunc in the same way... we could also provide parameters for the expected errors.

verifyAnnotationModificationPrevented then becomes a call to verifyUpdatePrevented

Make sense?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly , why I left them separate , is because i was not sure , how much more such changes ( from testing ) we can expect in the future for annotations and fields , so if we could keep them separate it would be easier if any one need customization later . Let me try if it doesn't make much difference even if it becomes same for both .

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got you now , so if we are calling verifyUpdatePrevented in the others , it would be straightforward to use and will be generic .

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep! keeping our helpers composable will make our lives easier in future :)

providerSpec.SecurityGroups = append(providerSpec.SecurityGroups, machinev1beta1.AWSResourceReference{
ID: ptr.To(testSecurityGroupID),
})
})).To(Succeed(), "Failed to update SecurityGroups in provider spec")
Copy link
Contributor

@theobarberbany theobarberbany Sep 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these messages don't make sense: we're expecting a field update to be prevented, so in this case a failure would be suceeding to update a field.

I think something like "Expected failure attempting to update SecurityGroups in provider spec" makes more sense.

1.Style nit - passing function directly to Expect instead of catching error as seperate line of code
2.Use Komega , updated a logical error
3.Used seperate mapimachineclean up and not cleanupresources and the mapi machine was only needed(testMachine)
4.Created verifyUpdatePrevented to be generic function called in annotation and field functions
5.Assertion messages were misleading updated them
Changes suggested by: [email protected]
@theobarberbany
Copy link
Contributor

@coderabbitai full review

@theobarberbany
Copy link
Contributor

@coderabbitai review

@miyadav
Copy link
Member Author

miyadav commented Sep 19, 2025

/retest

Comment on lines 705 to 708
currentMachine, err := mapiframework.GetMachine(cl, machineName)
Expect(err).NotTo(HaveOccurred(), "Failed to get current machine")
updatedMachine := currentMachine.DeepCopy()
updateFunc(updatedMachine)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all of this could be just passing the updatefunc and machine to a k.Update, and we just pass the machine in question (rather than always passing machine.Name)

					Eventually(k.Update(machine, updateFunc), timeout).Should(MatchError(ContainSubstring(expectedErrorSubstrings)))

and we move the logic for working out what expectedErrorSubstrings should be to above the call.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me add this .

Takes the machine object directly (no need to fetch by name)
Uses komega.Update for consistent async behavior
Accepts variable error substrings for flexible validation
@theobarberbany
Copy link
Contributor

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Sep 22, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@openshift-ci-robot
Copy link

openshift-ci-robot commented Sep 22, 2025

@miyadav: This pull request references OCPCLOUD-2807 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

This pull request references OCPCLOUD-2566 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target either version "4.21." or "openshift-4.21.", but it targets "openshift-4.20" instead.

In response to this:

@huali9 @damdo @theobarberbany , thanks for already reviewing this earlier ( #343) , it became too long , hence made a new one , changes are same.

Summary by CodeRabbit

  • New Features

  • None

  • Tests

  • Added end-to-end machine migration tests covering authoritative vs non-authoritative updates, immutability of protected fields, annotation protections, mirrored machine existence, and helper utilities for provider-spec updates and policy assertions.

  • Chores

  • Updated vendoring and dependencies (added/removed indirect deps) and prefer local operator during e2e builds by wiring a local replacement.

  • Documentation

  • None

  • Refactor

  • None

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between e404940 and f0cd053.

📒 Files selected for processing (1)
  • e2e/machine_migration_test.go (5 hunks)
🔇 Additional comments (9)
e2e/machine_migration_test.go (9)

5-5: LGTM! Necessary imports and utilities added.

The imports for encoding/json, mapi2capi conversion utilities, and ptr helper are appropriate for the new test functionality involving provider spec manipulation and JSON marshaling.

Also applies to: 15-15, 26-27


29-43: LGTM! Well-structured test constants.

The constants are appropriately scoped with descriptive names and realistic test values that align with AWS provider configuration requirements.


286-415: LGTM! Comprehensive VAP tests with good coverage.

The new test context provides excellent coverage for Validating Admission Policy enforcement on non-authoritative MAPI machines. The tests systematically verify that modifications to various AWS provider configuration fields are properly denied, which aligns with the expected behavior when ClusterAPI has authority.


685-700: LGTM! Well-structured provider spec update utility.

The function properly extracts the AWS provider spec, applies the update function, marshals the result back to JSON, and updates the machine's ProviderSpec. The error handling is appropriate.


701-714: LGTM! Generic verification utility with good defaults.

The function provides a clean abstraction for testing update prevention with sensible default error messages for VAP denials.


716-718: LGTM! Simple and focused helper function.

The function provides a clean wrapper for field modification testing.


720-727: LGTM! Appropriate annotation-specific test.

The function correctly tests annotation modification prevention with specific error messages for the machine.openshift.io namespace, which is expected behavior for VAP enforcement.


728-731: LGTM! Simple existence verification utility.

The function provides a straightforward verification that a CAPI machine exists.


733-746: LGTM! Proper provider spec extraction with reused utilities.

The function correctly leverages the existing mapi2capi.AWSProviderSpecFromRawExtension utility to avoid code duplication, with appropriate error handling for edge cases.

Comment on lines 260 to 285
var _ = Describe("Update MAPI Machine", Ordered, func() {
Context("with spec.authoritativeAPI modification", func() {
It("should allow modification of authoritativeAPI from ClusterAPI to MachineAPI", func() {
By("Attempting to modify authoritativeAPI from ClusterAPI to MachineAPI")

// Create a new machine for this test to avoid affecting other tests
testMachineName := "machine-auth-change-83955"
testMachine := createMAPIMachineWithAuthority(ctx, cl, testMachineName, machinev1beta1.MachineAuthorityClusterAPI)

_, err := capiframework.GetMachine(cl, testMachineName, capiframework.CAPINamespace)
Expect(err).ToNot(HaveOccurred(), "CAPI Machine should be created")
// Ensure any pending updates are processed
Eventually(komega.Update(testMachine, func() {
testMachine.Spec.AuthoritativeAPI = machinev1beta1.MachineAuthorityClusterAPI
}), capiframework.WaitMedium, capiframework.RetryMedium).Should(Succeed(), "Should allow modification of authoritativeAPI")

// Clean up the test machine
DeferCleanup(func() {
By("Cleaning up test machine")
// Delete the MAPI machine
if testMachine != nil {
mapiframework.DeleteMachines(ctx, cl, testMachine)
}
})
})
})
Copy link

@coderabbitai coderabbitai bot Sep 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

**Fix the authoritativeAPI test logic.**The test is attempting to modify the authoritativeAPI from ClusterAPI to ClusterAPI (same value), but the comment and test description indicate it should be testing a change from ClusterAPI to MachineAPI. The current logic doesn't match the test intention.

Apply this diff to fix the test logic:

-				// Ensure any pending updates are processed
-				Eventually(komega.Update(testMachine, func() {
-					testMachine.Spec.AuthoritativeAPI = machinev1beta1.MachineAuthorityClusterAPI
-				}), capiframework.WaitMedium, capiframework.RetryMedium).Should(Succeed(), "Should allow modification of authoritativeAPI")
+				// Change authoritativeAPI from ClusterAPI to MachineAPI
+				Eventually(komega.Update(testMachine, func() {
+					testMachine.Spec.AuthoritativeAPI = machinev1beta1.MachineAuthorityMachineAPI
+				}), capiframework.WaitMedium, capiframework.RetryMedium).Should(Succeed(), "Should allow modification of authoritativeAPI from ClusterAPI to MachineAPI")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
var _ = Describe("Update MAPI Machine", Ordered, func() {
Context("with spec.authoritativeAPI modification", func() {
It("should allow modification of authoritativeAPI from ClusterAPI to MachineAPI", func() {
By("Attempting to modify authoritativeAPI from ClusterAPI to MachineAPI")
// Create a new machine for this test to avoid affecting other tests
testMachineName := "machine-auth-change-83955"
testMachine := createMAPIMachineWithAuthority(ctx, cl, testMachineName, machinev1beta1.MachineAuthorityClusterAPI)
_, err := capiframework.GetMachine(cl, testMachineName, capiframework.CAPINamespace)
Expect(err).ToNot(HaveOccurred(), "CAPI Machine should be created")
// Ensure any pending updates are processed
Eventually(komega.Update(testMachine, func() {
testMachine.Spec.AuthoritativeAPI = machinev1beta1.MachineAuthorityClusterAPI
}), capiframework.WaitMedium, capiframework.RetryMedium).Should(Succeed(), "Should allow modification of authoritativeAPI")
// Clean up the test machine
DeferCleanup(func() {
By("Cleaning up test machine")
// Delete the MAPI machine
if testMachine != nil {
mapiframework.DeleteMachines(ctx, cl, testMachine)
}
})
})
})
var _ = Describe("Update MAPI Machine", Ordered, func() {
Context("with spec.authoritativeAPI modification", func() {
It("should allow modification of authoritativeAPI from ClusterAPI to MachineAPI", func() {
By("Attempting to modify authoritativeAPI from ClusterAPI to MachineAPI")
// Create a new machine for this test to avoid affecting other tests
testMachineName := "machine-auth-change-83955"
testMachine := createMAPIMachineWithAuthority(ctx, cl, testMachineName, machinev1beta1.MachineAuthorityClusterAPI)
_, err := capiframework.GetMachine(cl, testMachineName, capiframework.CAPINamespace)
Expect(err).ToNot(HaveOccurred(), "CAPI Machine should be created")
// Change authoritativeAPI from ClusterAPI to MachineAPI
Eventually(komega.Update(testMachine, func() {
testMachine.Spec.AuthoritativeAPI = machinev1beta1.MachineAuthorityMachineAPI
}), capiframework.WaitMedium, capiframework.RetryMedium).Should(Succeed(), "Should allow modification of authoritativeAPI from ClusterAPI to MachineAPI")
// Clean up the test machine
DeferCleanup(func() {
By("Cleaning up test machine")
// Delete the MAPI machine
if testMachine != nil {
mapiframework.DeleteMachines(ctx, cl, testMachine)
}
})
})
})
🤖 Prompt for AI Agents
In e2e/machine_migration_test.go around lines 260 to 285, the test currently
sets Spec.AuthoritativeAPI to MachineAuthorityClusterAPI again (no-op) but
should change it to MachineAuthorityMachineAPI to match the test description;
update the Eventually(komega.Update(...)) closure to assign
testMachine.Spec.AuthoritativeAPI = machinev1beta1.MachineAuthorityMachineAPI
and adjust the assertion message to reflect that the test is attempting to
change authoritativeAPI from ClusterAPI to MachineAPI.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@huali9 I am removing this code based on your PR - /pull/366 , it seems it is repeated here.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 23, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from racheljpg. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot
Copy link

openshift-ci-robot commented Sep 23, 2025

@miyadav: This pull request references OCPCLOUD-2807 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

This pull request references OCPCLOUD-2566 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target either version "4.21." or "openshift-4.21.", but it targets "openshift-4.20" instead.

In response to this:

@huali9 @damdo @theobarberbany , thanks for already reviewing this earlier ( #343) , it became too long , hence made a new one , changes are same.

Summary by CodeRabbit

  • New Features

  • None

  • Tests

  • Added e2e machine-migration tests validating authoritative vs non-authoritative updates, immutability of protected fields, annotation protections, mirrored-machine existence, and helper utilities for provider-spec updates and admission assertions.

  • Chores

  • Updated vendoring and dependencies (added/removed indirect deps) and configured e2e to prefer a local operator replacement for builds.

  • Documentation

  • None

  • Refactor

  • None

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (7)
e2e/machine_migration_test.go (7)

288-294: Fix error message inconsistency.

The error message says "Expected failure attempting to update InstanceType" but this appears in the Success assertion. The message should indicate that the ProviderSpec update succeeded, not that a failure was expected.

Apply this diff to fix the error message:

-						})).To(Succeed(), "Expected failure attempting to update  InstanceType in provider spec")
+						})).To(Succeed(), "Failed to update InstanceType in provider spec")

307-312: Fix error message inconsistency.

Same issue as above - the error message incorrectly states "Expected failure" in a Success assertion.

Apply this diff:

-						})).To(Succeed(), "Expected failure attempting to update  AMI ID in provider spec")
+						})).To(Succeed(), "Failed to update AMI ID in provider spec")

314-322: Fix error message and add nil check for block devices.

Two issues:

  1. Error message inconsistency (same as above)
  2. The test assumes block devices exist but doesn't handle the case when the array is empty

Apply this diff:

 						Expect(updateAWSMachineProviderSpec(machine, func(providerSpec *machinev1beta1.AWSMachineProviderConfig) {
 							if len(providerSpec.BlockDevices) > 0 && providerSpec.BlockDevices[0].EBS != nil {
 								providerSpec.BlockDevices[0].EBS.Encrypted = ptr.To(false)
+							} else {
+								// Create a block device if none exists for testing
+								providerSpec.BlockDevices = []machinev1beta1.BlockDeviceMappingSpec{
+									{
+										EBS: &machinev1beta1.EBSBlockDeviceSpec{
+											Encrypted: ptr.To(false),
+										},
+									},
+								}
 							}
-						})).To(Succeed(), "Expected failure attempting to update  block device encryption in provider spec")
+						})).To(Succeed(), "Failed to update block device encryption in provider spec")

324-332: Fix error message and add nil check.

Same issues as the block device encryption test.

Apply this diff:

 						Expect(updateAWSMachineProviderSpec(machine, func(providerSpec *machinev1beta1.AWSMachineProviderConfig) {
 							if len(providerSpec.BlockDevices) > 0 && providerSpec.BlockDevices[0].EBS != nil {
 								providerSpec.BlockDevices[0].EBS.VolumeSize = ptr.To(testVolumeSize)
+							} else {
+								providerSpec.BlockDevices = []machinev1beta1.BlockDeviceMappingSpec{
+									{
+										EBS: &machinev1beta1.EBSBlockDeviceSpec{
+											VolumeSize: ptr.To(testVolumeSize),
+										},
+									},
+								}
 							}
-						})).To(Succeed(), "Expected failure attempting to update  VolumeSize in provider spec")
+						})).To(Succeed(), "Failed to update VolumeSize in provider spec")

334-342: Fix error message and add nil check.

Same pattern of issues.

Apply this diff:

 						Expect(updateAWSMachineProviderSpec(machine, func(providerSpec *machinev1beta1.AWSMachineProviderConfig) {
 							if len(providerSpec.BlockDevices) > 0 && providerSpec.BlockDevices[0].EBS != nil {
 								providerSpec.BlockDevices[0].EBS.VolumeType = ptr.To(testVolumeType)
+							} else {
+								providerSpec.BlockDevices = []machinev1beta1.BlockDeviceMappingSpec{
+									{
+										EBS: &machinev1beta1.EBSBlockDeviceSpec{
+											VolumeType: ptr.To(testVolumeType),
+										},
+									},
+								}
 							}
-						})).To(Succeed(), "Expected failure attempting to update  VolumeType in provider spec")
+						})).To(Succeed(), "Failed to update VolumeType in provider spec")

344-349: Fix error messages in remaining provider spec tests.

All remaining provider spec modification tests have the same "Expected failure" error message inconsistency.

Apply these fixes for consistency:

# AvailabilityZone test:
-						})).To(Succeed(), "Expected failure attempting to update  AvailabilityZone in provider spec")
+						})).To(Succeed(), "Failed to update AvailabilityZone in provider spec")

# Subnet test:
-						})).To(Succeed(), "Expected failure attempting to update  Subnet in provider spec")
+						})).To(Succeed(), "Failed to update Subnet in provider spec")

# SecurityGroups test:
-						})).To(Succeed(), "Expected failure attempting to update  SecurityGroups in provider spec")
+						})).To(Succeed(), "Failed to update SecurityGroups in provider spec")

# Tags test:
-						})).To(Succeed(), "Expected failure attempting to update  Tags in provider spec")
+						})).To(Succeed(), "Failed to update Tags in provider spec")

# capacityReservationId test:
-						})).To(Succeed(), "Expected failure attempting to update  capacityReservationId in provider spec")
+						})).To(Succeed(), "Failed to update capacityReservationId in provider spec")

Also applies to: 352-357, 360-367, 370-378, 381-386


708-721: Consider adding validation for nil provider spec.

The function returns an error for nil provider spec value, but it might be helpful to validate that the machine itself is not nil.

Consider adding a nil check for the machine parameter:

 func getAWSProviderSpecFromMachine(machine *machinev1beta1.Machine) (*machinev1beta1.AWSMachineProviderConfig, error) {
+	if machine == nil {
+		return nil, fmt.Errorf("machine is nil")
+	}
 	if machine.Spec.ProviderSpec.Value == nil {
 		return nil, fmt.Errorf("provider spec value is nil")
 	}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between f0cd053 and 91455a4.

📒 Files selected for processing (1)
  • e2e/machine_migration_test.go (5 hunks)
🔇 Additional comments (8)
e2e/machine_migration_test.go (8)

260-282: Test setup looks good with proper cleanup.

The test context correctly sets up a MAPI Machine with ClusterAPI authority and includes appropriate cleanup in DeferCleanup. The setup ensures the CAPI Machine is created and running before proceeding with tests.


284-286: LGTM! Appropriate verification for CAPI Machine existence.

The test correctly verifies that the non-authoritative MAPI Machine has a corresponding authoritative CAPI Machine mirror before attempting modification tests.


29-43: Well-structured test constants.

Good practice using const block for test values. The constants are descriptive and appropriately grouped.


660-675: Clean helper function for provider spec updates.

The updateAWSMachineProviderSpec function properly encapsulates the marshal/unmarshal logic and error handling.


676-689: Well-designed generic verification helper.

The verifyUpdatePrevented function provides good abstraction with sensible defaults for VAP error messages.


695-702: Good specialized helper with appropriate error messages.

The verifyAnnotationModificationPrevented function correctly includes both the generic VAP error and the specific annotation restriction message.


703-706: Simple and effective verification helper.

The verifyCAPIMachineExists function is appropriately concise for its purpose.


5-6: New imports are appropriate for the test functionality.

The added imports (encoding/json, mapi2capi conversion package, and ptr utilities) are all necessary for the new test functionality.

Also applies to: 15-15, 26-27

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Sep 23, 2025

@miyadav: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-metal3-capi-techpreview b10077f link false /test e2e-metal3-capi-techpreview
ci/prow/e2e-openstack-ovn-techpreview e404940 link true /test e2e-openstack-ovn-techpreview
ci/prow/e2e-openstack-capi-techpreview e404940 link true /test e2e-openstack-capi-techpreview
ci/prow/okd-scos-e2e-aws-ovn 91455a4 link false /test okd-scos-e2e-aws-ovn

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@miyadav
Copy link
Member Author

miyadav commented Sep 23, 2025

/retest-required

@miyadav
Copy link
Member Author

miyadav commented Sep 23, 2025

/hold
Waiting for #364 to complete , looks like we could rebase with that could use some of those functions .

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 23, 2025
It("should prevent modification of encryption for block devices in non-authoritative MAPI Machine", func() {
verifyFieldModificationPrevented(testMapiMachine, "block device encryption", func(machine *machinev1beta1.Machine) {
Expect(updateAWSMachineProviderSpec(machine, func(providerSpec *machinev1beta1.AWSMachineProviderConfig) {
if len(providerSpec.BlockDevices) > 0 && providerSpec.BlockDevices[0].EBS != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have some guarantees that the block devices stanza in the provider spec exists? If it doesn't this'll silently pass? Is that the behaviour we want?

@theobarberbany
Copy link
Contributor

@coderabbitai full review

@coderabbitai
Copy link

coderabbitai bot commented Sep 23, 2025

✅ Actions performed

Full review triggered.

@openshift-ci-robot
Copy link

openshift-ci-robot commented Sep 23, 2025

@miyadav: This pull request references OCPCLOUD-2807 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

This pull request references OCPCLOUD-2566 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target either version "4.21." or "openshift-4.21.", but it targets "openshift-4.20" instead.

In response to this:

@huali9 @damdo @theobarberbany , thanks for already reviewing this earlier ( #343) , it became too long , hence made a new one , changes are same.

Summary by CodeRabbit

  • New Features

  • None

  • Tests

  • Added e2e machine-migration tests verifying that non-authoritative fields and protected annotations cannot be modified, ensuring mirrored-machine presence and admission rejections for disallowed updates.

  • Chores

  • Updated vendoring and dependency list; configured e2e to prefer a local operator replacement for builds.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (5)
e2e/go.mod (1)

15-15: Local replace to ../: OK for CI; reconcile with Go workspaces plan

The explicit require plus replace github.com/openshift/cluster-capi-operator => ../ is reasonable to test against local sources. Once the workspace PR (#358) merges, consider removing this replace and relying on a Go workspace to avoid baking path-specific assumptions into the module.

Can you confirm CI sets GOWORK=off (as mentioned in comments) to avoid conflicts, and that go mod vendor was refreshed after this change?

Also applies to: 30-31

e2e/machine_migration_test.go (4)

325-331: Ensure VolumeSize mutation is not a no-op

Create EBS if needed so the update always changes something.

Apply this diff:

-                       Expect(updateAWSMachineProviderSpec(machine, func(providerSpec *machinev1beta1.AWSMachineProviderConfig) {
-                           if len(providerSpec.BlockDevices) > 0 && providerSpec.BlockDevices[0].EBS != nil {
-                               providerSpec.BlockDevices[0].EBS.VolumeSize = ptr.To(testVolumeSize)
-                           }
-                       })).To(Succeed(), "Expected failure attempting to update  VolumeSize in provider spec")
+                       Expect(updateAWSMachineProviderSpec(machine, func(providerSpec *machinev1beta1.AWSMachineProviderConfig) {
+                           if len(providerSpec.BlockDevices) == 0 {
+                               providerSpec.BlockDevices = []machinev1beta1.BlockDeviceMappingSpec{{}}
+                           }
+                           if providerSpec.BlockDevices[0].EBS == nil {
+                               providerSpec.BlockDevices[0].EBS = &machinev1beta1.EBSBlockDeviceSpec{}
+                           }
+                           providerSpec.BlockDevices[0].EBS.VolumeSize = ptr.To(testVolumeSize)
+                       })).To(Succeed(), "Failed to prepare provider spec mutation for VolumeSize")

335-341: Ensure VolumeType mutation is not a no-op

Same rationale as VolumeSize.

Apply this diff:

-                       Expect(updateAWSMachineProviderSpec(machine, func(providerSpec *machinev1beta1.AWSMachineProviderConfig) {
-                           if len(providerSpec.BlockDevices) > 0 && providerSpec.BlockDevices[0].EBS != nil {
-                               providerSpec.BlockDevices[0].EBS.VolumeType = ptr.To(testVolumeType)
-                           }
-                       })).To(Succeed(), "Expected failure attempting to update  VolumeType in provider spec")
+                       Expect(updateAWSMachineProviderSpec(machine, func(providerSpec *machinev1beta1.AWSMachineProviderConfig) {
+                           if len(providerSpec.BlockDevices) == 0 {
+                               providerSpec.BlockDevices = []machinev1beta1.BlockDeviceMappingSpec{{}}
+                           }
+                           if providerSpec.BlockDevices[0].EBS == nil {
+                               providerSpec.BlockDevices[0].EBS = &machinev1beta1.EBSBlockDeviceSpec{}
+                           }
+                           providerSpec.BlockDevices[0].EBS.VolumeType = ptr.To(testVolumeType)
+                       })).To(Succeed(), "Failed to prepare provider spec mutation for VolumeType")

684-689: Match all provided error substrings for stricter assertions

verifyUpdatePrevented ignores additional expected substrings. Use all when provided to verify both the VAP banner and the custom reason (e.g., annotation guard).

Apply this diff:

-       Eventually(komega.Update(machine, func() {
-           updateFunc(machine)
-       }), capiframework.WaitMedium, capiframework.RetryMedium).Should(
-           MatchError(ContainSubstring(expectedErrorSubstrings[0])),
-       )
+       // Build a matcher: if multiple substrings are provided, require all; otherwise match the single default.
+       var m types.GomegaMatcher
+       if len(expectedErrorSubstrings) <= 1 {
+           m = MatchError(ContainSubstring(expectedErrorSubstrings[0]))
+       } else {
+           ms := make([]types.GomegaMatcher, 0, len(expectedErrorSubstrings))
+           for _, s := range expectedErrorSubstrings {
+               ms = append(ms, MatchError(ContainSubstring(s)))
+           }
+           m = SatisfyAll(ms...)
+       }
+       Eventually(komega.Update(machine, func() {
+           updateFunc(machine)
+       }), capiframework.WaitMedium, capiframework.RetryMedium).Should(m)

Also applies to: 696-702


704-706: Make CAPIMachine existence check resilient

Race-safe check avoids flake right after creation.

Apply this diff:

-func verifyCAPIMachineExists(cl client.Client, machineName string) {
-    _, err := capiframework.GetMachine(cl, machineName, capiframework.CAPINamespace)
-    Expect(err).ToNot(HaveOccurred(), "CAPI Machine should exist")
-}
+func verifyCAPIMachineExists(cl client.Client, machineName string) {
+    Eventually(func() error {
+        _, err := capiframework.GetMachine(cl, machineName, capiframework.CAPINamespace)
+        return err
+    }, capiframework.WaitMedium, capiframework.RetryMedium).Should(Succeed(), "CAPI Machine should exist")
+}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 5d20769 and 91455a4.

⛔ Files ignored due to path filters (1)
  • e2e/go.sum is excluded by !**/*.sum
📒 Files selected for processing (3)
  • e2e/go.mod (4 hunks)
  • e2e/machine_migration_test.go (5 hunks)
  • vendor/modules.txt (1 hunks)
🔇 Additional comments (3)
vendor/modules.txt (1)

1123-1125: Vendored local replace (=> .): confirm this is intentional and temporary

Vendoring the root module via => . is fine for CI, but please confirm this is temporary until the workspace PR (#358) lands. Ensure go mod vendor was run after the change so the vendor tree is consistent.

e2e/go.mod (1)

62-62: New indirects: verify necessity and prune if unused

Double-check these new indirects are required by the new tests and not residuals from transient experiments. If unused, prune via go mod tidy to keep the graph lean.

Also applies to: 69-69, 118-118, 125-125

e2e/machine_migration_test.go (1)

315-321: Ensure a deterministic BlockDevices mutation (avoid no-op leading to flaky test)

If BlockDevices/EBS are absent, this code performs no change and the admission won’t trigger, causing flakes. Ensure the field path exists and always mutate.

Apply this diff:

-                       Expect(updateAWSMachineProviderSpec(machine, func(providerSpec *machinev1beta1.AWSMachineProviderConfig) {
-                           if len(providerSpec.BlockDevices) > 0 && providerSpec.BlockDevices[0].EBS != nil {
-                               providerSpec.BlockDevices[0].EBS.Encrypted = ptr.To(false)
-                           }
-                       })).To(Succeed(), "Expected failure attempting to update  block device encryption in provider spec")
+                       Expect(updateAWSMachineProviderSpec(machine, func(providerSpec *machinev1beta1.AWSMachineProviderConfig) {
+                           // Ensure we actually change something even if BlockDevices/EBS are missing
+                           if len(providerSpec.BlockDevices) == 0 {
+                               providerSpec.BlockDevices = []machinev1beta1.BlockDeviceMappingSpec{{}}
+                           }
+                           if providerSpec.BlockDevices[0].EBS == nil {
+                               providerSpec.BlockDevices[0].EBS = &machinev1beta1.EBSBlockDeviceSpec{}
+                           }
+                           // Toggle or set to ensure a diff
+                           if providerSpec.BlockDevices[0].EBS.Encrypted != nil {
+                               providerSpec.BlockDevices[0].EBS.Encrypted = ptr.To(!*providerSpec.BlockDevices[0].EBS.Encrypted)
+                           } else {
+                               providerSpec.BlockDevices[0].EBS.Encrypted = ptr.To(false)
+                           }
+                       })).To(Succeed(), "Failed to prepare provider spec mutation for block device encryption")

Comment on lines +289 to +293
verifyFieldModificationPrevented(testMapiMachine, "InstanceType", func(machine *machinev1beta1.Machine) {
Expect(updateAWSMachineProviderSpec(machine, func(providerSpec *machinev1beta1.AWSMachineProviderConfig) {
providerSpec.InstanceType = testInstanceType
})).To(Succeed(), "Expected failure attempting to update InstanceType in provider spec")
})
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick

Fix misleading assertion messages (“Expected failure …”)

These Expect(...).To(Succeed(), "Expected failure ...") messages contradict the assertion (we expect success preparing the mutation and failure on update). Adjust the messages to reflect preparation success.

Apply this diff (example for two occurrences; please mirror for the others accordingly):

@@
-                       })).To(Succeed(), "Expected failure attempting to update  InstanceType in provider spec")
+                       })).To(Succeed(), "Failed to prepare provider spec mutation for InstanceType")
@@
-                       })).To(Succeed(), "Expected failure attempting to update  AMI ID in provider spec")
+                       })).To(Succeed(), "Failed to prepare provider spec mutation for AMI ID")

Also applies to: 307-311, 345-349, 353-357, 361-367, 371-378, 382-386

🤖 Prompt for AI Agents
In e2e/machine_migration_test.go around lines 289-293 (and similarly at 307-311,
345-349, 353-357, 361-367, 371-378, 382-386), the Expect(...).To(Succeed(),
"Expected failure ...") messages are misleading because the call is supposed to
succeed when preparing the mutation; update the assertion messages to reflect
successful preparation (e.g., "Expected to prepare provider spec mutation" or
"Expected update helper to succeed") so the log matches the assertion intent for
each occurrence.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants