OCPCLOUD-2775: add cluster api autoscaler integration enhancement #1736

elmiko · 2025-01-15T15:58:13Z

this enhancement describes how we will integrate the cluster autoscaler, and related controllers, with the Cluster API machine management layer.

openshift-ci-robot · 2025-01-15T15:58:25Z

@elmiko: This pull request references OCPCLOUD-2775 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set.

In response to this:

this enhancement describes how we will integrate the cluster autoscaler, and related controllers, with the Cluster API machine management layer.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-ci · 2025-01-15T15:59:47Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign ashcrow for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

elmiko · 2025-01-16T17:34:35Z

i'm not sure why it's barfing on the metadata

elmiko · 2025-01-16T18:45:46Z

figured it out, needed quoting on the github handles

JoelSpeed · 2025-01-22T15:38:40Z

enhancements/cluster-api/cluster-autoscaler-integration-with-openshift-cluster-api.md

+version and would allow us to drop some patches we are carrying. The Cluster
+API MachineSet sync controller will be updated to recognize when the
+Cluster Autoscaler has made a change to a Cluster API resource and then sync
+the change to the corresponding Machine API resource, regardless of which resource
+is authoritative.


Would be good to clarify exactly what kind of writes the CAS would be making, am I right in thinking that it's just the scale subresource?

yes, only the scale subresource.

So we could possibly gate scale subresource updates in a different way to other writes 🤔

going through this again, i think the capi provider can also write an annotation when it expects to delete a node. it will add an annotation to the machine as well so that capi knows which machine to remove.

reviewing the example role we have in the upstream, i'm going to need to review the code a little more to see which resources we expect to update.

    rules:
    - apiGroups:
      - cluster.x-k8s.io
      resources:
      - machinedeployments
      - machinedeployments/scale
      - machines
      - machinesets
      - machinepools
      verbs:
      - get
      - list
      - update
      - watch
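To make the two kinds of writes discussed in this thread concrete, here is a rough sketch. It assumes the autoscaler scales through the standard `autoscaling/v1` `Scale` object and that the upstream Cluster API provider marks the machine it expects to remove with the `cluster.x-k8s.io/delete-machine` annotation. The resource names and the annotation value are made up for illustration, and the Machine is shown as a metadata-only fragment.

```yaml
# Body of an update to the "scale" subresource of a MachineSet.
apiVersion: autoscaling/v1
kind: Scale
metadata:
  name: worker-somezone-2            # hypothetical MachineSet name
  namespace: openshift-cluster-api
spec:
  replicas: 3
---
# Fragment of a Machine after the provider has marked it for removal.
# The annotation key is an assumption based on the upstream provider.
apiVersion: cluster.x-k8s.io/v1beta1
kind: Machine
metadata:
  name: worker-somezone-2-abcde      # hypothetical Machine name
  namespace: openshift-cluster-api
  annotations:
    cluster.x-k8s.io/delete-machine: "2025-01-22 15:38:40 +0000 UTC"  # illustrative value
```

If both writes are confirmed in the code, any gating of scale subresource updates would also have to account for the annotation update on `machines`.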

JoelSpeed · 2025-01-22T15:44:30Z

enhancements/cluster-api/cluster-autoscaler-integration-with-openshift-cluster-api.md

+locate the resource. The Cluster API MachineSet sync controller will be updated
+to ensure that when the Cluster Autoscaler Operator adds the autoscaling
+annotations that they are copied to any related resources, regardless of which
+is authoritative.


I think in this case, since we own the CAO, we don't necessarily need an exception within the CAPI sync controller, and could handle this in CAO. I would expect CAO to look at a MAPI MachineSet, and check if it's authoritative, and then apply the annotations correctly

Will it still be annotations on the CAPI side?

the annotations are available on the CAPI side, we will need to migrate a few of them. eventually we will want the infrastructure templates to carry the capacity info in their status field.

Do we have an estimated timeline on having the scale information directly in the status?

no timeline has ever been proposed. this feature was added as "opt-in", providers are not required to make these changes.
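For context, the two ways of carrying scale-from-zero capacity mentioned in this thread could look roughly like the sketch below. The annotation keys and the `status.capacity` layout follow the upstream cluster autoscaler Cluster API provider as best understood here and should be treated as assumptions; `ExampleMachineTemplate` is a hypothetical infrastructure kind, and all names and values are illustrative.

```yaml
# Today: capacity hints carried as annotations on the CAPI MachineSet.
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineSet
metadata:
  name: worker-somezone-2            # hypothetical name
  namespace: openshift-cluster-api
  annotations:
    capacity.cluster-autoscaler.kubernetes.io/cpu: "4"
    capacity.cluster-autoscaler.kubernetes.io/memory: "16G"
---
# Opt-in alternative: the infrastructure template reports capacity in status.
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: ExampleMachineTemplate         # hypothetical provider kind
metadata:
  name: worker-somezone-2
  namespace: openshift-cluster-api
status:
  capacity:
    cpu: "4"
    memory: 16G
```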

JoelSpeed · 2025-01-28T17:41:18Z

enhancements/cluster-api/cluster-autoscaler-integration-with-openshift-cluster-api.md

+  * A provider MachineSet controller has added the scale from zero annotations to a
+    non-authoritative record. This occurs when the Cluster API resource is marked as
+    authoritative but the Machine API resource is updated by the provider MachineSet controller.
+    In these cases the scale from zero annotations will be copied to the non-authoritative
+    Cluster API resource. The data from the MachineSet controller is only applied to
+    Machine API resources currently.


Is there an equivalent of this controller in CAPI? Or, if not, is it on the roadmap? If it is on the roadmap, we will want to ensure these controllers follow the same pausing as the rest of the controllers

the last time i looked, most if not all of the providers also package MachineSet actuators, but we actually want to promote a different behavior in the upstream. we want upstream providers to implement infrastructure template controllers that add the capacity information to the status on the infrastructure template, not as annotations on the MachineSet or MachineDeployment.

Do we have a timeline for seeing something like this directly in the upstream?

same as previous reply, a timeline was never proposed for this as it is "opt-in".

correct me if i'm wrong @JoelSpeed , but my understanding is that the MAPI machineset actuators will not be running when CAPI is enabled for a platform. so, there should be no need to reference these annotations?

There will be some period where both sets of controllers are running, but one side will be paused. Eventually we will want to remove the downstream implementations so having an upstream plan will allow us to plan when these could be removed

openshift-bot · 2025-03-28T01:15:06Z

Inactive enhancement proposals go stale after 28d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle stale.
Stale proposals rot after an additional 7d of inactivity and eventually close.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle stale

openshift-bot · 2025-04-04T08:45:36Z

Stale enhancement proposals rot after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Rotten proposals close after an additional 7d of inactivity.
Exclude this proposal from closing by commenting /lifecycle frozen.

If this proposal is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-bot · 2025-04-12T00:15:51Z

Rotten enhancement proposals close after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Reopen the proposal by commenting /reopen.
Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Exclude this proposal from closing again by commenting /lifecycle frozen.

/close

openshift-ci · 2025-04-12T00:16:02Z

@openshift-bot: Closed this PR.

In response to this:

Rotten enhancement proposals close after 7d of inactivity.

See https://github.com/openshift/enhancements#life-cycle for details.

Reopen the proposal by commenting /reopen.
Mark the proposal as fresh by commenting /remove-lifecycle rotten.
Exclude this proposal from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

elmiko · 2025-05-22T20:04:01Z

guess we lost track of this.

/reopen
/remove-lifecycle rotten

openshift-ci · 2025-05-22T20:04:20Z

@elmiko: Reopened this PR.

In response to this:

guess we lost track of this.

/reopen
/remove-lifecycle rotten

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-ci-robot · 2025-05-22T20:04:22Z

@elmiko: This pull request references OCPCLOUD-2775 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.20.0" version, but no target version was set.

In response to this:

this enhancement describes how we will integrate the cluster autoscaler, and related controllers, with the Cluster API machine management layer.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

elmiko · 2025-05-29T21:14:02Z

i'm coming back to review this again, will post an update in the near future.

openshift-ci · 2025-06-11T21:32:13Z

@elmiko: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

elmiko · 2025-06-12T12:42:01Z

i've updated the text in response to the comments here.

JoelSpeed

Given we are going to have a staggered approach where some platforms GA before others, do you think this will add significant complexity? We will need the new behaviour depending on whether a feature gate is enabled or not?

JoelSpeed · 2025-06-16T11:22:43Z

enhancements/cluster-api/cluster-autoscaler-integration-with-openshift-cluster-api.md

+annotations that they are copied to any related resources, regardless of which
+is authoritative.
+
+Update the Cluster API MachineSet sync controller to recognize the


Perhaps instead the sync controller needs an update to convert the MAPI keyed annotations to CAPI keyed annotations?

That way we wouldn't have the MAPI keys on the CAPI resources

an interesting thought. we currently have the CAO doing the conversions for us, but it might make sense to have the sync controller also be able to do this work when it is syncing the resources.
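As an illustration of the conversion being discussed, the sketch below shows the same autoscaling limits under MAPI keyed and CAPI keyed annotations. The keys are the ones the OpenShift and upstream autoscaler providers are understood to read, so treat them as assumptions to verify against the code; the MachineSet name and the limits are made up.

```yaml
# MAPI MachineSet: limits written under the machine.openshift.io prefix.
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: worker-somezone-1            # hypothetical name
  namespace: openshift-machine-api
  annotations:
    machine.openshift.io/cluster-api-autoscaler-node-group-min-size: "1"
    machine.openshift.io/cluster-api-autoscaler-node-group-max-size: "6"
---
# CAPI mirror after conversion: same values, cluster.x-k8s.io prefix only.
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineSet
metadata:
  name: worker-somezone-1
  namespace: openshift-cluster-api
  annotations:
    cluster.x-k8s.io/cluster-api-autoscaler-node-group-min-size: "1"
    cluster.x-k8s.io/cluster-api-autoscaler-node-group-max-size: "6"
```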

JoelSpeed · 2025-06-16T11:28:26Z

enhancements/cluster-api/cluster-autoscaler-integration-with-openshift-cluster-api.md

+The Cluster Autoscaler Operator will be changed to include logic that can detect
+the API group for any MachineSet that is referenced in the `scaleTargetRef` field
+of MachineAutoscaler resources. The change will instruct the Operator to search
+for records in the `openshift-cluster-api` namespace for resources with the
+`cluster.x-k8s.io` group, and to search in the `openshift-machine-api` namespace
+for resource with the `machine.openshift.io` group.


Every resource in the openshift-machine-api namespace will have a mirror in the openshift-cluster-api namespace, as such, I'd expect searching the openshift-cluster-api namespace to be enough. There's no need to look at the MAPI objects is there?

this is basically saying that a user selects either the MAPI or CAPI resource in their MachineAutoscaler and the CAO will search the appropriate namespace.

so, if the user specifies a MAPI MachineSet, then the CAO would look in that namespace.

JoelSpeed · 2025-06-16T11:31:40Z

enhancements/cluster-api/cluster-autoscaler-integration-with-openshift-cluster-api.md

+Previously, only a Machine API MachineSet (i.e. a `MachineSet` kind in the
+`machine.openshift.io` API group) would be valid target of the `.spec.scaleTargetRef`
+field. After this enhancement is implemented, users may specify either a Machine
+API MachineSet or a Cluster API MachineSet in the `.spec.scaleTargetRef` field.


Given every MAPI machineset will have a CAPI mirror, when CAPI mirroring is enabled, do we actually need to care about the MAPI side?

i think we care to the extent that we want to allow users to continue using MAPI resources as the scaleTargetRef. basically, not breaking existing MachineAutoscalers.

JoelSpeed · 2025-06-16T11:32:36Z

enhancements/cluster-api/cluster-autoscaler-integration-with-openshift-cluster-api.md

+Note that the MachineAutoscaler named "worker-somezone-1" is targeting a Machine API
+MachineSet while "worker-somezone-2" is targeting a Cluster API MachineSet. The
+Cluster Autoscaler Operator will know by the `apiVersion` field whether to look
+for the resource in the `openshift-machine-api` or `openshift-cluster-api` namespace
+respectively.


Or it could always look at the ClusterAPI side?

it could. i think if we want to go down that route i'll need to rewrite portions of this enhancement to conform with the notion that we only look for the CAPI resources.
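For readers without the full enhancement text at hand, a pair of MachineAutoscalers like the ones the quoted note describes might look as follows. This is a sketch rather than the example from the enhancement itself: the replica bounds are invented, and the MachineSet names are assumed to match the MachineAutoscaler names.

```yaml
# Targets a Machine API MachineSet.
apiVersion: autoscaling.openshift.io/v1beta1
kind: MachineAutoscaler
metadata:
  name: worker-somezone-1
  namespace: openshift-machine-api
spec:
  minReplicas: 1                     # illustrative bound
  maxReplicas: 6                     # illustrative bound
  scaleTargetRef:
    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    name: worker-somezone-1          # assumed MachineSet name
---
# Targets a Cluster API MachineSet.
apiVersion: autoscaling.openshift.io/v1beta1
kind: MachineAutoscaler
metadata:
  name: worker-somezone-2
  namespace: openshift-machine-api
spec:
  minReplicas: 1
  maxReplicas: 6
  scaleTargetRef:
    apiVersion: cluster.x-k8s.io/v1beta1
    kind: MachineSet
    name: worker-somezone-2          # assumed MachineSet name
```

Only the group in `scaleTargetRef.apiVersion` differs between the two, which is the field the Operator would use to pick a namespace under the current proposal.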

JoelSpeed · 2025-06-16T11:33:50Z

enhancements/cluster-api/cluster-autoscaler-integration-with-openshift-cluster-api.md

+resource. The sync controller will use the managed fields (i.e. `.metadata.managedFields`)
+of the specified MachineSet to determine if the Cluster Autoscaler Operator made
+changes to the annotations, and then replicate those appropriately. In this manner,


If we always write to the authoritative API, I don't think managedfields is required here?

yes, i'll revisit this part.

JoelSpeed · 2025-06-16T11:36:23Z

enhancements/cluster-api/cluster-autoscaler-integration-with-openshift-cluster-api.md

+The Cluster Autoscaler Operator will always update the authoritative MachineSet resource.
+If a user specifies a non-authoritative MachineSet as the `scaleTargetRef` of a
+MachineAutoscaler, the Cluster Autoscaler Operator will use the information on the
+MachineSet to determine which resource is authoritative and then update that resource.
+Through the MachineSet sync controller, the non-authoritative resource will be updated
+with the new information.


Maybe I'm getting confused, but we have two components here.

One is CAO, which we could make smart enough to understand the authoritative API?

And the other is KAS itself, which will always write to CAPI resources, right?

JoelSpeed · 2025-06-16T11:40:07Z

enhancements/cluster-api/cluster-autoscaler-integration-with-openshift-cluster-api.md

+matchConstraints:
+  resourceRules:
+  - apiGroups:   ["machine.openshift.io"]
+    apiVersions: ["v1beta1"]
+    operations:  ["UPDATE"]
+    resources:   ["MachineSet"]
+matchConditions:
+  # Only check requests coming from the cluster autoscaler service account.
+  - name: "check-only-cluster-autoscaler-service-account-requests"
+    expression: '(request.userInfo.username in [
+            "system:serviceaccount:openshift-machine-api:cluster-autoscaler",
+            ])'
+validations:
+  - expression: 'object.spec.replicas != oldObject.spec.replicas'
+    messageExpression: "Requested replica change is the same as current value"
+```


I think in particular we need to mention that this will be an exception to an existing VAP which prevents writes to non-authoritative resources. Have you spoken to @theobarberbany about this at all? He may be able to help test/write something

i will add some language about this being an exception to the existing VAP.

i did talk with Theo, these examples are directly inspired by his work.
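For anyone testing the quoted fragment, a complete policy object built around it might look like the sketch below. It is not the policy from the enhancement: the policy name is hypothetical, and it does not show how the rule would be carved out as an exception to the existing VAP. Two details differ from the quoted diff on purpose: `resourceRules` match on lowercase plural resource names, so `machinesets` is used, and `messageExpression` expects a CEL expression, so the fixed text goes in `message`.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: example-autoscaler-machineset-replicas   # hypothetical name
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups:   ["machine.openshift.io"]
      apiVersions: ["v1beta1"]
      operations:  ["UPDATE"]
      resources:   ["machinesets"]
  matchConditions:
  # Only check requests coming from the cluster autoscaler service account.
  - name: "check-only-cluster-autoscaler-service-account-requests"
    expression: 'request.userInfo.username == "system:serviceaccount:openshift-machine-api:cluster-autoscaler"'
  validations:
  # Passes only when the request actually changes the replica count.
  - expression: 'object.spec.replicas != oldObject.spec.replicas'
    message: "Requested replica change is the same as current value"
```

A policy like this has no effect until a `ValidatingAdmissionPolicyBinding` references it through `spec.policyName`.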

JoelSpeed · 2025-06-16T11:42:25Z

enhancements/cluster-api/cluster-autoscaler-integration-with-openshift-cluster-api.md

+```
+matchConstraints:
+  resourceRules:
+  - apiGroups:   ["cluster.x-k8s.io"]
+    apiVersions: ["v1beta1"]
+    operations:  ["UPDATE"]
+    resources:   ["MachineSet"]
+matchConditions:
+  # Only check requests coming from the cluster autoscaler service account.
+  - name: "check-only-cluster-autoscaler-service-account-requests"
+    expression: '(request.userInfo.username in [
+            "system:serviceaccount:openshift-machine-api:cluster-autoscaler",
+            ])'
+validations:
+  - expression: 'object.spec.replicas != oldObject.spec.replicas'
+    messageExpression: "Requested replica change is the same as current value"
+```


Is this just a duplicate of the above block?

mostly. the difference is targeting CAPI instead of MAPI.

JoelSpeed · 2025-06-16T11:55:46Z

enhancements/cluster-api/cluster-autoscaler-integration-with-openshift-cluster-api.md

+To address this possible risk, the Cluster Autoscaler Operator will only write to the
+authoritative MachineSet resource. A user may create a MachineAutoscaler that references
+either the authoritative or non-authoritative resource in its `targetScaleRef` field,
+but the Cluster Autoscaler Operator will only update the authoritative resource.


This doesn't align with the error condition above? Are we going to take first created as the correct object? Can we include a VAP that prevents multiple objects using the same Name in the ref?

which part doesn't align? (i thought i had caught the change)

Are we going to take first created as the correct object?

yes, this is essentially how it will work.

Can we include a VAP that prevents multiple objects using the same Name in the ref?

that sounds like a good upgrade, i'll add something about it.

JoelSpeed · 2025-06-16T11:57:02Z

enhancements/cluster-api/cluster-autoscaler-integration-with-openshift-cluster-api.md

+Another approach to reducing confusion would be to allow only a single type of
+MachineSet resource (Machine API or Cluster API) to be specified as a target for
+autoscaling. This approach could work if the Cluster API resources are chosen as
+the target, but would represent a hard shift in the current MachineAutoscaler
+behavior and would require a conversion migration for all upgrades where cluster
+autoscaler is in use.


Would it? If the CAO and KAS always wrote to CAPI, that would be predictable at least?

i think the difference is between what the CAO is writing to versus what it is reading from.

afaict, we don't want to force a conversion for all MachineAutoscaler to use scaleTargetRef that points at the CAPI MachineSet. instead, we update the CAO to be smart enough to find the CAPI MachineSet when the user has specified a MAPI MachineSet in the scaleTargetRef.

so, i think we are converging on the idea that the CAO should always write to the CAPI resource (regardless of authority), but we need to be able to accept either the MAPI or CAPI resource from the user on the MachineAutoscaler resource.

the paragraph here is about taking the approach where we only allow the user to specify a CAPI MachineSet for the scaleTargetRef, which would require a conversion on upgrade.
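A sketch of the mirrored pair the CAO would have to resolve, assuming the mirror keeps the same name in the other namespace and that authority is recorded in a `spec.authoritativeAPI` field on the MAPI resource. Both points are assumptions taken from the wider migration design, not from this thread, and the name and replica count are illustrative.

```yaml
# What the user references in scaleTargetRef (existing MachineAutoscalers).
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: worker-somezone-1            # hypothetical name
  namespace: openshift-machine-api
spec:
  authoritativeAPI: ClusterAPI       # assumed field and value
  replicas: 2
---
# What the CAO would look up: same name, openshift-cluster-api namespace.
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineSet
metadata:
  name: worker-somezone-1
  namespace: openshift-cluster-api
spec:
  replicas: 2
```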

elmiko · 2025-06-23T15:10:38Z

Given we are going to have a staggered approach where some platforms GA before others, do you think this will add significant complexity? We will need the new behaviour depending on whether a feature gate is enabled or not?

these are great questions, i need to spend some time thinking about this a little more.

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jan 15, 2025

openshift-ci bot requested review from Miciah and PratikMahajan January 15, 2025 15:59

elmiko force-pushed the add-cas-cao-capi-integration branch 2 times, most recently from 1c80b56 to 4554fb1 Compare January 16, 2025 16:39

elmiko force-pushed the add-cas-cao-capi-integration branch from 4554fb1 to 73bfea9 Compare January 16, 2025 18:51

JoelSpeed reviewed Jan 28, 2025

openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 28, 2025

openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 4, 2025

openshift-ci bot closed this Apr 12, 2025

openshift-ci bot reopened this May 22, 2025

openshift-ci bot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label May 22, 2025

elmiko force-pushed the add-cas-cao-capi-integration branch from 73bfea9 to 32cf6bb Compare June 11, 2025 20:29

add cluster api autoscaler integration enhancement

667b2f7

elmiko force-pushed the add-cas-cao-capi-integration branch from 32cf6bb to 667b2f7 Compare June 11, 2025 21:12

JoelSpeed reviewed Jun 16, 2025
