Conversation

justinsb
Contributor

@justinsb justinsb commented Aug 23, 2025

We upgrade the CAPI libraries to 1.11.0; however, we continue to use v1beta1 semantics and versions (except in our tests).

This should be an internal version bump, not user facing.


Release note: NONE

@k8s-ci-robot k8s-ci-robot added the do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. label Aug 23, 2025

netlify bot commented Aug 23, 2025

Deploy Preview for kubernetes-sigs-cluster-api-gcp ready!

🔨 Latest commit 45b9880
🔍 Latest deploy log https://app.netlify.com/projects/kubernetes-sigs-cluster-api-gcp/deploys/68c170717ae7d80008d004ec
😎 Deploy Preview https://deploy-preview-1509--kubernetes-sigs-cluster-api-gcp.netlify.app

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 23, 2025
@justinsb
Contributor Author

We discussed this in CAPI office hours this week. It's a big process (https://cluster-api.sigs.k8s.io/developer/providers/migrations/v1.10-to-v1.11), but we should be able to start gradually by just bumping the dependency and sticking with v1beta1. A "real" migration would start using v1beta2 upstream and then bump our own types to v1beta2 (the big change is status.conditions, I believe).
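For reference, here is a rough sketch (my own illustration, not code from this PR) of what v1beta2-style conditions look like from a controller's perspective, assuming the new contract standardizes on metav1.Condition; it only uses apimachinery helpers, not the actual CAPI/CAPG types:

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// fakeStatus stands in for a provider status struct. In the v1beta2 world (as
// I understand the contract) conditions become plain metav1.Condition values,
// rather than the v1beta1 condition type with its Severity field.
type fakeStatus struct {
	Conditions []metav1.Condition
}

func main() {
	status := &fakeStatus{}

	// Setting a condition "v1beta2 style": a machine-readable Reason is
	// required and there is no Severity; SetStatusCondition fills in
	// LastTransitionTime when the status changes.
	meta.SetStatusCondition(&status.Conditions, metav1.Condition{
		Type:               "Ready",
		Status:             metav1.ConditionTrue,
		Reason:             "InstanceProvisioned",
		Message:            "GCE instance is running",
		ObservedGeneration: 1,
	})

	fmt.Printf("%+v\n", status.Conditions)
}
```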

Have we documented our branching strategy anywhere? I would imagine that the release-1.10 branch maps to CAPI 1.10, and we cherry-pick fixes for 1.10 to that branch. If we don't have a doc, I'm happy to send a PR.

@justinsb justinsb force-pushed the bump_capi_version_to_0_11 branch from 050fd9a to ed6b043 Compare August 23, 2025 15:23
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 23, 2025
@k8s-ci-robot k8s-ci-robot added the do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. label Aug 25, 2025
@justinsb justinsb force-pushed the bump_capi_version_to_0_11 branch from 4bb63a7 to 459eab7 Compare August 25, 2025 13:59
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. label Aug 25, 2025
@salasberryfin
Contributor

Thanks @justinsb.

Looks like this bump also requires a number of changes to remove deprecated functions. Let me know if I can help by collaborating on this PR.

@cpanato
Member

cpanato commented Aug 27, 2025

There are a few issues:

./gke.go:117:19: cannot use framework.DiscoveryAndWaitForCluster(ctx, framework.DiscoveryAndWaitForClusterInput{…}, input.WaitForClusterIntervals...) (value of type *"sigs.k8s.io/cluster-api/api/core/v1beta2".Cluster) as *"sigs.k8s.io/cluster-api/api/core/v1beta1".Cluster value in assignment
./gke.go:127:24: cannot use framework.DiscoveryAndWaitForMachinePools(ctx, framework.DiscoveryAndWaitForMachinePoolsInput{…}, input.WaitForMachinePools...) (value of type []*"sigs.k8s.io/cluster-api/api/core/v1beta2".MachinePool) as []*"sigs.k8s.io/cluster-api/api/core/v1beta1".MachinePool value in assignment
./gke.go:130:12: cannot use result.Cluster (variable of type *"sigs.k8s.io/cluster-api/api/core/v1beta1".Cluster) as *"sigs.k8s.io/cluster-api/api/core/v1beta2".Cluster value in struct literal
./conformance_test.go:83:26: cannot use result.Cluster (variable of type *"sigs.k8s.io/cluster-api/api/core/v1beta2".Cluster) as *"sigs.k8s.io/cluster-api/api/core/v1beta1".Cluster value in struct literal
./e2e_gke_test.go:128:32: cannot use result.Cluster (variable of type *"sigs.k8s.io/cluster-api/api/core/v1beta1".Cluster) as *"sigs.k8s.io/cluster-api/api/core/v1beta2".Cluster value in struct literal
./e2e_gke_test.go:130:32: cannot use result.MachinePools (variable of type []*"sigs.k8s.io/cluster-api/api/core/v1beta1".MachinePool) as []*"sigs.k8s.io/cluster-api/api/core/v1beta2".MachinePool value in struct literal
./e2e_gke_test.go:137:32: cannot use result.Cluster (variable of type *"sigs.k8s.io/cluster-api/api/core/v1beta1".Cluster) as *"sigs.k8s.io/cluster-api/api/core/v1beta2".Cluster value in struct literal
./e2e_gke_test.go:139:32: cannot use result.MachinePools (variable of type []*"sigs.k8s.io/cluster-api/api/core/v1beta1".MachinePool) as []*"sigs.k8s.io/cluster-api/api/core/v1beta2".MachinePool value in struct literal
./e2e_test.go:70:26: cannot use result.Cluster (variable of type *"sigs.k8s.io/cluster-api/api/core/v1beta2".Cluster) as *"sigs.k8s.io/cluster-api/api/core/v1beta1".Cluster value in struct literal
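These errors come from the CAPI v1.11 test framework now returning core v1beta2 types while the local test code is still typed against core v1beta1 (or vice versa). A minimal sketch of one way to resolve them — the struct and field names below are hypothetical stand-ins for the result type the errors refer to, not the actual CAPG test code:

```go
package e2e

import (
	// Switching this import from sigs.k8s.io/cluster-api/api/core/v1beta1 to
	// the v1beta2 package makes assignments from the CAPI test framework
	// compile again, since the framework now hands back v1beta2 objects.
	clusterv1 "sigs.k8s.io/cluster-api/api/core/v1beta2"
)

// applyClusterTemplateResult mirrors the shape implied by the errors above:
// a Cluster plus its MachinePools, both now typed against core v1beta2 so
// they match what framework.DiscoveryAndWaitForCluster and
// framework.DiscoveryAndWaitForMachinePools return.
type applyClusterTemplateResult struct {
	Cluster      *clusterv1.Cluster
	MachinePools []*clusterv1.MachinePool
}
```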

@damdo
Member

damdo commented Aug 29, 2025

Following up from my thoughts here: #1514 (comment)

There are other places where we need to bump Go.
I suggest we keep this PR just for bumping t.Context() (we can park it for a little), and have a separate PR, like usual, where we do the Go v1.24 / CAPI v1.11 / K8s 1.33 bumps all together (e.g. for the previous bumps we did that in #1459). I suggest we do that in @justinsb's #1509 to keep everything initially confined there.

We could rename this PR to "Bump CAPI to v1.11, k8s 1.33, go 1.24" and work our way into adding all those bump changes here.

Here are some places that need bumping:

@justinsb justinsb force-pushed the bump_capi_version_to_0_11 branch from 65acc9c to ac0797a Compare August 29, 2025 09:08
@justinsb
Contributor Author

Thanks @damdo - I think I got them all! Except:

cloudbuild-nightly.yaml and cloudbuild.yaml

I don't think those are per Go version. They could be bumped, but hopefully we don't have to, because it looks like they have a different update policy, which would make auto-bumping hard.

metadata.yaml (here) add a new release to the releaseSeries (normally matching the CAPI release)

I hope we don't have to do this until we actually release, because our contract is still v1beta1 but we are expected to update to v1beta2 as part of moving to CAPI 1.11. There's a note there saying not to do this until we actually release, and I think we want to do more before releasing. This PR is just getting us onto the CAPI 1.11.x libraries.

@justinsb
Contributor Author

Thanks @justinsb.

Looks like this bump also requires a number of changes to remove deprecated functions. Let me know if I can help collaborating on this PR.

Thank you @salasberryfin - any collaboration would be wonderful. I propose keeping this PR fairly minimal - just bumping to the CAPI 1.11.0 libraries but not yet adopting the v1beta2 APIs.

I think the deprecated functions are ones that I introduced and marked deprecated! They are needed (temporarily) while we continue to use the v1beta1 APIs internally. Moving everything to v1beta2 would (I think) be a more invasive code change (I'll happily review it, though, if you want to take that on - I think you need to bump CAPI to 1.11 first, which is what I'm trying to do in this PR). The CAPI v1beta1 packages are also marked deprecated, and yes - we need to move to v1beta2.
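To illustrate the kind of temporary shim being described (the helper name here is hypothetical, not one of the actual CAPG functions), the pattern is a small wrapper kept around with a Go Deprecated marker while the provider still consumes the core v1beta1 types internally:

```go
package util

import (
	clusterv1beta1 "sigs.k8s.io/cluster-api/api/core/v1beta1"
)

// IsClusterPaused reports whether spec.paused is set on the cluster.
//
// Deprecated: hypothetical example of a helper kept only while CAPG still
// uses the core v1beta1 types internally; callers should switch to a
// v1beta2-based equivalent once the provider adopts the new API.
func IsClusterPaused(cluster *clusterv1beta1.Cluster) bool {
	return cluster.Spec.Paused
}
```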

@justinsb justinsb force-pushed the bump_capi_version_to_0_11 branch 2 times, most recently from a39a325 to b1d9c05 Compare August 29, 2025 09:31
go.mod (Outdated)

  module sigs.k8s.io/cluster-api-provider-gcp

- go 1.23.7
+ go 1.24.6
Member

@damdo damdo Aug 29, 2025

I'd prefer we bumped the minimum required minor version, but not the patch version (i.e. just doing go 1.24.0 instead of go 1.24.z).
Bumping the patch version has caused a LOT of pain downstream, and not only in our product.

Core CAPI also takes the same approach for CAPI v1.11, as you can see here.

Contributor Author

@justinsb justinsb Aug 29, 2025

It would be great to figure this out - that's why I sent #1515, because I'm confused! Do you have any links to the problems caused? I remember a problem a few years ago in k/k where we found a regression in Go on 32-bit arm/x86 (IIRC).

That said, I do think we should be using toolchain directives:

go 1.24

toolchain go1.24.6

On other projects, users insist that we run the latest Go release because otherwise their vulnerability scanners flag it. I can see a case for not running 1.25.0 while 1.24.x is supported, but if we're on 1.24.x I think we want to be using 1.24.6 to avoid getting flagged by every security team.

Member

@damdo damdo Aug 29, 2025

I am happy with bumping the toolchain directive to the latest patch release of the minor version specified in the go directive (with the go directive itself staying on a .0 patch release the whole time).

Do you have any links to the problems caused?

The problem with bumping the patch version in the go directive is that it forces all importers of this module to use that same or a higher patch version to build it.

That is fine for most projects if they can keep up with the Go patch release cadence, but it is hard in environments/projects where the Go compiler is only bumped once per minor Go release, or at least where the bump cadence is not that high for various reasons.
Downstream we explicitly disable the toolchain directive to achieve reproducible/consistent builds, so we can't rely on the Go compiler's "dynamic version upgrade" feature (or whatever it is called 😄).
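Concretely, the compromise being discussed would look roughly like this in go.mod (a sketch using the versions mentioned in this thread, not the final contents of the PR):

```
module sigs.k8s.io/cluster-api-provider-gcp

// Minimum version importers must satisfy: stays on the .0 patch release so
// downstream consumers are not forced onto a specific Go patch version.
go 1.24.0

// Toolchain used to build this module locally and in CI: tracks the latest
// 1.24.x patch so vulnerability scanners stay happy.
toolchain go1.24.6
```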

Contributor Author

Agreed, I like that approach!

We rely on some helper functions that have moved to v1beta2.

This should work because we should be able to use core CAPI v1beta2,
before fully upgrading CAPG to v1beta2 semantics.
@justinsb justinsb force-pushed the bump_capi_version_to_0_11 branch from c30275d to 86fd846 Compare September 10, 2025 12:34
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 10, 2025
@justinsb justinsb force-pushed the bump_capi_version_to_0_11 branch from 86fd846 to 45b9880 Compare September 10, 2025 12:34
@damdo
Member

damdo commented Sep 10, 2025

/test pull-cluster-api-provider-gcp-test pull-cluster-api-provider-gcp-apidiff

@k8s-ci-robot
Contributor

@justinsb: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: pull-cluster-api-provider-gcp-apidiff
Commit: 45b9880
Details: link
Required: false
Rerun command: /test pull-cluster-api-provider-gcp-apidiff

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@justinsb
Contributor Author

So I rebased and we are passing e2e with kube 1.33 🎉

The apidiff is "real" - these are API changes, though I think they are mostly just the package path changing. I believe that all the externally facing changes are no-op package-path changes.
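For example — assuming the core API package move in CAPI v1.11 (worth double-checking against the migration guide) — a change like this shows up in apidiff even though the types themselves are unchanged:

```go
package controllers

import (
	// Before (CAPI v1.10): clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
	// After  (CAPI v1.11): the same types now live under api/core/, so
	// consumers only need to update the import path.
	clusterv1 "sigs.k8s.io/cluster-api/api/core/v1beta1"
)

// Reference the package so this illustrative snippet compiles on its own.
var _ clusterv1.Cluster
```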

@salasberryfin
Contributor

Thanks @justinsb! With #1256 and #1257 merged and this PR passing CI checks, I also think it is safe to merge.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 11, 2025
@damdo
Member

damdo commented Sep 11, 2025

@cpanato do we need to run any other e2e or conformance on this before merging?

@damdo
Member

damdo commented Sep 11, 2025

/test ?

@k8s-ci-robot
Contributor

@damdo: The following commands are available to trigger required jobs:

/test pull-cluster-api-provider-gcp-build
/test pull-cluster-api-provider-gcp-e2e-test
/test pull-cluster-api-provider-gcp-make
/test pull-cluster-api-provider-gcp-test
/test pull-cluster-api-provider-gcp-verify

The following commands are available to trigger optional jobs:

/test pull-cluster-api-provider-gcp-apidiff
/test pull-cluster-api-provider-gcp-capi-e2e
/test pull-cluster-api-provider-gcp-conformance
/test pull-cluster-api-provider-gcp-conformance-ci-artifacts
/test pull-cluster-api-provider-gcp-coverage
/test pull-cluster-api-provider-gcp-e2e-workload-upgrade

Use /test all to run the following jobs that were automatically triggered:

pull-cluster-api-provider-gcp-apidiff
pull-cluster-api-provider-gcp-build
pull-cluster-api-provider-gcp-e2e-test
pull-cluster-api-provider-gcp-make
pull-cluster-api-provider-gcp-test
pull-cluster-api-provider-gcp-verify

In response to this:

/test ?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@damdo
Member

damdo commented Sep 11, 2025

It looks like last time we ran these: #1459 (comment)

So let's run them again.

/test pull-cluster-api-provider-gcp-conformance
/test pull-cluster-api-provider-gcp-capi-e2e

@cpanato
Member

cpanato commented Sep 11, 2025

we should run the pull-cluster-api-provider-gcp-conformance-ci-artifacts but we need to fix that test :/

so just extra conformance is ok for now

@damdo
Member

damdo commented Sep 11, 2025

we should run the pull-cluster-api-provider-gcp-conformance-ci-artifacts but we need to fix that test :/

so just extra conformance is ok for now

Would you be able to create an issue for that? So we don't forget :)

@cpanato
Member

cpanato commented Sep 11, 2025

that will be awesome

@damdo
Member

damdo commented Sep 11, 2025

Created an issue for that here: #1529

@damdo
Member

damdo commented Sep 11, 2025

pull-cluster-api-provider-gcp-conformance and pull-cluster-api-provider-gcp-capi-e2e passed.

Let's get this merged.

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Sep 11, 2025
@k8s-ci-robot k8s-ci-robot merged commit 77f3703 into kubernetes-sigs:main Sep 11, 2025
17 of 18 checks passed