justinsb
Member

@justinsb justinsb commented Sep 30, 2025

Less hacky support for GCP, encode more of the logic into controllers.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress, cncf-cla: yes, and size/XXL labels Sep 30, 2025
@k8s-ci-robot k8s-ci-robot requested a review from zetaab September 30, 2025 22:42
@justinsb justinsb force-pushed the clusterapi_controllers branch 2 times, most recently from c2c7c26 to 5e7c5bb Compare September 30, 2025 22:55
@hakman hakman requested review from hakman and removed request for olemarkus and zetaab September 30, 2025 23:27
@justinsb justinsb force-pushed the clusterapi_controllers branch 4 times, most recently from 80b82ec to 855ef49 Compare October 6, 2025 16:43
@k8s-ci-robot k8s-ci-robot added the area/provider/gcp label Oct 6, 2025
@justinsb justinsb force-pushed the clusterapi_controllers branch 2 times, most recently from 27f468b to 3e80afa Compare October 7, 2025 16:22
@justinsb
Member Author

justinsb commented Oct 7, 2025

/test pull-kops-scenario-clusterapi-gcp

Let's try this new test :-)

@hakman
Member

hakman commented Oct 7, 2025

/test pull-kops-scenario-clusterapi-gcp

@hakman hakman force-pushed the clusterapi_controllers branch from b8bc822 to a4fbb3d Compare October 7, 2025 18:06
@hakman
Member

hakman commented Oct 7, 2025

/test pull-kops-scenario-clusterapi-gcp

@hakman hakman force-pushed the clusterapi_controllers branch from a4fbb3d to 5496816 Compare October 7, 2025 18:08
@hakman
Member

hakman commented Oct 7, 2025

/test pull-kops-scenario-clusterapi-gcp

@justinsb justinsb force-pushed the clusterapi_controllers branch 2 times, most recently from a3ef085 to fb68367 Compare October 7, 2025 22:26
@justinsb
Member Author

justinsb commented Oct 7, 2025

/test pull-kops-scenario-clusterapi-gcp

@justinsb justinsb force-pushed the clusterapi_controllers branch from fb68367 to b7f6dc3 Compare October 7, 2025 22:58
@justinsb
Member Author

justinsb commented Oct 7, 2025

/test pull-kops-scenario-clusterapi-gcp

@justinsb
Member Author

Rebased on top of #17658 and removed the "ignore version mismatch" hack (as that is what #17658 is supposed to solve).

@justinsb
Member Author

"Could not retrieve location for AWS bucket k8s-kops-prow"

/retest

(We might have deleted the bucket / moved the bucket / I may have broken this, but I figure it's worth a retest first)

@hakman
Member

hakman commented Oct 16, 2025

> "Could not retrieve location for AWS bucket k8s-kops-prow"
>
> /retest
>
> (We might have deleted the bucket / moved the bucket / I may have broken this, but I figure it's worth a retest first)

We switched the account. 🤣
CC @ameukam

@ameukam
Member

ameukam commented Oct 16, 2025

Yeah, we need to update the presubmits with a new bucket.

@ameukam
Member

ameukam commented Oct 17, 2025

/retest

@hakman
Member

hakman commented Oct 17, 2025

/test pull-kops-scenario-clusterapi-gcp

@justinsb
Member Author

/test pull-kops-scenario-clusterapi-gcp

justinsb and others added 7 commits October 18, 2025 12:58
Normally they are the same, but for CI builds they are different,
and kopsbase.Version is the consistent one to use for CI builds.

We also need to be careful not to conflate the version with
the docker image tag; image tags cannot contain '+' characters,
but our CI versions do.  We replace '+' with '_' for image tags.

Co-authored-by: Ciprian Hacman <[email protected]>
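
The tag rule described in this commit message could be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the helper name `imageTagFromVersion` and the sample version string are hypothetical, and only the '+' to '_' substitution comes from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// imageTagFromVersion is a hypothetical helper (not taken from the PR).
// It turns a version string into something usable as a container image tag
// by replacing every '+' with '_', since image tags cannot contain '+'.
func imageTagFromVersion(version string) string {
	return strings.ReplaceAll(version, "+", "_")
}

func main() {
	// Made-up CI-style version containing a '+' build suffix.
	version := "1.2.3-alpha.1+abcdef"

	fmt.Println("version:  ", version)
	fmt.Println("image tag:", imageTagFromVersion(version))
}
```

Keeping the version and the tag as two separate values (and converting only at the point where the image reference is built) avoids the conflation the commit message warns about.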
Less hacky support for GCP, encode more of the logic into controllers.

Co-authored-by: Ciprian Hacman <[email protected]>
Not the cleanest presentation, but this is the thing that is causing the most trouble right now.
@justinsb justinsb force-pushed the clusterapi_controllers branch from 6c3107f to f81a869 Compare October 18, 2025 13:03
@justinsb
Member Author

/test pull-kops-scenario-clusterapi-gcp

I think the problem is that we are being inconsistent in specifying the CI env var; forcing CI=1 as an experiment.

@justinsb
Member Author

/test pull-kops-scenario-clusterapi-gcp

Looks like I'm missing a wait:

Error from server (InternalError): Internal error occurred: Internal error occurred: conversion webhook for cluster.x-k8s.io/v1beta1, Kind=MachineDeployment failed: Post "https://capi-webhook-service.capi-system.svc:443/convert?timeout=30s": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

@justinsb justinsb force-pushed the clusterapi_controllers branch from f81a869 to e9cdcd0 Compare October 18, 2025 17:12
@justinsb
Member Author

/test pull-kops-scenario-clusterapi-gcp

@justinsb
Member Author

justinsb commented Oct 18, 2025

/test pull-kops-scenario-clusterapi-gcp

I think we're getting there (🤞): we weren't passing the version to container builds, and we weren't passing KOPS_BASE_URL to kops-controller. As a result, for capi configurations where kops-controller generates the user-data, it was not using the version of nodeup that we built.

@justinsb
Member Author

/test pull-kops-scenario-clusterapi-gcp

Calico did not become ready: system-node-critical pod "calico-node-7s5z8" is not ready (calico-node)

I've seen this a few times; I think it's a calico bug. I was able to remedy it by restarting calico when it happened "locally", but I don't think it is related to this PR.

@justinsb
Member Author

Wow - I think that was it! Removing WIP :-)

@justinsb justinsb changed the title from "WIP: More cluster-api" to "More support for cluster-api" Oct 18, 2025
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress label Oct 18, 2025
Labels

area/addons, area/api, area/documentation, area/kops-controller, area/nodeup, area/provider/gcp, cncf-cla: yes, size/XXL

4 participants