✨ Write kubeadm control plane version file for workers to use to fetch the matching kubeadm binary#13433
✨ Write kubeadm control plane version file for workers to use to fetch the matching kubeadm binary#13433AcidLeroy wants to merge 10 commits intokubernetes-sigs:mainfrom
Conversation
|
@AcidLeroy: The label(s) DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @AcidLeroy. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Regular contributors should join the org to skip this step. Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
a240d31 to
52b943c
Compare
52b943c to
c5baa39
Compare
zarcen
left a comment
There was a problem hiding this comment.
Thanks for putting all this together @AcidLeroy. Have some suggestions
neolit123
left a comment
There was a problem hiding this comment.
i posted some comments on slack:
https://kubernetes.slack.com/archives/C8TSNPY4T/p1773251442281699
i think CAPI can just write the ClusterConfiguration on disk too.
@neolit123 I will provide an alternative solution by writing the |
|
@neolit123 Is this sort of what you are thinking: https://github.com/AcidLeroy/cluster-api/pull/3/changes |
yes, sgtm, but up to maintainers to decide. EDIT: in the slack thread we figured out that the CAPI v1beta2 ClusteConfiguration doesn't have the kubernetesVersion field, so my proposal is not useful. |
|
Rather than hard coding a file, we should look into providing a go template file (kubeadmconfig) and then in the kube bootsrtap config controller, we could render that file with the version directly into it. Look into to resolveFiles in kubeadm and templating the version into the "fetch kubeadm version" script. |
There was a problem hiding this comment.
This file might be in the gitignore file. Should double check.
There was a problem hiding this comment.
It definitely is on gitignore. If I create a file with a similar name it's ignored and I don't see why it should be different for this file.
Let's do the following
- Let's move the file into a sub-folder called
cluster-template-topology-kubeadm - Let's adjust the
generate-e2e-templates-mainMakefile target to copy the file up one level (Please make sure it is at the right place in the Makefile target based on alphabetic ordering of the target file name)
That way it fits into our regular folder structure and is easier to find
b9466e1 to
97526b7
Compare
co-author: Wei-Chen Chen <[email protected]>
97526b7 to
dbc7e68
Compare
| // getControlPlaneVersionForJoin returns the control plane (cluster) version from the cluster's ControlPlaneRef, | ||
| // e.g. KubeadmControlPlane.spec.version. Returns empty string if the cluster has no ControlPlaneRef or the version | ||
| // cannot be read (e.g. control plane not found or does not support version). Used for worker join so that | ||
| // a 1.34 node uses kubeadm 1.35 when the control plane is at 1.35, for example. | ||
| func (r *KubeadmConfigReconciler) getControlPlaneVersionForJoin(ctx context.Context, scope *Scope) string { | ||
| if !scope.Cluster.Spec.ControlPlaneRef.IsDefined() { | ||
| return "" | ||
| } | ||
| controlPlane, err := external.GetObjectFromContractVersionedRef(ctx, r.Client, scope.Cluster.Spec.ControlPlaneRef, scope.Cluster.Namespace) | ||
| if err != nil { | ||
| scope.V(4).Info("Could not get control plane for version, falling back to machine version", "error", err) | ||
| return "" | ||
| } | ||
| cpVersion, err := contract.ControlPlane().Version().Get(controlPlane) | ||
| if err != nil { | ||
| if !errors.Is(err, contract.ErrFieldNotFound) { | ||
| scope.V(4).Info("Could not get control plane version, falling back to machine version", "error", err) | ||
| } | ||
| return "" | ||
| } | ||
| if cpVersion == nil { | ||
| return "" | ||
| } | ||
| return *cpVersion | ||
| } |
There was a problem hiding this comment.
question: I notice this falls back to the machine version for any error (not found, permission denied, network issue, etc.). Is that intentional for all error types, or would it be worth distinguishing "control plane not found / field not present" (expected) from unexpected failures?
Just wondering whether masking unexpected errors here could make debugging harder down the road.
There was a problem hiding this comment.
Yeah, I think there are some error states here that we can requeue for and only fall back to machine version as an absolute last resort. I'll push some changes up shortly with an alternative to what I have here.
There was a problem hiding this comment.
@zjs, I introduced some changes to include another condition so that we can surface any issues with getting the CP version. In this case, we only fall back to the machine version if we absolutely have to, and we'll be able to see what the issue is via the status conditions. LMK what you think! Thanks!
There was a problem hiding this comment.
Curious to see how others feel, but personally, I like it!
There was a problem hiding this comment.
Adding an entire condition just to surface this error seems to much to me.
We should be very careful with introducing new conditions
Based on some of my other comments we might not need this anymore anyway
|
Let's use hold instead of draft so we can run CI /ok-to-test |
| type Encoding string | ||
|
|
||
| // FileContentFormat specifies how file content is interpreted after resolving content/contentFrom and before writing bootstrap data. | ||
| // +kubebuilder:validation:Enum="";go-template |
There was a problem hiding this comment.
I think I would use "raw" and "template" as enum values.
I would not make "" a valid value. This is already covered because the field is optional.
I would use template instead of go-template for consistency with other parts of our API: JSONPatchValue.Template & Naming.Template.
(I think after my suggestions are implemented the linter should also be happy)
| ), | ||
| ) | ||
| } | ||
| if file.ContentFormat != "" && file.ContentFormat != FileContentFormatGoTemplate { |
There was a problem hiding this comment.
This "manual" validation is not necessary. This is already covered via the enum kubebuilder markers.
See CRD schema:
enum:
- ""
- go-template| // (semver.String(), no "v" prefix). For a worker joining a cluster, that version is the control plane | ||
| // Kubernetes version when the controller can read it; otherwise the Machine's version. | ||
| // +optional | ||
| ContentFormat FileContentFormat `json:"contentFormat,omitempty"` |
There was a problem hiding this comment.
Please align the template part of the godoc to what we did here:
cluster-api/api/core/v1beta2/machinedeployment_types.go
Lines 391 to 410 in 5cfc702
There was a problem hiding this comment.
nit: let's move this field after content / contentFrom so all the options about file content are grouped toghether
| // KubernetesVersion is the effective Kubernetes version for bootstrap data (semver.String(), no "v" prefix). | ||
| // For a worker Machine joining a cluster, this is the control plane Kubernetes version when the controller | ||
| // can read it; otherwise the Machine's version. | ||
| KubernetesVersion string |
There was a problem hiding this comment.
I think we should:
- provide it as controlPlane.version similar to builtin variables, e.g. via
map[string]interface{}{
"controlPlane": map[string]interface{}{
"version": version,
},
})Accordingly we should not fallback to the Machine version
- Provide the version with v prefix otherwise it's inconsistent with the version field in the Cluster object and also with builtin variables
There was a problem hiding this comment.
+1 to have a well defined and stable semantic for controlPlane.version (no fallback logic); if this semantic will not be consistent with the one in buildin variables, we should consider adding a suffix, but we can iterate on this.
notably, having a well defined and stable semantic also allows to ensure a consistent the same behaviour for file templates both for init/join control plane and join workers (less magic knobs/hidden behaviours, easier to understand for users and to maintain for us)
There was a problem hiding this comment.
refactored to have controlPlane.version struct
| */ | ||
|
|
||
| // Package bootstrapfiles contains helpers for KubeadmConfig spec.files processing. | ||
| package bootstrapfiles |
There was a problem hiding this comment.
Looks like this is only used in the KubeadmConfig controller. If there is no need for a separate package I wouldn't create one just for this
There was a problem hiding this comment.
moved to bootstrap/kubeadm/internal/controllers/template.go to avoid unnecessary new package
| - sourcePath: "../data/infrastructure-docker/main/cluster-template-topology-autoscaler.yaml" | ||
| - sourcePath: "../data/infrastructure-docker/main/cluster-template-topology.yaml" | ||
| - sourcePath: "../data/infrastructure-docker/main/cluster-template-topology-taints.yaml" | ||
| - sourcePath: "../data/infrastructure-docker/main/cluster-template-topology-kubeadm-version.yaml" |
There was a problem hiding this comment.
This list is (or at least should be) alphabetically ordered, please adjust accordingly
(same for the ClusterClass file below)
| arch=$$(uname -m) | ||
| case "$$arch" in | ||
| x86_64) arch="amd64" ;; | ||
| aarch64|arm64) arch="arm64" ;; |
| echo "Fetching kubeadm $${version} ($${arch}) from $${url}" | ||
| echo "fetch-kubeadm.sh: curl -fLsS -o <tmpfile> $$url" | ||
| tmp=$$(mktemp -p /tmp kubeadm.XXXXXX) | ||
| if curl -fLsS -o "$$tmp" "$$url"; then |
There was a problem hiding this comment.
Should we use something like this? --retry 5 --retry-all-errors (would like to avoid flakes in CI)
| preKubeadmCommands: | ||
| - | | ||
| command -v curl >/dev/null 2>&1 || (apt-get update && apt-get install -y --no-install-recommends curl ca-certificates && rm -rf /var/lib/apt/lists/*) | ||
| - 'sh /run/cluster-api/fetch-kubeadm.sh 2>&1 | tee /var/log/fetch-kubeadm.log || true' |
There was a problem hiding this comment.
Looks like the bootstrap will continue even if the script fails (but not sure)
Would be good to fail bootstrap if the script fails if that's doable with reasonable effort
| "k8s.io/utils/ptr" | ||
| ) | ||
|
|
||
| var _ = Describe("When a worker joins during a control plane upgrade [ClusterClass]", Label("ClusterClass"), func() { |
There was a problem hiding this comment.
Maybe rephrase to "When a worker joins with kubeadm with an older version than the control plane [ClusterClass]"
| // +kubebuilder:validation:Enum=base64;gzip;gzip+base64 | ||
| type Encoding string | ||
|
|
||
| // FileContentFormat specifies how file content is interpreted after resolving content/contentFrom and before writing bootstrap data. |
There was a problem hiding this comment.
nit: This enum is currently defined between Encoding and the encoding constants. Let's please move this to either above or below Encoding
|
@AcidLeroy: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
| // FileContentFormatRaw means content is used verbatim. | ||
| FileContentFormatRaw FileContentFormat = "raw" | ||
| // FileContentFormatTemplate means content is rendered as a Go text/template. | ||
| // Available template variables are documented by the kubeadm bootstrap provider. |
There was a problem hiding this comment.
Nit
| // Available template variables are documented by the kubeadm bootstrap provider. |
Available template variables are documented down below in this file where FileContentFormat is used
(same in v1beta2)
| clusterv1beta1 "sigs.k8s.io/cluster-api/api/core/v1beta1" | ||
| ) | ||
| import clusterv1beta1 "sigs.k8s.io/cluster-api/api/core/v1beta1" | ||
|
|
There was a problem hiding this comment.
nit
(no need of additional empty lines)
|
|
||
| import clusterv1 "sigs.k8s.io/cluster-api/api/core/v1beta2" | ||
|
|
||
|
|
There was a problem hiding this comment.
nit
(no need of additional empty lines)
| // getControlPlaneVersionForJoin returns the control plane (cluster) version from the cluster's ControlPlaneRef, | ||
| // e.g. KubeadmControlPlane.spec.version. Returns ("", nil) if the cluster has no ControlPlaneRef or the referenced | ||
| // control plane does not expose spec.version (ErrFieldNotFound or unset); callers should fall back to the machine's | ||
| // Kubernetes version only in those cases. Returns an error if the control plane object cannot be fetched or if | ||
| // the version field cannot be read for any other reason. |
There was a problem hiding this comment.
I'm still a little bit confused by the semantic of this func and its intended usage.
If the goal of this func is to return the value for the controlPlane.version variable, then there should not be any fallback in the machine's Kubernetes version (and the usage should be the same in init, join workers and join control plane).
Pushing this a little bit further, the entire template resolution should become a internal implementation detail of resolveFile.
I would expect something similar to
| // getControlPlaneVersionForJoin returns the control plane (cluster) version from the cluster's ControlPlaneRef, | |
| // e.g. KubeadmControlPlane.spec.version. Returns ("", nil) if the cluster has no ControlPlaneRef or the referenced | |
| // control plane does not expose spec.version (ErrFieldNotFound or unset); callers should fall back to the machine's | |
| // Kubernetes version only in those cases. Returns an error if the control plane object cannot be fetched or if | |
| // the version field cannot be read for any other reason. | |
| // getControlPlaneVersionForJoin returns the control plane (cluster) version from the cluster's ControlPlaneRef, | |
| // e.g. KubeadmControlPlane.spec.version. Returns ("", nil) if the cluster has no ControlPlaneRef or the referenced | |
| // control plane does not expose spec.version (ErrFieldNotFound or unset). | |
| // Returns an error if the control plane object cannot be fetched or if | |
| // the version field cannot be read for any other reason. |
There was a problem hiding this comment.
Also thinking more about it, it makes sense to just error out instead of falling back to machine's version. It'd be some seriously trouble for a joining worker when it cannot fetch controlPlane.version. Will address this in next commit
There was a problem hiding this comment.
should this be something like kubeadm_join_old_nodes (so we focus on outcomes)?
same comment apply to the func/type name
What this PR does / why we need it
Kubernetes allows some skew between the control plane and kubelets, but kubeadm's own skew policy requires the kubeadm binary used for
kubeadm jointo match the kubeadm used when the cluster was created or last upgraded on that path—so you cannot rely on an older kubeadm on the worker when the control plane is newer.That conflicts with real Cluster API flows (e.g. scaling or remediating workers still on an older Kubernetes while the control plane has moved ahead), as discussed in #13315.
This PR:
KubeadmControlPlanewhen available) and uses it when generating join bootstrap data so join config matches the cluster the node is joining. If the control plane object cannot be read while acontrolPlaneRefis set, reconciliation fails and status conditions surface the error (no silent fallback to the Machine version in that case). When there is no control plane ref or the referenced object does not expose a version, the controller falls back to the Machine's Kubernetes version as before.KubeadmConfig: theControlPlaneKubernetesVersionAvailablecondition stays True for both success paths, but Reason (and Message) distinguish version read from the control plane reference vs version taken from the Machine because the reference is unset or has no version—so operators can see at a glance whether the skew contract is being driven by the cluster control plane or the worker.contentFormatfield tospec.fileswith two values:"raw"(default, content used verbatim as today) and"template"(content rendered as Gotext/template). Template data includes{{ .controlPlane.version }}, so operators can wire their own steps (scripts, package installs, downloads) to install a kubeadm binary that matches the control plane beforekubeadm joinruns—without CAPI prescribing a single install mechanism.KubeadmControlPlanewhere needed for version resolution.cluster-template-topology-kubeadm-versionmoved into its own kustomize sub-folder (consistent with other topology templates) and wired into thegenerate-e2e-templates-mainMakefile target.spec.files, controller tests for the new condition reasons (including scheme fix forWorkerJoinWithControlPlaneRef), and E2E coverage (KubeadmVersionOnJoin+clusterclass-quick-start-kubeadm-version) demonstrating the pattern end-to-end.sequenceDiagram participant CP as Control plane (newer K8s) participant BC as Kubeadm bootstrap controller participant W as Worker (older image / kubelet) BC->>CP: Read KubeadmControlPlane.spec.version CP-->>BC: e.g. v1.35.0 BC->>BC: Build join data + render spec.files templates BC->>W: Render spec.files (contentFormat: template) e.g. fetch script with {{ .controlPlane.version }} Note over W: preKubeadmCommands (operator-defined) W->>W: Install/fetch kubeadm matching CP version W->>CP: kubeadm join (binary matches policy)Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)format, will close the issue(s) when PR gets merged):Related to #13315
/area bootstrap
/area test