The spec.topology field added to the Cluster object as part of ClusterClass allows changes made on the Cluster to be propagated across all relevant objects. This means the Cluster object can be used as a single point of control for making changes to objects that are part of the Cluster, including the ControlPlane and MachineDeployments.
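For reference, a managed Cluster's spec.topology looks roughly like the sketch below (the class name and values are illustrative; the field names follow the Cluster API v1beta1 API):

spec:
  topology:
    class: quick-start        # name of the ClusterClass the Cluster is derived from
    version: v1.21.2          # Kubernetes version for the whole Cluster
    controlPlane:
      replicas: 1             # number of control plane machines
    workers:
      machineDeployments:
      - class: default-worker # MachineDeployment class defined in the ClusterClass
        name: md-0
        replicas: 3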
A managed Cluster can be used to:
- Upgrade a Cluster
- Scale a ControlPlane
- Scale a MachineDeployment
- Add a MachineDeployment
- Use variables in a Cluster
- Rebase a Cluster to a different ClusterClass
- Upgrade Cluster API
- Tips and tricks
With a managed topology, upgrading the Kubernetes version of a cluster is a one-touch operation.
Let's assume we have created a CAPD cluster with ClusterClass and specified Kubernetes v1.21.2 (as documented in the Quick Start guide). Specifying the version is done when running clusterctl generate cluster. Looking at the cluster, the version of the control plane and the MachineDeployments is v1.21.2.
> kubectl get kubeadmcontrolplane,machinedeployments
NAME                                                                              CLUSTER                   INITIALIZED   API SERVER AVAILABLE   REPLICAS   READY   UPDATED   UNAVAILABLE   AGE     VERSION
kubeadmcontrolplane.controlplane.cluster.x-k8s.io/clusterclass-quickstart-XXXX   clusterclass-quickstart   true          true                   1          1       1         0             2m21s   v1.21.2

NAME                                                                             CLUSTER                   REPLICAS   READY   UPDATED   UNAVAILABLE   PHASE     AGE     VERSION
machinedeployment.cluster.x-k8s.io/clusterclass-quickstart-linux-workers-XXXX   clusterclass-quickstart   1          1       1         0             Running   2m21s   v1.21.2
To update the Cluster the only change needed is to the version field under spec.topology in the Cluster object. Change v1.21.2 to v1.22.0 as below.
kubectl patch cluster clusterclass-quickstart --type json --patch '[{"op": "replace", "path": "/spec/topology/version", "value": "v1.22.0"}]'
The patch will make the following change to the Cluster yaml:
spec:
  topology:
    class: quick-start
+   version: v1.22.0
-   version: v1.21.2
Important Note: A +2 minor Kubernetes version upgrade is not allowed in Cluster Topologies. This aligns with existing control plane providers, like the KubeadmControlPlane provider, which do not allow a +2 minor version upgrade. Example: Upgrading from 1.21.2 to 1.23.0 is not allowed.
The upgrade will take some time to roll out as it will take place machine by machine with older versions of the machines only being removed after healthy newer versions come online.
To watch the update progress run:
watch kubectl get kubeadmcontrolplane,machinedeployments
After a few minutes the upgrade will be complete and the output will be similar to:
NAME                                                                              CLUSTER                   INITIALIZED   API SERVER AVAILABLE   REPLICAS   READY   UPDATED   UNAVAILABLE   AGE     VERSION
kubeadmcontrolplane.controlplane.cluster.x-k8s.io/clusterclass-quickstart-XXXX   clusterclass-quickstart   true          true                   1          1       1         0             7m29s   v1.22.0

NAME                                                                             CLUSTER                   REPLICAS   READY   UPDATED   UNAVAILABLE   PHASE     AGE     VERSION
machinedeployment.cluster.x-k8s.io/clusterclass-quickstart-linux-workers-XXXX   clusterclass-quickstart   1          1       1         0             Running   7m29s   v1.22.0
When using a managed topology, scaling of MachineDeployments, both up and down, should be done through the Cluster topology.
Assume we have created a CAPD cluster with ClusterClass and Kubernetes v1.23.3 (as documented in the Quick Start guide). Initially we should have a MachineDeployment with 3 replicas. Running kubectl get machinedeployments will give us:
NAME                                                            CLUSTER           REPLICAS   READY   UPDATED   UNAVAILABLE   PHASE     AGE   VERSION
machinedeployment.cluster.x-k8s.io/capi-quickstart-md-0-XXXX    capi-quickstart   3          3       3         0             Running   21m   v1.23.3
We can scale this MachineDeployment up or down through the Cluster object by changing the replicas field under /spec/topology/workers/machineDeployments/0/replicas. The 0 in the path refers to the position of the target MachineDeployment in the list in our Cluster topology. As we only have one MachineDeployment we're targeting the first item in the list under /spec/topology/workers/machineDeployments/.
To change this value with a patch:
kubectl patch cluster capi-quickstart --type json --patch '[{"op": "replace", "path": "/spec/topology/workers/machineDeployments/0/replicas", "value": 1}]'
This patch will make the following changes to the Cluster yaml:
spec:
  topology:
    workers:
      machineDeployments:
      - class: default-worker
        name: md-0
        metadata: {}
+       replicas: 1
-       replicas: 3
After a minute the MachineDeployment will have scaled down to 1 replica:
NAME                         CLUSTER           REPLICAS   READY   UPDATED   UNAVAILABLE   PHASE     AGE   VERSION
capi-quickstart-md-0-XXXXX   capi-quickstart   1          1       1         0             Running   25m   v1.23.3
As well as scaling a MachineDeployment, Cluster operators can edit the labels and annotations applied to a running MachineDeployment using the Cluster topology as a single point of control.
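For example, a label could be added to all Machines of this MachineDeployment by patching the metadata in the Cluster topology (the environment: dev label below is purely illustrative):

kubectl patch cluster capi-quickstart --type json --patch '[{"op": "add", "path": "/spec/topology/workers/machineDeployments/0/metadata/labels", "value": {"environment": "dev"}}]'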
MachineDeployments in a managed Cluster are defined in the Cluster's topology. Cluster operators can add a MachineDeployment to a living Cluster by adding it to the cluster.spec.topology.workers.machineDeployments field.
Assume we have created a CAPD cluster with ClusterClass and Kubernetes v1.23.3 (as documented in the Quick Start guide). Initially we should have a single MachineDeployment with 3 replicas. Running kubectl get machinedeployments will give us:
NAME                                                            CLUSTER           REPLICAS   READY   UPDATED   UNAVAILABLE   PHASE     AGE   VERSION
machinedeployment.cluster.x-k8s.io/capi-quickstart-md-0-XXXX    capi-quickstart   3          3       3         0             Running   21m   v1.23.3
A new MachineDeployment can be added to the Cluster by adding a new MachineDeployment spec under /spec/topology/workers/machineDeployments/. To do so we can patch our Cluster with:
kubectl patch cluster capi-quickstart --type json --patch '[{"op": "add", "path": "/spec/topology/workers/machineDeployments/-", "value": {"name": "second-deployment", "replicas": 1, "class": "default-worker"} }]'
This patch will make the below changes to the Cluster yaml:
spec:
  topology:
    workers:
      machineDeployments:
      - class: default-worker
        metadata: {}
        replicas: 3
        name: md-0
+     - class: default-worker
+       metadata: {}
+       replicas: 1
+       name: second-deployment
After a minute, once the new MachineDeployment has scaled up, we get:
NAME                                     CLUSTER           REPLICAS   READY   UPDATED   UNAVAILABLE   PHASE     AGE   VERSION
capi-quickstart-md-0-XXXX                capi-quickstart   1          1       1         0             Running   39m   v1.23.3
capi-quickstart-second-deployment-XXXX   capi-quickstart   1          1       1         0             Running   99s   v1.23.3
Our second deployment uses the same underlying MachineDeployment class default-worker as our initial deployment. In this case they will both have exactly the same underlying machine templates. In order to modify the templates MachineDeployments are based on, take a look at Changing a ClusterClass.
A similar process as that described here - removing the MachineDeployment from cluster.spec.topology.workers.machineDeployments - can be used to delete a running MachineDeployment from an active Cluster.
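For instance, the second-deployment added above could be removed again with a JSON patch remove operation; the index 1 assumes it is the second entry in the machineDeployments list:

kubectl patch cluster capi-quickstart --type json --patch '[{"op": "remove", "path": "/spec/topology/workers/machineDeployments/1"}]'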
When using a managed topology, scaling of ControlPlane Machines (where the Cluster is using a topology that includes ControlPlane MachineInfrastructure) should be done through the Cluster topology. This is done by changing the ControlPlane replicas field at /spec/topology/controlPlane/replicas in the Cluster object. The command is:
kubectl patch cluster capi-quickstart --type json --patch '[{"op": "replace", "path": "/spec/topology/controlPlane/replicas", "value": 1}]'
This patch will make the below changes to the Cluster yaml:
spec:
  topology:
    controlPlane:
      metadata: {}
+     replicas: 1
-     replicas: 3
As well as scaling a ControlPlane, Cluster operators can edit the labels and annotations applied to a running ControlPlane using the Cluster topology as a single point of control.
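For example, an annotation could be set on the ControlPlane through the Cluster topology (the owner: platform-team annotation below is purely illustrative):

kubectl patch cluster capi-quickstart --type json --patch '[{"op": "add", "path": "/spec/topology/controlPlane/metadata/annotations", "value": {"owner": "platform-team"}}]'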
A ClusterClass can use variables and patches in order to allow flexible customization of Clusters derived from a ClusterClass. Variable definition allows two or more Cluster topologies derived from the same ClusterClass to have different specs, with the differences controlled by variables in the Cluster topology.
Assume we have created a CAPD cluster with ClusterClass and Kubernetes v1.23.3 (as documented in the Quick Start guide). Our Cluster has a variable etcdImageTag as defined in the ClusterClass. The variable is not set on our Cluster. Some variables, depending on their definition in a ClusterClass, may need to be specified by the Cluster operator for every Cluster created using a given ClusterClass.
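For context, a variable like etcdImageTag is declared in the ClusterClass roughly as follows (a sketch only; the schema and defaults in the real quick-start ClusterClass may differ):

apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
metadata:
  name: quick-start
spec:
  variables:
  - name: etcdImageTag
    required: false
    schema:
      openAPIV3Schema:
        type: string
        default: ""
        description: "etcdImageTag sets the tag for the etcd image."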
In order to specify the value of a variable, all we have to do is set the value in the Cluster topology.
We can see the current unset variable with:
kubectl get cluster capi-quickstart -o jsonpath='{.spec.topology.variables[1]}'
Which will return something like:
{"name":"etcdImageTag","value":""}
In order to run a different version of etcd in new ControlPlane machines - the part of the spec this variable sets - change the value using the below patch:
kubectl patch cluster capi-quickstart --type json --patch '[{"op": "replace", "path": "/spec/topology/variables/1/value", "value": "3.5.0"}]'
Running the patch makes the following change to the Cluster yaml:
spec:
  topology:
    variables:
    - name: imageRepository
      value: registry.k8s.io
    - name: etcdImageTag
+     value: "3.5.0"
-     value: ""
    - name: coreDNSImageTag
      value: ""
Retrieving the variable value from the Cluster object, with kubectl get cluster capi-quickstart -o jsonpath='{.spec.topology.variables[1]}', we can see:
{"name":"etcdImageTag","value":"3.5.0"}
Note: Changing the etcd version may have unintended impacts on a running Cluster. For safety, the cluster should be reapplied after running the above variable patch.
To perform more significant changes using a Cluster as a single point of control, it may be necessary to change the ClusterClass that the Cluster is based on. This is done by changing the class referenced in /spec/topology/class.
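For example, assuming a second ClusterClass named quick-start-v2 exists in the same namespace (the name is illustrative), the Cluster could be rebased onto it with:

kubectl patch cluster capi-quickstart --type json --patch '[{"op": "replace", "path": "/spec/topology/class", "value": "quick-start-v2"}]'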
To read more about changing an underlying class please refer to ClusterClass rebase.
Users should always aim at ensuring the stability of the Cluster and of the applications hosted on it while using spec.topology as a single point of control for making changes to the objects that are part of the Cluster. The following recommendations apply:
- If possible, avoid concurrent changes to the control plane and/or MachineDeployments to prevent excessive turnover on the underlying infrastructure or bottlenecks in the Cluster while it moves workloads from one machine to another.
- Keep machine labels and annotations stable, because changing those values requires machine rollouts; also, please note that machine labels and annotations are not propagated to Kubernetes nodes; see metadata propagation.
- While upgrading a Cluster, if possible avoid any other concurrent change to the Cluster; please note that you can rely on version-aware patches to ensure the Cluster adapts to the new Kubernetes version in sync with the upgrade workflow.
For more details about how changes can affect a Cluster, please look at reference.
When applying concurrent changes to a Cluster, the topology controller will immediately act to reconcile towards the desired state, proxying all the required changes to the underlying objects, which in turn take action; this might require rolling out machines (creating new ones and deleting old ones).
As noted above, when executed at scale this might create excessive turnover on the underlying infrastructure or bottlenecks in the Cluster while it moves workloads from one machine to another.
Additionally, changing the Kubernetes version together with other concurrent changes to MachineDeployments could lead to a double rollout of the worker nodes:
- The first rollout is triggered by the changes to the MachineDeployments, which are applied to the underlying objects immediately (e.g. a change of labels).
- The second rollout is triggered by the upgrade workflow changing the MachineDeployment version, which happens only after the control plane upgrade is completed (see upgrade a cluster above).
Please note that:
- Cluster API already implements strategies to ensure changes to a Cluster are executed in a safe way under most circumstances, including users occasionally not acting according to the above best practices;
- The above-mentioned strategies are currently implemented on the abstractions controlling a single set of machines, the control plane (KCP) or the MachineDeployment;
- In the future, managed topologies could be improved by introducing strategies to ensure higher safety across all the abstractions controlling Machines in a Cluster, but this work is currently at an initial stage and user feedback could help shape those improvements.
- Similarly, in the future we might consider implementing strategies to control changes across many Clusters.
There are some special considerations for ClusterClass regarding Cluster API upgrades when the upgrade includes a bump of the apiVersion of infrastructure, bootstrap or control plane provider CRDs.
The recommended approach is to first upgrade Cluster API and then update the apiVersions in the ClusterClass references. By following these steps, there won't be any disruption of reconciliation, as the Cluster topology controller is able to reconcile the Cluster even with the old apiVersions in the ClusterClass.
Note: The apiVersions in ClusterClass cannot be updated before Cluster API because the new apiVersions don't exist in the management cluster before the Cluster API upgrade.
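To check which apiVersions of a provider CRD exist on the management cluster, the CRD can be inspected directly; the CRD name below is an example for the Docker provider:

kubectl get crd dockerclustertemplates.infrastructure.cluster.x-k8s.io -o jsonpath='{.spec.versions[*].name}'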
In general, the Cluster topology controller always uses exactly the versions of the CRDs referenced in the ClusterClass. This means in the following example the Cluster topology controller will always use v1beta1 when reconciling/applying patches for the infrastructure ref, even if the DockerClusterTemplate already has a v1beta2 apiVersion.
apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
metadata:
  name: quick-start
  namespace: default
spec:
  infrastructure:
    ref:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: DockerClusterTemplate
      ...
When upgrading the apiVersions in references in the ClusterClass, the corresponding patches have to be changed accordingly. This includes bumping the apiVersion in the patch selector and potentially updating the JSON patch to account for changes in the new apiVersion of the referenced CRD. The following example shows how to upgrade the ClusterClass in this case.
ClusterClass with the old apiVersion:
apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
metadata:
  name: quick-start
spec:
  infrastructure:
    ref:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: DockerClusterTemplate
      ...
  patches:
  - name: lbImageRepository
    definitions:
    - selector:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: DockerClusterTemplate
        matchResources:
          infrastructureCluster: true
      jsonPatches:
      - op: add
        path: "/spec/template/spec/loadBalancer/imageRepository"
        valueFrom:
          variable: lbImageRepository
ClusterClass with the new apiVersion:
apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
metadata:
  name: quick-start
spec:
  infrastructure:
    ref:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta2 # apiVersion updated
      kind: DockerClusterTemplate
      ...
  patches:
  - name: lbImageRepository
    definitions:
    - selector:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta2 # apiVersion updated
        kind: DockerClusterTemplate
        matchResources:
          infrastructureCluster: true
      jsonPatches:
      - op: add
        # Path has been updated, as in this example imageRepository has been renamed
        # to imageRepo in v1beta2 of DockerClusterTemplate.
        path: "/spec/template/spec/loadBalancer/imageRepo"
        valueFrom:
          variable: lbImageRepository
If external patches are used in the ClusterClass, make sure that all external patches support the new apiVersion before bumping apiVersions.