We will start with 3 control plane nodes and 5 worker nodes, then scale up in increments of 10 worker nodes per iteration until we reach the maximum number of available nodes.
Based on Red Hat documentation, the control plane should be able to handle 500+ nodes; the first limitation the cluster is likely to hit is network throughput. Add nodes 10 at a time and run through the scaling tests after each batch has joined the cluster. Once pod ready latency reaches an unacceptable level, or control plane memory/CPU usage climbs above 80%, consider adding infra nodes to take some load off of the control plane.
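On clusters where workers are managed by MachineSets, adding a batch of 10 might look like the following sketch (the MachineSet name and replica count are placeholders to substitute for your environment):
  oc get machinesets -n openshift-machine-api
  oc scale machineset <machineset-name> -n openshift-machine-api --replicas=<current+10>
  oc get nodes -l node-role.kubernetes.io/worker --watch   # wait for the new nodes to report Ready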
Once the nodes from each interval report Ready, we will run kube-burner. The goal here is two-fold: first, to verify that pods can be scheduled on every node; second, to confirm that we can max out node capacity without needing to add infra nodes to support the three control plane nodes. During each run we will monitor:
- Kubelet and CRI-O CPU and Memory Usage
- API Server and ETCD CPU and Memory Usage
- Control Node and Worker Node CPU and Memory Usage
- API Response Latency (99th percentile)
- ETCD leader elections (there should be none)
- Pod scheduling and ready latencies
- Cluster Operator Health
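Most of these metrics come from the cluster monitoring stack, but quick spot checks can be made from the CLI; for example (namespace names assume a recent OCP 4.x cluster):
  oc adm top nodes                              # control plane and worker node CPU/memory usage
  oc adm top pods -n openshift-etcd             # ETCD pod resource usage
  oc adm top pods -n openshift-kube-apiserver   # API server pod resource usage
  oc get clusteroperators                       # cluster operator health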
These tests are based on examples provided by the kube-burner project.
kubelet-density-heavy:
Verifies maximum pods per node. It creates a client/server application: a basic client that performs queries against a pod running PostgreSQL, reaching it through a Kubernetes Service. Per-interval targets are listed below; a pods-per-node spot check follows the table.
Worker Nodes | Pods | Containers | Projects |
---|---|---|---|
5 | 90 | 180 | 9 |
15 | 270 | 540 | 26 |
25 | 450 | 900 | 43 |
35 | 630 | 1260 | 60 |
45 | 810 | 1620 | 77 |
55 | 990 | 1980 | 94 |
65 | 1170 | 2340 | 111 |
18 pods per node, 2 containers per pod
1.7 projects per node, rounded up
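After a run at a given interval, the pods-per-node target can be spot-checked with a field selector (the node name below is a placeholder):
  oc get pods --all-namespaces -o wide --field-selector spec.nodeName=<worker-node-name> --no-headers | wc -l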
cluster-density:
This workload creates resources such as builds and routes to stress the OpenShift control plane. Per-interval targets are listed below, followed by a sketch for deriving JOB_ITERATIONS from the live worker count.
Worker Nodes | Projects | ConfigMaps | ClusterIP Services | Secrets | Deployments/ReplicaSets |
---|---|---|---|---|---|
5 | 3 | 45 | 50 | 55 | 8 |
15 | 9 | 135 | 150 | 165 | 23 |
25 | 15 | 225 | 250 | 275 | 38 |
35 | 21 | 315 | 350 | 385 | 53 |
45 | 27 | 405 | 450 | 495 | 68 |
55 | 33 | 495 | 550 | 605 | 83 |
65 | 39 | 585 | 650 | 715 | 98 |
0.6 projects per worker node
9 configMaps per worker node
10 clusterIP services per worker node
11 secrets per worker node
1.5 deployments and replicaSets per worker node
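These per-node ratios map directly to JOB_ITERATIONS. A minimal sketch for deriving the values from the live worker count, assuming the default worker role label:
  WORKERS=$(oc get nodes -l node-role.kubernetes.io/worker --no-headers | wc -l)
  echo $(( (WORKERS * 17 + 9) / 10 ))   # kubelet-density-heavy: 1.7 projects per worker, rounded up
  echo $(( (WORKERS * 6 + 9) / 10 ))    # cluster-density: 0.6 projects per worker, rounded up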
- Download the kube-burner binary from the kube-burner GitHub releases page:
  wget https://github.com/cloud-bulldozer/kube-burner/releases/download/v0.16.1/kube-burner-0.16.1-Linux-x86_64.tar.gz
- Extract the binary:
  tar -xvzf kube-burner-0.16.1-Linux-x86_64.tar.gz
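- Optionally, move the binary onto the PATH (this assumes the archive extracts a kube-burner binary into the current directory):
  sudo mv kube-burner /usr/local/bin/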
- Set the required environment variables:
  export ES_ENDPOINT="https://endpoint.es.com"   # Endpoint for the Elasticsearch server. Leave blank if none
  export JOB_ITERATIONS=8                        # Match to the corresponding number of projects for the available worker nodes (see tables above)
- Run the test:
  kube-burner init -c <test-config.yaml>   # Capture the UUID from the output; it is also applied as the kube-burner-uuid label on all resources created
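  If the UUID was not captured from the output, it can be recovered from the kube-burner-uuid label on the created resources; for example:
  oc get namespaces -l kube-burner-uuid --show-labels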
- Remove the resources created by the test:
  kube-burner destroy --uuid=<captured-uuid>
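To confirm the cleanup finished, the same label can be checked again; an empty result means nothing was left behind:
  oc get namespaces -l kube-burner-uuid=<captured-uuid>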