Skip to content

Commit cef1764

Browse files
committed
KEP-3545: graduate to GA
Signed-off-by: pprokop <[email protected]>
1 parent e85182b commit cef1764

File tree

3 files changed

+59
-19
lines changed

3 files changed

+59
-19
lines changed

Diff for: keps/prod-readiness/sig-node/3545.yaml

+2
Original file line numberDiff line numberDiff line change
@@ -3,3 +3,5 @@ alpha:
33
approver: "@johnbelamaric"
44
beta:
55
approver: "@johnbelamaric"
6+
stable:
7+
approver: "@johnbelamaric"

Diff for: keps/sig-node/3545-improved-multi-numa-alignment/README.md

+53-16
Original file line numberDiff line numberDiff line change
@@ -49,6 +49,9 @@ tags, and then generate with `hack/update-toc.sh`.
4949
- [Scalability](#scalability)
5050
- [Troubleshooting](#troubleshooting)
5151
- [Implementation History](#implementation-history)
52+
- [Drawbacks](#drawbacks)
53+
- [Alternatives](#alternatives)
54+
- [Infrastructure Needed (Optional)](#infrastructure-needed-optional)
5255
<!-- /toc -->
5356

5457
## Release Signoff Checklist
@@ -74,10 +77,10 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
7477
- [x] (R) Design details are appropriately documented
7578
- [x] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
7679
- [ ] e2e Tests for all Beta API Operations (endpoints)
77-
- [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
80+
- [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
7881
- [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
7982
- [x] (R) Graduation criteria is in place
80-
- [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
83+
- [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
8184
- [x] (R) Production readiness review completed
8285
- [ ] (R) Production readiness review approved
8386
- [x] "Implementation History" section is up-to-date for milestone
@@ -124,7 +127,7 @@ This limitation surfaces in multi-socket, as well as single-socket multi NUMA sy
124127

125128

126129
### Proposed Change
127-
130+
128131
We propose to
129132
- add a new flag in Kubelet called `TopologyManagerPolicyOptions` in the kubelet config or command line argument called `topology-manager-policy-options` which allows the user to specify the Topology Manager policy option.
130133
- add a new topology manager option called `prefer-closest-numa-nodes`; if present, this option will enable further refinements of the existing `restricted` and `best-effort` policies, this option has no effect for `none` and `single-numa-node` policies.
@@ -138,11 +141,7 @@ To summarize properties of `prefer-closest-numa-nodes` policy:
138141
* Choose `NUMANodeAffinity` bitmask which is the narrowest and if the bitmask width is the same prefer the bitmask with smaller average distance.
139142

140143
When `prefer-closest-numa-nodes` policy is enabled, we need to retrieve information regarding distances between NUMA nodes.
141-
Right now Topology manager discovers Node layout using [CAdvisor API](https://github.com/google/cadvisor/blob/master/info/v1/machine.go#L40).
142-
143-
We will need to extend the `MachineInfo` struct with a `Distances` field which will describe the distance between a given NUMA node and other NUMA nodes in the system.
144-
This is already implemented in `cadvisor` by this [patch](https://github.com/google/cadvisor/pull/3179) but it is not yet present in any of the released versions.
145-
Until a new release of `cadvisor` includes this patch (and it gets vendored into the `kubelet`) we will need to replicate this logic in the `kubelet` code itself.
144+
Right now Topology manager discovers Node layout using [CAdvisor API](https://github.com/google/cadvisor/blob/master/info/v1/machine.go#L40), which already exposes NUMA distance between nodes since version `v0.46.0`.
146145

147146
### Implementation strategy
148147

@@ -155,7 +154,7 @@ Until a new release of `cadvisor` includes this patch (and it gets vendored into
155154
- When `TopologyManager` is being created it discovers distances between NUMA nodes and stores them inside `manager` struct. This is temporary until `distance` information lands in `cadvisor`.
156155
- Pass `TopologyManagerPolicyOptions` to best-effort and restricted policy. When this is specified best-hint is picked based on average distance between NUMA nodes. This would require modification to `compareHints` function to change how the best hint is calculated:
157156

158-
```go
157+
```go
159158

160159
// NUMADistance is a matrix representing distances between NUMA nodes
161160
type NUMADistance [][]uint64
@@ -178,16 +177,16 @@ func compareHints(bestNonPreferredAffinityCount int, current *TopologyHint, cand
178177
if current.Preferred && candidate.Preferred {
179178
if candidate.NUMANodeAffinity.IsNarrowerThan(current.NUMANodeAffinity) {
180179
return candidate
181-
}
180+
}
182181
if policyOpts.PreferClosestNuma && candidate.NUMANodeAffinity.IsEqual(current.NUMANodeAffinity) {
183182
candidateDistance := policyOpts.Distances.CalculateAvgDistanceFor(candidate)
184183
currentDistance := policyOpts.Distances.CalculateAvgDistanceFor(current)
185184
// candidate avg distance is lower
186185
if candidateDistance < currentDistance {
187186
return candidate
188-
}
187+
}
189188

190-
return current
189+
return current
191190
}
192191
}
193192

@@ -253,6 +252,7 @@ to implement this enhancement.
253252

254253
- `k8s.io/kubernetes/pkg/kubelet/cm/topologymanager`: `09-23-2022` - `92.4`
255254
- `k8s.io/kubernetes/pkg/kubelet/cm/topologymanager`: `06-12-2023` - `93.2`
255+
- `k8s.io/kubernetes/pkg/kubelet/cm/topologymanager`: `09-30-2024` - `92.4`
256256

257257
##### Integration tests
258258

@@ -289,7 +289,7 @@ These cases will be added in the existing e2e tests:
289289

290290
In 1.26 we are releasing this feature to Alpha. We propose the following management of TopologyManager policy options graduation:
291291

292-
- `TopologyManagerPolicyOptions` for enabling/disabling the entire feature. As this is an alpha feature, this feature gate would be disabled by default.
292+
- `TopologyManagerPolicyOptions` for enabling/disabling the entire feature. As this is an alpha feature, this feature gate would be disabled by default.
293293
Explicitly enabling `TopologyManagerPolicyOptions` feature gate provides us the ability to supply `TopologyManagerPolicyOptions` or `topology-manager-policy-options` flag in Kubelet.
294294

295295
- `TopologyManagerPolicyAlphaOptions` is not enabled by default. Topology Manager alpha options (only one as of 1.26), are hidden by default
@@ -309,6 +309,10 @@ In 1.28 this feature is being promoted to Beta. We propose following changes to
309309
- `TopologyManagerPolicyBetaOptions` feature flag for enabling/disabling beta options will be enabled by default.
310310
- `prefer-closest-numa-nodes` will be moved to Beta options.
311311

312+
In 1.32 this feature is being promoted to Beta. We propose following changes to TopologyManager policy options default visibility:
313+
314+
- `prefer-closest-numa-nodes` will be moved to stable options.
315+
312316
The graduation Criteria of options is described below:
313317

314318
#### Graduation of Options to `Beta-quality` (non-hidden)
@@ -342,7 +346,7 @@ No changes needed
342346
- Components depending on the feature gate: kubelet
343347
- [x] Change the kubelet configuration to set a `TopologyManager` policy to `restricted` or `best-effort` and a `TopologyManagerPolicyOptions` to `prefer-closest-numa-nodes`
344348
- Will enabling / disabling the feature require downtime of the control
345-
plane?
349+
plane?
346350
No.
347351
- Will enabling / disabling the feature require downtime or reprovisioning
348352
of a node?
@@ -455,11 +459,11 @@ N/A.
455459

456460
There are 2 scenarios where Kubelet may fail to start due to using this feature:
457461

458-
- Bad policy option name or using policy option without enabling appropriate feature flag. we are emitting appropriate error message for this case,
462+
- Bad policy option name or using policy option without enabling appropriate feature flag. we are emitting appropriate error message for this case,
459463
Kubelet will fail to start and print error message what happened. To recover one just have to provide fix policy option name or disable/enable feature flags.
460464

461465
- Cadvisor is not exposing distances for NUMA domains. In this case Kubelet will fail with `error getting NUMA distances from cadvisor` message.
462-
Reading NUMA distances is only performed when `prefer-clostest-numa-nodes` option is specified.
466+
Reading NUMA distances is only performed when `prefer-closest-numa-nodes` option is specified.
463467
To recover one has to either disable `TopologyManagerPolicyOptions` feature-flag or stop using `prefer-closest-numa-nodes` option.
464468

465469
###### What steps should be taken if SLOs are not being met to determine the problem?
@@ -470,3 +474,36 @@ N/A.
470474

471475
- 2021-09-26: KEP created
472476
- 2023-06-12: KEP updated for Beta release
477+
- 2024-09-30: KEP updated for Stable release
478+
479+
480+
## Drawbacks
481+
482+
<!--
483+
Why should this KEP _not_ be implemented?
484+
-->
485+
Adds complexity to Topology Manager.
486+
487+
## Alternatives
488+
489+
<!--
490+
What other approaches did you consider, and why did you rule them out? These do
491+
not need to be as detailed as the proposal, but should include enough
492+
information to express the idea and why it was not acceptable.
493+
-->
494+
495+
Adding new Topology Manager policy similar to `best-effort` which also takes NUMA distance into consideration.
496+
This approach was ruled out as NUMA distance can be an important factor not only for `best-effort` but also `restricted` policy.
497+
498+
So introducing a mechanism similar to CPUManagerPolicyOptions was an already familiar concept that could influence logic of multiple policies.
499+
500+
## Infrastructure Needed (Optional)
501+
502+
<!--
503+
Use this section if you need things from the project/SIG. Examples include a
504+
new subproject, repos requested, or GitHub details. Listing these here allows a
505+
SIG to get the process for these resources started right away.
506+
-->
507+
508+
To be able to do e2e testing it would be required that CI machines with at least 4 NUMA nodes exist to be able to use `prefer-closest-numa-nodes` policy option properly.
509+
The [effort to add machines for sig-node e2e tests](https://github.com/kubernetes/k8s.io/issues/7339) is still open and it shouldn't block this KEP from graduating to GA.

Diff for: keps/sig-node/3545-improved-multi-numa-alignment/kep.yaml

+4-3
Original file line numberDiff line numberDiff line change
@@ -10,22 +10,23 @@ creation-date: "2022-09-26"
1010
reviewers: []
1111
approvers:
1212
- "@sig-node-tech-leads"
13+
last-updated: "2024-09-30"
1314
see-also: []
1415
replaces: []
1516

1617
# The target maturity stage in the current dev cycle for this KEP.
17-
stage: beta
18+
stage: stable
1819

1920
# The most recent milestone for which work toward delivery of this KEP has been
2021
# done. This can be the current (upcoming) milestone, if it is being actively
2122
# worked on.
22-
latest-milestone: "v1.28"
23+
latest-milestone: "v1.32"
2324

2425
# The milestone at which this feature was, or is targeted to be, at each stage.
2526
milestone:
2627
alpha: "v1.26"
2728
beta: "v1.28"
28-
stable: "v1.30"
29+
stable: "v1.32"
2930

3031
# The following PRR answers are required at alpha release
3132
# List the feature gate name and the components for which it must be enabled

0 commit comments

Comments
 (0)