Is your feature request related to a problem? Please describe.
Related to #7782
Summary
The node-local load balancing (NLLB) Envoy Pod runs at priority 0. It should
default to the system-node-critical priority class, because the Envoy Pod is
the worker's load-balanced path to the control plane, and a priority of 0
causes it to be evicted/terminated before the workloads that depend on it.
Background
When NLLB is enabled with the EnvoyProxy type, k0s runs an Envoy static Pod on
each worker. Envoy proxies the worker's traffic to the Kubernetes API server (and
konnectivity server) over the loopback interface, e.g. [::1]:7443. Every other
Pod and the kubelet itself reach the control plane through this Envoy Pod.
The Envoy Pod is currently created with no priorityClassName, so its effective
priority is 0, the same as ordinary workloads that set no priority class.
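For illustration, the requested default would amount to one extra field in the
Envoy static Pod spec. This is a minimal sketch, not the manifest k0s actually
generates: the Pod name, image, and hostNetwork setting are placeholders and
assumptions, and only priorityClassName is the point.

```yaml
# Illustrative static Pod manifest (not the one k0s renders).
apiVersion: v1
kind: Pod
metadata:
  name: nllb-envoy                # hypothetical name
  namespace: kube-system
spec:
  # Proposed default; today this field is unset, so the Pod gets priority 0.
  priorityClassName: system-node-critical
  hostNetwork: true               # assumption: Envoy binds the node's loopback
  containers:
    - name: envoy
      image: example.invalid/envoy:placeholder   # placeholder image reference
```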
The problem
This matters in practice. With graceful node shutdown enabled
(shutdownGracePeriod / shutdownGracePeriodCriticalPods via a worker profile),
the kubelet shutdown manager terminates non-critical pods first and critical
pods last. Because the Envoy Pod has priority 0, it is terminated in the first
phase, severing the worker's path to the API server ([::1]:7443) before the
remaining pods can drain or report status:
Failed to update status for pod ...: Patch "https://[::1]:7443/...": unexpected EOF
... dial tcp [::1]:7443: connect: connection refused
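A configuration along these lines should be enough to hit the scenario above.
It is a sketch under assumptions: the profile name and both durations are
made-up values, and the worker would need to be started with this profile
selected.

```yaml
# Hypothetical k0s ClusterConfig enabling NLLB and graceful node shutdown.
apiVersion: k0s.k0sproject.io/v1beta1
kind: ClusterConfig
metadata:
  name: k0s
spec:
  network:
    nodeLocalLoadBalancing:
      enabled: true
      type: EnvoyProxy
  workerProfiles:
    - name: graceful-shutdown     # hypothetical profile name
      values:
        # Kubelet configuration overrides; durations are illustrative only.
        shutdownGracePeriod: 60s              # total shutdown delay
        shutdownGracePeriodCriticalPods: 20s  # portion reserved for critical pods
```

With values like these, non-critical pods (including a priority-0 Envoy Pod)
would be terminated during the first 40 seconds, and critical pods during the
final 20 seconds.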
The same priority-0 exposure also applies to node-pressure eviction: under
resource pressure the kubelet can evict the Envoy Pod ahead of higher-priority
workloads, cutting off the worker's only path to the control plane.
Describe the solution you would like
No response
Describe alternatives you've considered
No response
Additional context
No response