VPA: Implement in-place updates support #7673

Open
wants to merge 22 commits into
base: master
22 commits
dcea4ca
VPA: Add UpdateModeInPlaceOrRecreate to types
jkyros Mar 14, 2024
a4b91de
VPA: Allow admission-controller to validate in-place spec
maxcao13 Jan 7, 2025
0f5eb1d
VPA: Stuck in-place resizes still require eviction
jkyros Mar 14, 2024
3451011
VPA: Make eviction restriction in-place aware
jkyros Mar 14, 2024
c5549fb
VPA: Updater logic allows in-place scaling
jkyros Mar 14, 2024
88d64c3
VPA: Add metrics gauges for in-place updates
jkyros Mar 14, 2024
ec7863f
VPA: Update mocks to accommodate in-place VPA changes
jkyros Mar 14, 2024
84588e4
VPA: hack unit tests to account for in-place
jkyros Mar 14, 2024
bfb0321
VPA: Add e2e tests for in-place scaling
jkyros Mar 23, 2024
d6a8ab4
VPA: allow rule-breaking updates if disruptionless
jkyros Mar 23, 2024
94612ce
VPA: only allow in-place if explicitly set
jkyros Mar 21, 2024
cca0b60
VPA: Allow VPA updater to actuate recommendations in-place
maxcao13 Dec 20, 2024
8977802
VPA: Enable InPlacePodVerticalScaling feature flag on e2e
maxcao13 Jan 7, 2025
08ba2fb
VPA: Add in-place actuation and admission-controller e2e tests
maxcao13 Jan 7, 2025
6d8be83
VPA: fix logs and cleanup TODOs according to review
maxcao13 Jan 15, 2025
981ea57
VPA: Revert changes that related to disruption/disruptionless changes
maxcao13 Feb 24, 2025
acd5a15
VPA: Fix in-place actuation tests to align with updated AEP
maxcao13 Feb 25, 2025
742d694
VPA: Add features gates; add InPlaceVerticalScaling feature gate
maxcao13 Mar 7, 2025
ed61b77
VPA: update updater unit tests
maxcao13 Mar 11, 2025
aa8b3ff
VPA: Revert using containerStatus resources to calculate update priority
maxcao13 Mar 11, 2025
3825298
VPA: updated in-place e2e tests to account for feature gate
maxcao13 Mar 11, 2025
8044ed7
VPA: add InPlaceVerticalScaling feature gate to admission-controller
maxcao13 Mar 11, 2025
@@ -47,15 +47,3 @@ spec:
- name: tls-certs
secret:
secretName: vpa-tls-certs
---
apiVersion: v1
kind: Service
metadata:
  name: vpa-webhook
  namespace: kube-system
spec:
  ports:
    - port: 443
      targetPort: 8000
  selector:
    app: vpa-admission-controller
11 changes: 11 additions & 0 deletions vertical-pod-autoscaler/deploy/admission-controller-service.yaml
@@ -0,0 +1,11 @@
apiVersion: v1
kind: Service
metadata:
  name: vpa-webhook
  namespace: kube-system
spec:
  ports:
    - port: 443
      targetPort: 8000
  selector:
    app: vpa-admission-controller
26 changes: 26 additions & 0 deletions vertical-pod-autoscaler/deploy/vpa-rbac.yaml
@@ -121,6 +121,32 @@ rules:
- create
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:vpa-updater-in-place
rules:
  - apiGroups:
      - ""
    resources:
      - pods/resize
      - pods
Contributor:

Curious why we still need patch on the pod itself. Isn't pods/resize sufficient?

maxcao13 (Member, Author), Mar 14, 2025:

It should be sufficient for resizing, but in order to patch the annotations onto the pod itself, I need that rule as well. Previously, when the admission-controller adds its own annotation (vpaObservedContainers), it can just include the annotation in the webhook pod mangling, but the updater can't do that on its own.

https://github.com/maxcao13/autoscaler/blob/maxcao13-inplace/vertical-pod-autoscaler/pkg/updater/eviction/pods_eviction_restriction.go#L544

Whether we want to use this new annotation or not is a different story though. It's purely for cosmetic reasons, as noted in this comment: #7673 (comment), but the vpaObservedContainers annotation is actually used in GetUpdatePriority. Curious what people think.

    verbs:
      - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:vpa-updater-in-place-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:vpa-updater-in-place
subjects:
  - kind: ServiceAccount
    name: vpa-updater
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:metrics-reader
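The review thread on the `pods` rule distinguishes two kinds of patch: one that changes container resources through the `resize` subresource, and one that writes an annotation on the pod object itself. The sketch below only builds the two JSON Patch bodies, to show why they target different resources and so need separate RBAC entries. It is an illustration, not the updater's actual code: the function names and the annotation key `example.com/in-place-updated` are hypothetical, and nothing is sent to an API server.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// patchOp is a single JSON Patch (RFC 6902) operation.
type patchOp struct {
	Op    string `json:"op"`
	Path  string `json:"path"`
	Value string `json:"value"`
}

// resizePatch sets the CPU request of one container. A body like this would
// be sent to the pod's "resize" subresource, which the pods/resize rule covers.
func resizePatch(containerIndex int, cpu string) ([]byte, error) {
	path := fmt.Sprintf("/spec/containers/%d/resources/requests/cpu", containerIndex)
	return json.Marshal([]patchOp{{Op: "add", Path: path, Value: cpu}})
}

// annotationPatch sets one annotation on the pod object itself, which needs
// "patch" on "pods". It assumes the pod already has an annotations map,
// because a JSON Patch "add" fails when the parent object is missing.
func annotationPatch(key, value string) ([]byte, error) {
	// JSON Pointer (RFC 6901) requires "~" and "/" inside a key to be escaped.
	escaped := strings.NewReplacer("~", "~0", "/", "~1").Replace(key)
	path := "/metadata/annotations/" + escaped
	return json.Marshal([]patchOp{{Op: "add", Path: path, Value: value}})
}

func main() {
	resize, err := resizePatch(0, "500m")
	if err != nil {
		panic(err)
	}
	// Hypothetical annotation key, used only for illustration.
	annotate, err := annotationPatch("example.com/in-place-updated", "true")
	if err != nil {
		panic(err)
	}
	fmt.Println("pods/resize:", string(resize))
	fmt.Println("pods:", string(annotate))
}
```

If the annotation were dropped, as the thread considers, only the first kind of patch would remain and the `pods` entry could presumably be removed from the role.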
1 change: 1 addition & 0 deletions vertical-pod-autoscaler/deploy/vpa-v1-crd-gen.yaml
@@ -458,6 +458,7 @@ spec:
- "Off"
- Initial
- Recreate
- InPlaceOrRecreate
- Auto
type: string
type: object
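The enum in this hunk now accepts `InPlaceOrRecreate` alongside the existing modes. As a rough sketch of what that means for a consumer of the field, the Go snippet below checks a mode string against the five values listed in the hunk. The helper `allowsInPlace` is hypothetical and encodes an assumption taken from the commit title "VPA: only allow in-place if explicitly set", namely that only the explicit `InPlaceOrRecreate` mode opts in; it is not the PR's implementation.

```go
package main

import "fmt"

// updateModes mirrors the enum values listed in the CRD hunk.
var updateModes = []string{"Off", "Initial", "Recreate", "InPlaceOrRecreate", "Auto"}

// isValidUpdateMode reports whether mode is one of the enum values.
func isValidUpdateMode(mode string) bool {
	for _, m := range updateModes {
		if m == mode {
			return true
		}
	}
	return false
}

// allowsInPlace is a hypothetical helper. It assumes in-place updates are
// attempted only when the mode is set explicitly to InPlaceOrRecreate.
func allowsInPlace(mode string) bool {
	return mode == "InPlaceOrRecreate"
}

func main() {
	for _, mode := range []string{"Auto", "InPlaceOrRecreate", "InPlace"} {
		fmt.Printf("%q valid=%t inPlace=%t\n", mode, isValidUpdateMode(mode), allowsInPlace(mode))
	}
}
```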
2 changes: 2 additions & 0 deletions vertical-pod-autoscaler/docs/flags.md
@@ -12,6 +12,7 @@ This document is auto-generated from the flag definitions in the VPA admission-c
| `--address` | ":8944" | The address to expose Prometheus metrics. |
| `--alsologtostderr` | | log to standard error as well as files (no effect when -logtostderr=true) |
| `--client-ca-file` | "/etc/tls-certs/caCert.pem" | Path to CA PEM file. |
| `--feature-gates` | | A set of key=value pairs that describe feature gates for alpha/experimental features. Options are: |
| `--ignored-vpa-object-namespaces` | | A comma-separated list of namespaces to ignore when searching for VPA objects. Leave empty to avoid ignoring any namespaces. These namespaces will not be cleaned by the garbage collector. |
| `--kube-api-burst` | 10 | QPS burst limit when making requests to Kubernetes apiserver |
| `--kube-api-qps` | 5 | QPS limit when making requests to Kubernetes apiserver |
@@ -135,6 +136,7 @@ This document is auto-generated from the flag definitions in the VPA updater cod
| `--eviction-rate-burst` | 1 | Burst of pods that can be evicted. |
| `--eviction-rate-limit` | | Number of pods that can be evicted per seconds. A rate limit set to 0 or -1 will disable |
| `--eviction-tolerance` | 0.5 | Fraction of replica count that can be evicted for update, if more than one pod can be evicted. |
| `--feature-gates` | | A set of key=value pairs that describe feature gates for alpha/experimental features. Options are: |
| `--ignored-vpa-object-namespaces` | | A comma-separated list of namespaces to ignore when searching for VPA objects. Leave empty to avoid ignoring any namespaces. These namespaces will not be cleaned by the garbage collector. |
| `--in-recommendation-bounds-eviction-lifetime-threshold` | 12h0m0s | Pods that live for at least that long can be evicted even if their request is within the [MinRecommended...MaxRecommended] range |
| `--kube-api-burst` | 10 | QPS burst limit when making requests to Kubernetes apiserver |
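The new `--feature-gates` flag is described as "a set of key=value pairs". The sketch below shows one way such a value could be parsed using only the Go standard library. It is not the parser the PR uses. It assumes the pairs are comma-separated and the values are booleans, as in the usual Kubernetes convention, and the gate name `InPlaceVerticalScaling` is taken from the commit titles above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseFeatureGates parses a string such as "GateA=true,GateB=false" into a
// map. The comma separator and boolean values are assumptions of this sketch.
func parseFeatureGates(s string) (map[string]bool, error) {
	gates := make(map[string]bool)
	for _, pair := range strings.Split(s, ",") {
		pair = strings.TrimSpace(pair)
		if pair == "" {
			continue // tolerate empty input and trailing commas
		}
		key, val, ok := strings.Cut(pair, "=")
		if !ok {
			return nil, fmt.Errorf("missing '=' in %q", pair)
		}
		enabled, err := strconv.ParseBool(strings.TrimSpace(val))
		if err != nil {
			return nil, fmt.Errorf("invalid value for %q: %w", key, err)
		}
		gates[strings.TrimSpace(key)] = enabled
	}
	return gates, nil
}

func main() {
	gates, err := parseFeatureGates("InPlaceVerticalScaling=true")
	if err != nil {
		panic(err)
	}
	fmt.Println("InPlaceVerticalScaling enabled:", gates["InPlaceVerticalScaling"])
}
```

A gate that is absent from the map reads as `false` here; a real feature-gate registry would typically fall back to a per-gate default instead.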
2 changes: 1 addition & 1 deletion vertical-pod-autoscaler/e2e/go.mod
@@ -14,7 +14,7 @@ require (
k8s.io/apimachinery v0.32.0
k8s.io/autoscaler/vertical-pod-autoscaler v1.2.1
k8s.io/client-go v0.32.0
k8s.io/component-base v0.32.0
k8s.io/component-base v0.32.2
k8s.io/klog/v2 v2.130.1
k8s.io/kubernetes v1.32.0
k8s.io/pod-security-admission v0.32.0
23 changes: 5 additions & 18 deletions vertical-pod-autoscaler/e2e/go.sum
@@ -58,8 +58,6 @@ github.com/docker/go-units v0.5.0 h1:69rxXcBk27SvSaaxTtLh/8llcHD8vYHT7WSdRZ/jvr4
github.com/docker/go-units v0.5.0/go.mod h1:fgPhTUdO+D/Jk86RDLlptpiXQzgHJF7gydDDbaIK4Dk=
github.com/dustin/go-humanize v1.0.1 h1:GzkhY7T5VNhEkwH0PVJgjz+fX1rhBrR7pRT3mDkpeCY=
github.com/dustin/go-humanize v1.0.1/go.mod h1:Mu1zIs6XwVuF/gI1OepvI0qD18qycQx+mFykh5fBlto=
github.com/emicklei/go-restful/v3 v3.11.0 h1:rAQeMHw1c7zTmncogyy8VvRZwtkmkZ4FxERmMY4rD+g=
github.com/emicklei/go-restful/v3 v3.11.0/go.mod h1:6n3XBCmQQb25CM2LCACGz8ukIrRry+4bhvbpWn3mrbc=
github.com/emicklei/go-restful/v3 v3.12.1 h1:PJMDIM/ak7btuL8Ex0iYET9hxM3CI2sjZtzpL63nKAU=
github.com/emicklei/go-restful/v3 v3.12.1/go.mod h1:6n3XBCmQQb25CM2LCACGz8ukIrRry+4bhvbpWn3mrbc=
github.com/euank/go-kmsg-parser v2.0.0+incompatible h1:cHD53+PLQuuQyLZeriD1V/esuG4MuU0Pjs5y6iknohY=
@@ -94,6 +92,8 @@ github.com/gogo/protobuf v1.3.2 h1:Ov1cvc58UF3b5XjBnZv7+opcTcQFZebYjWzi34vdm4Q=
github.com/gogo/protobuf v1.3.2/go.mod h1:P1XiOD3dCwIKUDQYPy72D8LYyHL2YPYrpS2s69NZV8Q=
github.com/golang-jwt/jwt/v4 v4.5.0 h1:7cYmW1XlMY7h7ii7UhUyChSgS5wUJEnm9uZVTGqOWzg=
github.com/golang-jwt/jwt/v4 v4.5.0/go.mod h1:m21LjoU+eqJr34lmDMbreY2eSTRJ1cv77w39/MY0Ch0=
github.com/golang/mock v1.6.0 h1:ErTB+efbowRARo13NNdxyJji2egdxLGQhRaY+DUumQc=
github.com/golang/mock v1.6.0/go.mod h1:p6yTPP+5HYm5mzsMV8JkE6ZKdX+/wYM6Hr+LicevLPs=
github.com/golang/protobuf v1.5.4 h1:i7eJL8qZTpSEXOPTxNKhASYpMn+8e5Q6AdndVa1dWek=
github.com/golang/protobuf v1.5.4/go.mod h1:lnTiLA8Wa4RWRcIUkrtSVa5nRhsEGBg48fD6rSs7xps=
github.com/google/btree v1.0.1 h1:gK4Kx5IaGY9CD5sPJ36FHiBJ6ZXl0kilRiiCj+jdYp4=
@@ -197,8 +197,8 @@ github.com/prometheus/common v0.61.0 h1:3gv/GThfX0cV2lpO7gkTUwZru38mxevy90Bj8YFS
github.com/prometheus/common v0.61.0/go.mod h1:zr29OCN/2BsJRaFwG8QOBr41D6kkchKbpeNH7pAjb/s=
github.com/prometheus/procfs v0.15.1 h1:YagwOFzUgYfKKHX6Dr+sHT7km/hxC76UB0learggepc=
github.com/prometheus/procfs v0.15.1/go.mod h1:fB45yRUv8NstnjriLhBQLuOUt+WW4BsoGhij/e3PBqk=
github.com/rogpeppe/go-internal v1.12.0 h1:exVL4IDcn6na9z1rAb56Vxr+CgyK3nn3O+epU5NdKM8=
github.com/rogpeppe/go-internal v1.12.0/go.mod h1:E+RYuTGaKKdloAfM02xzb0FW3Paa99yedzYV+kq4uf4=
github.com/rogpeppe/go-internal v1.13.1 h1:KvO1DLK/DRN07sQ1LQKScxyZJuNnedQ5/wKSR38lUII=
github.com/rogpeppe/go-internal v1.13.1/go.mod h1:uMEvuHeurkdAXX61udpOXGD/AzZDWNMNyH2VO9fmH0o=
github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
github.com/sirupsen/logrus v1.9.3 h1:dueUQJ1C2q9oE3F7wvmSGAaVtTmUizReu6fjN8uqzbQ=
github.com/sirupsen/logrus v1.9.3/go.mod h1:naHLuLoDiP4jHNo9R0sCBMtWGeIprob74mVsIT4qYEQ=
@@ -248,33 +248,24 @@ go.etcd.io/etcd/server/v3 v3.5.16 h1:d0/SAdJ3vVsZvF8IFVb1k8zqMZ+heGcNfft71ul9GWE
go.etcd.io/etcd/server/v3 v3.5.16/go.mod h1:ynhyZZpdDp1Gq49jkUg5mfkDWZwXnn3eIqCqtJnrD/s=
go.opentelemetry.io/auto/sdk v1.1.0 h1:cH53jehLUN6UFLY71z+NDOiNJqDdPRaXzTel0sJySYA=
go.opentelemetry.io/auto/sdk v1.1.0/go.mod h1:3wSPjt5PWp2RhlCcmmOial7AvC4DQqZb7a7wCow3W8A=
go.opentelemetry.io/contrib/instrumentation/github.com/emicklei/go-restful/otelrestful v0.42.0 h1:Z6SbqeRZAl2OczfkFOqLx1BeYBDYehNjEnqluD7581Y=
go.opentelemetry.io/contrib/instrumentation/github.com/emicklei/go-restful/otelrestful v0.42.0/go.mod h1:XiglO+8SPMqM3Mqh5/rtxR1VHc63o8tb38QrU6tm4mU=
go.opentelemetry.io/contrib/instrumentation/github.com/emicklei/go-restful/otelrestful v0.58.0 h1:jxGjjCJtwlI358D9adKjFFd4OcVOB0PGj12+uhHipAs=
go.opentelemetry.io/contrib/instrumentation/github.com/emicklei/go-restful/otelrestful v0.58.0/go.mod h1:6QoeBLOuf25Vw7HK5KK5ePHAWjVaNFDsqUXQAxUVwWI=
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.53.0 h1:9G6E0TXzGFVfTnawRzrPl83iHOAV7L8NJiR8RSGYV1g=
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.53.0/go.mod h1:azvtTADFQJA8mX80jIH/akaE7h+dbm/sVuaHqN13w74=
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.53.0 h1:4K4tsIXefpVJtvA/8srF4V4y0akAoPHkIslgAkjixJA=
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.53.0/go.mod h1:jjdQuTGVsXV4vSs+CJ2qYDeDPf9yIJV23qlIzBm73Vg=
go.opentelemetry.io/contrib/propagators/b3 v1.17.0 h1:ImOVvHnku8jijXqkwCSyYKRDt2YrnGXD4BbhcpfbfJo=
go.opentelemetry.io/contrib/propagators/b3 v1.17.0/go.mod h1:IkfUfMpKWmynvvE0264trz0sf32NRTZL4nuAN9AbWRc=
go.opentelemetry.io/contrib/propagators/b3 v1.33.0 h1:ig/IsHyyoQ1F1d6FUDIIW5oYpsuTVtN16AyGOgdjAHQ=
go.opentelemetry.io/otel v1.28.0 h1:/SqNcYk+idO0CxKEUOtKQClMK/MimZihKYMruSMViUo=
go.opentelemetry.io/otel v1.28.0/go.mod h1:q68ijF8Fc8CnMHKyzqL6akLO46ePnjkgfIMIjUIX9z4=
go.opentelemetry.io/contrib/propagators/b3 v1.33.0/go.mod h1:EsVYoNy+Eol5znb6wwN3XQTILyjl040gUpEnUSNZfsk=
go.opentelemetry.io/otel v1.33.0 h1:/FerN9bax5LoK51X/sI0SVYrjSE0/yUL7DpxW4K3FWw=
go.opentelemetry.io/otel v1.33.0/go.mod h1:SUUkR6csvUQl+yjReHu5uM3EtVV7MBm5FHKRlNx4I8I=
go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.28.0 h1:3Q/xZUyC1BBkualc9ROb4G8qkH90LXEIICcs5zv1OYY=
go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.28.0/go.mod h1:s75jGIWA9OfCMzF0xr+ZgfrB5FEbbV7UuYo32ahUiFI=
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.27.0 h1:qFffATk0X+HD+f1Z8lswGiOQYKHRlzfmdJm0wEaVrFA=
go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.27.0/go.mod h1:MOiCmryaYtc+V0Ei+Tx9o5S1ZjA7kzLucuVuyzBZloQ=
go.opentelemetry.io/otel/metric v1.28.0 h1:f0HGvSl1KRAU1DLgLGFjrwVyismPlnuU6JD6bOeuA5Q=
go.opentelemetry.io/otel/metric v1.28.0/go.mod h1:Fb1eVBFZmLVTMb6PPohq3TO9IIhUisDsbJoL/+uQW4s=
go.opentelemetry.io/otel/metric v1.33.0 h1:r+JOocAyeRVXD8lZpjdQjzMadVZp2M4WmQ+5WtEnklQ=
go.opentelemetry.io/otel/metric v1.33.0/go.mod h1:L9+Fyctbp6HFTddIxClbQkjtubW6O9QS3Ann/M82u6M=
go.opentelemetry.io/otel/sdk v1.28.0 h1:b9d7hIry8yZsgtbmM0DKyPWMMUMlK9NEKuIG4aBqWyE=
go.opentelemetry.io/otel/sdk v1.28.0/go.mod h1:oYj7ClPUA7Iw3m+r7GeEjz0qckQRJK2B8zjcZEfu7Pg=
go.opentelemetry.io/otel/trace v1.28.0 h1:GhQ9cUuQGmNDd5BTCP2dAvv75RdMxEfTmYejp+lkx9g=
go.opentelemetry.io/otel/trace v1.28.0/go.mod h1:jPyXzNPg6da9+38HEwElrQiHlVMTnVfM3/yv2OlIHaI=
go.opentelemetry.io/otel/trace v1.33.0 h1:cCJuF7LRjUFso9LPnEAHJDB2pqzp+hbO8eu1qqW2d/s=
go.opentelemetry.io/otel/trace v1.33.0/go.mod h1:uIcdVUZMpTAmz0tI1z04GoVSezK37CbGV4fr1f2nBck=
go.opentelemetry.io/proto/otlp v1.3.1 h1:TrMUixzpM0yuc/znrFTP9MMRh8trP93mkCiDVeXrui0=
@@ -288,8 +279,6 @@ go.uber.org/zap v1.27.0/go.mod h1:GB2qFLM7cTU87MWRP2mPIjqfIDnGu+VIO4V/SdhGo2E=
golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2/go.mod h1:djNgcEr1/C05ACkg1iLfiJU5Ep61QUkGW8qpdssI0+w=
golang.org/x/crypto v0.0.0-20191011191535-87dc89f01550/go.mod h1:yigFU9vqHzYiE8UmvKecakEJjdnWj3jj499lnFckfCI=
golang.org/x/crypto v0.0.0-20200622213623-75b288015ac9/go.mod h1:LzIPMQfyMNhhGPhUkYOs5KpL4U8rLKemX1yGLhDgUto=
golang.org/x/crypto v0.30.0 h1:RwoQn3GkWiMkzlX562cLB7OxWvjH1L8xutO2WoJcRoY=
golang.org/x/crypto v0.30.0/go.mod h1:kDsLvtWBEx7MV9tJOj9bnXsPbxwJQ6csT/x4KIN4Ssk=
golang.org/x/crypto v0.31.0 h1:ihbySMvVjLAeSH1IbfcRTkD/iNscyz8rGzjF/E5hV6U=
golang.org/x/crypto v0.31.0/go.mod h1:kDsLvtWBEx7MV9tJOj9bnXsPbxwJQ6csT/x4KIN4Ssk=
golang.org/x/exp v0.0.0-20240719175910-8a7402abbf56 h1:2dVuKD2vS7b0QIHQbpyTISPd0LeHDbnYEryqj5Q1ug8=
@@ -300,8 +289,6 @@ golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3/go.mod h1:t9HGtf8HONx5eT2rtn
golang.org/x/net v0.0.0-20190620200207-3b0461eec859/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20200226121028-0de0cce0169b/go.mod h1:z5CRVTTTmAJ677TzLLGU+0bjPO0LkuOLi4/5GtJWs/s=
golang.org/x/net v0.0.0-20201021035429-f5854403a974/go.mod h1:sp8m0HH+o8qH0wwXwYZr8TS3Oi6o0r6Gce1SSxlDquU=
golang.org/x/net v0.32.0 h1:ZqPmj8Kzc+Y6e0+skZsuACbx+wzMgo5MQsJh9Qd6aYI=
golang.org/x/net v0.32.0/go.mod h1:CwU0IoeOlnQQWJ6ioyFrfRuomB8GKF6KbYXZVyeXNfs=
golang.org/x/net v0.33.0 h1:74SYHlV8BIgHIFC/LrYkOGIwL19eTYXQ5wc6TBuO36I=
golang.org/x/net v0.33.0/go.mod h1:HXLR5J+9DxmrqMwG9qjGCxZ+zKXxBru04zlTvWlWuN4=
golang.org/x/oauth2 v0.24.0 h1:KTBBxWqUa0ykRPLtV69rRto9TLXcqYkeswu48x/gvNE=