@@ -143,8 +143,8 @@ is `100m`, the number of replicas will be doubled, since
 \\( {200.0 \div 100.0} = 2.0 \\).
 If the current value is instead `50m`, you'll halve the number of
 replicas, since \\( {50.0 \div 100.0} = 0.5 \\). The control plane skips any scaling
-action if the ratio is sufficiently close to 1.0 (within a globally-configurable
-tolerance, 0.1 by default).
+action if the ratio is sufficiently close to 1.0 (within a
+[configurable tolerance](#tolerance), 0.1 by default).

 When a `targetAverageValue` or `targetAverageUtilization` is specified,
 the `currentMetricValue` is computed by taking the average of the given
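The ratio-and-tolerance rule in the hunk above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual controller code: the function name is invented, and it ignores details the real control plane handles (Pod readiness, missing metrics, min/max replica bounds).

```python
import math

def desired_replicas(current_replicas: int, current_value: float,
                     target_value: float, tolerance: float = 0.1) -> int:
    """Hypothetical sketch of the HPA scaling rule described above."""
    ratio = current_value / target_value
    # Skip scaling when the ratio is sufficiently close to 1.0
    # (within the tolerance, 0.1 by default).
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)

print(desired_replicas(2, 200.0, 100.0))  # ratio 2.0 -> replicas doubled: 4
print(desired_replicas(4, 50.0, 100.0))   # ratio 0.5 -> replicas halved: 2
print(desired_replicas(3, 105.0, 100.0))  # ratio 1.05 within tolerance: stays 3
```

The third call shows the tolerance band in action: a ratio of 1.05 is within 0.1 of 1.0, so no scaling occurs.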
@@ -388,9 +388,10 @@ to configure separate scale-up and scale-down behaviors.
 You specify these behaviours by setting `scaleUp` and/or `scaleDown`
 under the `behavior` field.

-You can specify a _stabilization window_ that prevents [flapping](#flapping)
-the replica count for a scaling target. Scaling policies also let you control the
-rate of change of replicas while scaling.
+Scaling policies let you control the rate of change of replicas while scaling.
+In addition, two settings can be used to prevent [flapping](#flapping): you can
+specify a _stabilization window_ for smoothing replica counts, and a tolerance to
+ignore minor metric fluctuations below a specified threshold.

 ### Scaling policies

@@ -452,6 +453,32 @@ interval. In the above example, all desired states from the past 5 minutes will
 This approximates a rolling maximum, and avoids having the scaling algorithm frequently
 remove Pods only to trigger recreating an equivalent Pod just moments later.

+### Tolerance {#tolerance}
+
+{{< feature-state feature_gate_name="HPAConfigurableTolerance" >}}
+
+The `tolerance` field configures a threshold for metric variations, preventing the
+autoscaler from scaling for changes below that value.
+
+This tolerance is defined as the amount of variation around the desired metric value under
+which no scaling will occur. For example, consider a HorizontalPodAutoscaler configured
+with a target memory consumption of 100MiB and a scale-up tolerance of 5%:
+
+```yaml
+behavior:
+  scaleUp:
+    tolerance: 0.05 # 5% tolerance for scale up
+```
+
+With this configuration, the HPA algorithm will only consider scaling up if the memory
+consumption is higher than 105MiB (that is: 5% above the target).
+
+If you don't set this field, the HPA applies the default cluster-wide tolerance of 10%. This
+default can be updated for both scale-up and scale-down using the
+[kube-controller-manager](/docs/reference/command-line-tools-reference/kube-controller-manager/)
+`--horizontal-pod-autoscaler-tolerance` command line argument. (You can't use the Kubernetes API
+to configure this default value.)
+
 ### Default Behavior

 To use the custom scaling not all fields have to be specified. Only values which need to be
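The per-direction tolerance introduced in the hunk above can also be sketched briefly. This is a hypothetical helper under the section's assumptions (target 100MiB, scale-up tolerance 5%), not the controller's implementation, and it ignores averaging across Pods and the stabilization window.

```python
def should_scale_up(current: float, target: float,
                    scale_up_tolerance: float) -> bool:
    """Hypothetical check: scale up only when the current metric value
    exceeds the target by more than the scale-up tolerance."""
    return current / target > 1.0 + scale_up_tolerance

print(should_scale_up(104.0, 100.0, 0.05))  # False: 104MiB is within 5% of 100MiB
print(should_scale_up(106.0, 100.0, 0.05))  # True: 106MiB exceeds the 105MiB threshold
```

With a `scaleDown` tolerance, the analogous check would compare the ratio against `1.0 - scale_down_tolerance` instead.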