Auto mode is the production end-state for attune. The operator continuously resizes all eligible pods based on observed metrics. Before enabling Auto mode, you should have validated recommendations through Recommend and/or Canary mode.
Before switching to Auto mode:
- Run in Recommend mode for at least one full history window (default 7 days) to build confidence in the recommendations.
- Verify that the recommendations are reasonable using the kubectl plugin:

      kubectl attune recommendations -n <namespace>

- Test with Canary mode (optional but recommended) to validate resizes on a subset of pods before the full fleet.
- Configure appropriate bounds to prevent extreme recommendations:

      cpu:
        minAllowed: "50m"     # never go below 50 millicores
        maxAllowed: "4000m"   # never exceed 4 cores
      memory:
        minAllowed: "64Mi"    # never go below 64 MiB
        maxAllowed: "8Gi"     # never exceed 8 GiB

A complete Auto mode policy with these bounds:

    apiVersion: attune.io/v1alpha1
    kind: AttunePolicy
    metadata:
      name: my-app
      namespace: production
    spec:
      targetRef:
        kind: Deployment
        selector:
          matchLabels:
            tier: api
      metricsSource:
        prometheus:
          address: http://prometheus-server.monitoring:80
        historyWindow: 168h  # 7 days of data
      cpu:
        percentile: 95
        overhead: "20"
        minAllowed: "50m"
        maxAllowed: "4000m"
        controlledValues: RequestsAndLimits
      memory:
        percentile: 99
        overhead: "30"
        minAllowed: "64Mi"
        maxAllowed: "8Gi"
        controlledValues: RequestsAndLimits
      updateStrategy:
        type: Auto
        cooldown: 1h
        autoRevert: true

| Setting | Purpose | Suggested value |
|---|---|---|
| `overhead` | Headroom above observed usage | 20% (CPU), 30% (memory) |
| `minAllowed` / `maxAllowed` | Prevent extreme recommendations | Match your resource limits policy |
| `cooldown` | Time between resizes | 1h minimum for production |
| `autoRevert` | Roll back if pods become unhealthy | `true` for production |

The safety monitor watches each resized pod for an observation period before
declaring the resize successful. The default is 5 minutes. To configure it,
set safetyObservationPeriod:

    spec:
      updateStrategy:
        type: Auto
        autoRevert: true
        safetyObservationPeriod: 10m  # safety watch window after each resize

- CPU: 20% overhead works well for steady-state services. Use 50% for bursty workloads.
- Memory: 30% overhead is recommended because memory pressure causes OOM kills. Never go below 10% for production.

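To make the interaction between `percentile`, `overhead`, and the `minAllowed`/`maxAllowed` bounds concrete, here is a minimal sketch. It assumes the recommendation is the chosen percentile of observed usage, increased by the overhead percentage, then clamped to the bounds. The operator's real formula may differ, and the function names are hypothetical.

```python
import math


def nearest_rank_percentile(samples, percentile):
    """Return the nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(percentile / 100 * len(ordered)))
    return ordered[rank - 1]


def recommend(samples, percentile, overhead_pct, min_allowed, max_allowed):
    """Percentile usage plus overhead, clamped to [min_allowed, max_allowed].

    All quantities share one unit (millicores for CPU, bytes for memory).
    This formula is an assumption made for illustration only.
    """
    usage = nearest_rank_percentile(samples, percentile)
    with_headroom = round(usage * (100 + overhead_pct) / 100)
    return max(min_allowed, min(max_allowed, with_headroom))


# Ten CPU usage samples in millicores, 100m through 1000m.
cpu_samples_m = list(range(100, 1001, 100))

# p95 is 1000m; 20% overhead gives 1200m, which is inside the 50m..4000m bounds.
print(recommend(cpu_samples_m, 95, 20, 50, 4000))

# A nearly idle container is lifted to the 50m floor instead of 12m.
print(recommend([10] * 10, 95, 20, 50, 4000))
```

Under these assumptions, raising the overhead for a bursty workload from 20 to 50 changes the first result from 1200m to 1500m, while the bounds only matter at the extremes.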
Check policy status, estimated savings, and recommendations with the kubectl plugin:

    # Overview of all policies
    kubectl attune status -A

    # Estimated savings
    kubectl attune savings -n production

    # Detailed per-container recommendations
    kubectl attune recommendations -n production

The operator sets a `Degraded` condition when 3 or more of the last 5 resizes are reverted.
Monitor this with:

    kubectl get attunepolicy -A -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}: {range .status.conditions[*]}{.type}={.reason} {end}{"\n"}{end}'

The operator exports metrics for dashboarding:

- `attune_recommendation_cpu_cores` -- Recommended CPU per workload
- `attune_recommendation_memory_bytes` -- Recommended memory per workload
- `attune_confidence` -- Confidence score (0-1) per workload
- `attune_resize_total` -- Total successful, failed, and reverted in-place resize operations
- `attune_eviction_total` -- Total eviction fallback attempts when `resizeMethod: InPlaceOrRecreate`
- `attune_reverts_total` -- Total reverts (broken down by reason)

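The `Degraded` rule described above (3 or more of the last 5 resizes reverted) is a sliding-window check. The sketch below illustrates that rule only. The class name is hypothetical, and it is not the operator's implementation.

```python
from collections import deque


class RevertTracker:
    """Track the most recent resize outcomes and flag a degraded state."""

    def __init__(self, window=5, threshold=3):
        self.threshold = threshold
        # Each entry is True if that resize was reverted; old entries fall off.
        self.outcomes = deque(maxlen=window)

    def record(self, reverted):
        self.outcomes.append(bool(reverted))

    @property
    def degraded(self):
        return sum(self.outcomes) >= self.threshold


tracker = RevertTracker()
for reverted in [True, False, True, False, True]:
    tracker.record(reverted)
print(tracker.degraded)  # three of the last five were reverted
```

Because the window is bounded, a run of successful resizes eventually pushes old reverts out and the condition clears in this model.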
Alert on high revert rates:

    - alert: AttuneHighRevertRate
      # rate() is a per-second average, so 0.1 is roughly 360 reverts per hour
      # per series; tune the threshold to your fleet size.
      expr: rate(attune_reverts_total[1h]) > 0.1
      for: 10m
      annotations:
        summary: "High revert rate for {{ $labels.namespace }}/{{ $labels.workload }}"

By default, resizes can occur at any time. Use the `schedule` field to restrict
resizes to specific time windows and days of the week. Recommendations are always
computed; only the actual resize execution is gated.

    spec:
      updateStrategy:
        type: Auto
        schedule:
          windows:
            - start: "02:00"
              end: "06:00"
          daysOfWeek: ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
          timezone: "America/New_York"

Key behavior:

- If `daysOfWeek` is omitted, all days are allowed.
- If `windows` is omitted, all times are allowed (only day filtering applies).
- Overnight windows work: `start: "22:00", end: "06:00"` wraps past midnight.
- The `ScheduleBlocked` status condition is set when outside the window.
- An invalid timezone name fails open (resizes are allowed) to prevent silent lockout from a typo.

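The scheduling rules above can be sketched as a small gate function. This is an illustration under stated assumptions, not the operator's code: window ends are treated as exclusive, the day check uses the current local day even inside an overnight window, and all function names are hypothetical.

```python
from datetime import datetime, time, timezone
from zoneinfo import ZoneInfo

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def in_window(now, start, end):
    """True if time-of-day `now` is in [start, end); start > end wraps midnight."""
    if start <= end:
        return start <= now < end
    return now >= start or now < end


def local_time_allowed(local_dt, windows=None, days_of_week=None):
    """Apply day and window filters to a datetime already in the policy timezone."""
    if days_of_week and DAY_NAMES[local_dt.weekday()] not in days_of_week:
        return False
    if not windows:
        return True  # no windows configured: only day filtering applies
    return any(in_window(local_dt.time(), start, end) for start, end in windows)


def resize_allowed(now_utc, tz_name, windows=None, days_of_week=None):
    """Gate a resize; an unknown timezone name fails open."""
    try:
        tz = ZoneInfo(tz_name)
    except (KeyError, ValueError):  # ZoneInfoNotFoundError subclasses KeyError
        return True
    return local_time_allowed(now_utc.astimezone(tz), windows, days_of_week)


overnight = [(time(22, 0), time(6, 0))]
# 23:30 local time falls inside the window that wraps past midnight.
print(local_time_allowed(datetime(2026, 6, 1, 23, 30), overnight))
# A misspelled timezone allows the resize rather than blocking it forever.
print(resize_allowed(datetime(2026, 6, 1, 12, 0, tzinfo=timezone.utc),
                     "America/New_Yrok", overnight))
```

Failing open trades strictness for availability: a typo in `timezone` lets resizes through at unintended hours, but it never silently disables them.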
Combine scheduling with budget caps for large fleets:

    spec:
      updateStrategy:
        type: Auto
        schedule:
          windows:
            - start: "02:00"
              end: "06:00"
        maxConcurrentResizes: 10
        maxTotalCpuIncrease: "2000m"
        maxTotalMemoryIncrease: "4Gi"

See examples/12-scheduled-auto-mode.yaml for a complete example.
If resizes are blocked unexpectedly, see the troubleshooting guide for schedule-specific diagnostics.
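As a rough mental model of how budget caps limit a batch of resizes, the sketch below greedily admits pending resizes until a cap would be exceeded. Everything about the accounting is an assumption made for illustration: that the caps apply per batch, that decreases do not consume the increase budget, and that a resize that does not fit is skipped rather than ending the batch. The function name is hypothetical.

```python
def admit_resizes(pending, max_concurrent, max_cpu_increase_m, max_mem_increase_bytes):
    """Return the names of resizes that fit within the caps.

    pending: list of (name, cpu_delta_millicores, memory_delta_bytes).
    Negative deltas (scale-downs) are assumed to cost nothing.
    """
    admitted = []
    cpu_used = 0
    mem_used = 0
    for name, cpu_delta, mem_delta in pending:
        if len(admitted) >= max_concurrent:
            break
        cpu_cost = max(0, cpu_delta)
        mem_cost = max(0, mem_delta)
        if cpu_used + cpu_cost > max_cpu_increase_m:
            continue  # would exceed the total CPU increase cap
        if mem_used + mem_cost > max_mem_increase_bytes:
            continue  # would exceed the total memory increase cap
        admitted.append(name)
        cpu_used += cpu_cost
        mem_used += mem_cost
    return admitted


GIB = 1024 ** 3
pending = [
    ("api-1", 500, 1 * GIB),
    ("api-2", 1000, 1 * GIB),
    ("api-3", 1000, 1 * GIB),   # would push CPU past 2000m
    ("worker-1", -200, 0),      # scale-down, no budget consumed
]
print(admit_resizes(pending, 10, 2000, 4 * GIB))
```

With the caps from the example policy (10 concurrent, 2000m CPU, 4Gi memory), this model defers `api-3` while the other three proceed.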
The export feature writes recommendation data to ConfigMaps for external
consumption (e.g., GitOps workflows with ArgoCD or Flux that apply resource
patches from CI/CD rather than letting the operator resize directly).

    spec:
      updateStrategy:
        type: Recommend  # or Auto
      export:
        configMap: true

When enabled, the operator creates one ConfigMap per workload, named
<policy>-<workload>-recommendations, with an owner reference to the policy
for automatic cleanup when the policy itself is deleted.
The ConfigMap contains per-container recommended CPU and memory values plus
a last-updated timestamp (RFC3339).
Example ConfigMap content:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: my-app-my-deployment-recommendations
      namespace: default
      labels:
        attune.io/policy: my-app
        attune.io/workload: my-deployment
    data:
      workload: my-deployment
      kind: Deployment
      main.cpu-request: "250m"
      main.memory-request: "512Mi"
      main.cpu-limit: "500m"
      main.memory-limit: "1Gi"
      main.confidence: "0.92"
      last-updated: "2026-05-29T14:30:00Z"

Inspect exports with the plugin (recommended over raw `kubectl get cm`):

    kubectl attune export list -n <ns>
    # or with last-updated and container counts across all ns
    kubectl attune export -A

Orphan cleanup: When a workload leaves the policy's selector (for example
after a selector change or workload deletion while the policy still exists),
the corresponding recommendation ConfigMap is automatically deleted on the
next reconcile. This prevents stale recommendation data from lingering for
GitOps consumers. Only ConfigMaps carrying the attune.io/policy label for
that specific policy are considered for cleanup.
Any ConfigMap in the policy's namespace whose attune.io/policy label names
this AttunePolicy is treated as owned by it for cleanup purposes. This is an
intentional part of the feature's trust model (label-based management within
the namespace).
This is useful in GitOps workflows where:
- The operator runs in Recommend mode to compute recommendations.
- A CI/CD pipeline reads the ConfigMaps and generates resource patches.
- ArgoCD or Flux applies the patches through the normal GitOps flow.
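The CI/CD step in this workflow can be sketched as a small transformation from the exported ConfigMap `data` into a Deployment patch. The key layout (`<container>.cpu-request` and so on) is taken from the example ConfigMap above. The function name and the choice of a strategic-merge-style patch document are assumptions for illustration.

```python
import json


def configmap_to_patch(data):
    """Build a Deployment patch from exported recommendation keys.

    Keys without a container prefix (workload, kind, last-updated) and
    non-resource keys such as <container>.confidence are ignored.
    """
    containers = {}
    for key, value in data.items():
        name, sep, field = key.rpartition(".")
        if not sep:
            continue
        parts = field.split("-")  # "cpu-request" -> ["cpu", "request"]
        if len(parts) != 2 or parts[1] not in ("request", "limit"):
            continue
        resource, kind = parts
        section = "requests" if kind == "request" else "limits"
        entry = containers.setdefault(name, {"requests": {}, "limits": {}})
        entry[section][resource] = value
    return {
        "spec": {"template": {"spec": {"containers": [
            {"name": name, "resources": resources}
            for name, resources in sorted(containers.items())
        ]}}}
    }


exported = {
    "workload": "my-deployment",
    "kind": "Deployment",
    "main.cpu-request": "250m",
    "main.memory-request": "512Mi",
    "main.cpu-limit": "500m",
    "main.memory-limit": "1Gi",
    "main.confidence": "0.92",
    "last-updated": "2026-05-29T14:30:00Z",
}
print(json.dumps(configmap_to_patch(exported), indent=2))
```

A pipeline could commit the resulting document to the Git repository as a patch file, leaving ArgoCD or Flux to roll it out. It could also check `main.confidence` against a threshold of its own before generating a patch.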
Switch a policy to Auto mode and enable automatic revert in the same patch:

    kubectl patch attunepolicy my-app --type merge \
      -p '{"spec":{"updateStrategy":{"type":"Auto","autoRevert":true}}}'

If the other update-strategy settings are already in place, change only the mode:

    kubectl patch attunepolicy my-app --type merge \
      -p '{"spec":{"updateStrategy":{"type":"Auto"}}}'

If Auto mode causes issues, switch back to Recommend immediately:

    kubectl patch attunepolicy my-app --type merge \
      -p '{"spec":{"updateStrategy":{"type":"Recommend"}}}'

This stops all future resizes. Already-resized pods keep their current resources until their next restart.