Description
Currently, when working with Kubernetes manifests or any YAML data structured as a list of records, there is no straightforward one-line command to compare corresponding fields across all records. A common use case is comparing the desired state (.spec.forProvider) with the observed state (.status.atProvider) for multiple resources, which are typically returned in a List kind manifest.
This issue asks for a more ergonomic solution, potentially a new feature or a documented pattern for yq and dyff, that allows a simple, non-scripted comparison of document pairs within a stream.
Motivation
In a GitOps or a declarative configuration workflow, it is crucial to quickly identify and report configuration drift. A controller might be tasked with reconciling a desired state (e.g., a database's requested version) with the actual state of the external resource (e.g., the provider's assigned version).
The current methods for comparing these values for multiple resources require a multi-line shell script with a loop, which is not ideal for ad-hoc diagnostics or use in short CI/CD steps. A one-line command would significantly improve the developer experience and operational efficiency for SREs and platform engineers.
General Case: Comparing lists of "Before" and "After" records
Consider a general YAML structure where you have a list of records, and each record contains a "before" and an "after" state.
apiVersion: v1
kind: List
items:
  - before:
      key1: valueA
      key2: valueB
    after:
      key1: valueA
      key2: valueC # This key is different
  - before:
      key1: valueX
      key2: valueY
    after:
      key1: valueX
      key2: valueY # These keys are the same

The goal is to produce a diff for each pair of before and after records, one after the other.
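To make the intended semantics concrete, here is a minimal Python sketch of the per-record comparison, using the example list above as inline data. The function names `diff_record` and `diff_pairs` are hypothetical and not part of yq or dyff, and the sketch only compares top-level keys rather than producing a full dyff-style report.

```python
from typing import Any, Iterator


def diff_record(before: dict[str, Any], after: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {key: (before_value, after_value)} for each top-level key that differs.

    A key missing on one side is reported with None on that side, so an
    explicit null and an absent key are treated as equal in this sketch.
    """
    changes = {}
    for key in sorted(before.keys() | after.keys()):
        old, new = before.get(key), after.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


def diff_pairs(items: list[dict[str, Any]]) -> Iterator[tuple[int, dict[str, tuple[Any, Any]]]]:
    """Yield (record index, changes) for every record, in input order."""
    for index, item in enumerate(items):
        yield index, diff_record(item.get("before", {}), item.get("after", {}))


# Same data as the `items` list in the YAML example.
items = [
    {"before": {"key1": "valueA", "key2": "valueB"},
     "after": {"key1": "valueA", "key2": "valueC"}},
    {"before": {"key1": "valueX", "key2": "valueY"},
     "after": {"key1": "valueX", "key2": "valueY"}},
]

for index, changes in diff_pairs(items):
    print(f"record {index}:", changes or "no differences")
```

Every record produces exactly one result, including records without differences, which matches the "one diff per pair, one after the other" behaviour requested here.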
Example with Kubernetes Custom Resources
Let's use a more concrete example with Kubernetes Custom Resources, where we want to compare .spec.forProvider against .status.atProvider.
Input Manifest (kubectl get databaseinstance -o yaml):
apiVersion: v1
kind: List
items:
  - apiVersion: database.example.org/v1alpha1
    kind: DatabaseInstance
    metadata:
      name: my-database-1-diff
    spec:
      forProvider:
        engineVersion: "14"
        storageGB: 20
    status:
      atProvider:
        engineVersion: "14.7" # Differs from spec.forProvider
        storageGB: 20
  - apiVersion: database.example.org/v1alpha1
    kind: DatabaseInstance
    metadata:
      name: my-database-2-no-diff
    spec:
      forProvider:
        engineVersion: "15"
        storageGB: 50
    status:
      atProvider:
        engineVersion: "15"
        storageGB: 50

Current Functional, but Multi-line Solution
The most reliable approach today uses a shell loop.
Command:
kubectl get databaseinstance -o yaml | yq -I=0 -o=json '.items[] | {"kind": .kind, "namespace": .metadata.namespace, "name": .metadata.name, "spec": .spec.forProvider, "status": .status.atProvider}' | while read -r item; do
  kind=$(echo "$item" | yq '.kind')
  namespace=$(echo "$item" | yq '.namespace')
  name=$(echo "$item" | yq '.name')
  spec=$(echo "$item" | yq '.spec')
  status=$(echo "$item" | yq '.status')
  echo "--- Comparing '-n=${namespace} ${kind} ${name}' ---"
  dyff between <(echo "$spec") <(echo "$status")
done

Resulting Diff Output:
--- Comparing '-n=null DatabaseInstance my-database-1-diff' ---
     _        __  __
   _| |_   _ / _|/ _|  between /tmp/sh-interp-15d00a1f8faa0072
 / _' | | | | |_| |_       and /tmp/sh-interp-10cf1222f3464435
| (_| | |_| |  _|  _|
 \__,_|\__, |_| |_|   returned one difference
        |___/

engineVersion
  ± value change
    - 14
    + 14.7

--- Comparing '-n=null DatabaseInstance my-database-2-no-diff' ---
     _        __  __
   _| |_   _ / _|/ _|  between /tmp/sh-interp-529cdc4cd471de1
 / _' | | | | |_| |_       and /tmp/sh-interp-8c59a373ac43b121
| (_| | |_| |  _|  _|
 \__,_|\__, |_| |_|   returned no differences
        |___/
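For comparison, the same report can be sketched without a shell loop in a small Python script. This is only an illustration under assumptions: it expects the JSON form of the manifest (as produced by `kubectl get databaseinstance -o json`), the names `walk_diff` and `drift_report` are hypothetical, and observed-only fields under `.status.atProvider` are reported as differences because the sketch makes no attempt to filter them.

```python
import json
from typing import Any, Iterator


def walk_diff(desired: Any, observed: Any, path: str = "") -> Iterator[tuple[str, Any, Any]]:
    """Yield (dotted path, desired, observed) for every differing leaf value."""
    if isinstance(desired, dict) and isinstance(observed, dict):
        for key in sorted(desired.keys() | observed.keys()):
            child = f"{path}.{key}" if path else key
            yield from walk_diff(desired.get(key), observed.get(key), child)
    elif desired != observed:
        yield path, desired, observed


def drift_report(manifest: dict[str, Any]) -> dict[str, list[tuple[str, Any, Any]]]:
    """Map 'kind/namespace/name' to the differences between forProvider and atProvider."""
    report = {}
    for item in manifest.get("items", []):
        meta = item.get("metadata", {})
        # Cluster-scoped or namespace-less items get "-" instead of a namespace.
        label = f"{item.get('kind')}/{meta.get('namespace', '-')}/{meta.get('name')}"
        desired = item.get("spec", {}).get("forProvider", {})
        observed = item.get("status", {}).get("atProvider", {})
        report[label] = list(walk_diff(desired, observed))
    return report


# Inline copy of the example manifest, in JSON form.
sample = json.loads("""
{"apiVersion": "v1", "kind": "List", "items": [
  {"kind": "DatabaseInstance", "metadata": {"name": "my-database-1-diff"},
   "spec": {"forProvider": {"engineVersion": "14", "storageGB": 20}},
   "status": {"atProvider": {"engineVersion": "14.7", "storageGB": 20}}},
  {"kind": "DatabaseInstance", "metadata": {"name": "my-database-2-no-diff"},
   "spec": {"forProvider": {"engineVersion": "15", "storageGB": 50}},
   "status": {"atProvider": {"engineVersion": "15", "storageGB": 50}}}
]}
""")

for label, differences in drift_report(sample).items():
    print(label, differences or "no differences")
```

Replacing the inline sample with `json.load(sys.stdin)` would let the script sit at the end of a pipe, but it is still a script rather than the one-liner this issue is asking for.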
The Challenge: A True One-liner
The core problem is that dyff between requires exactly two inputs. While yq can output multiple documents, it's not a pair-wise stream processor. A command like kubectl get ... | yq '.items[] | .spec.forProvider, .status.atProvider' | dyff between - fails because dyff receives a stream of 4 documents (spec1, status1, spec2, status2) and cannot pair them correctly.
A potential solution would be a new dyff mode, or a yq idiom, that lets dyff consume the input stream in a pair-wise fashion. This would enable a one-line command that is both efficient and readable.
This could be a valuable enhancement for both yq and dyff that addresses a common pain point in the Kubernetes and broader YAML ecosystem.
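The pair-wise stream processing described above can be sketched in a few lines of Python. This only illustrates the requested semantics and is not an existing dyff or yq feature; `pairwise_chunks` is a hypothetical name, and the strings stand in for parsed documents.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def pairwise_chunks(stream: Iterable[T]) -> Iterator[tuple[T, T]]:
    """Group a flat stream (spec1, status1, spec2, status2, ...) into consecutive pairs."""
    iterator = iter(stream)
    while True:
        chunk = tuple(islice(iterator, 2))
        if not chunk:
            return
        if len(chunk) != 2:
            # A trailing document without a partner cannot be compared.
            raise ValueError("stream has an odd number of documents")
        yield chunk[0], chunk[1]


# Four documents in the order the yq expression would emit them.
documents = ["spec1", "status1", "spec2", "status2"]
for desired, observed in pairwise_chunks(documents):
    print(f"compare {desired} with {observed}")
```

A tool working this way would run one comparison per pair and reject a stream with an odd number of documents, instead of treating the whole stream as a single input.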