diff --git a/AGENTS.md b/AGENTS.md index a830e2de..c9b7b73f 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -19,6 +19,14 @@ - **Avoid vacuous substring assertions in printer/formatter tests.** When asserting a `label: value` line is emitted, match against the full `"label: value"` string (or a regex), not the bare value — bare values often collide with the resource's `Name` or other surrounding context in the fixture and pass even when the line is missing. - **Keep CRD enum docstrings consistent with the `+kubebuilder:validation:Enum` marker.** If the godoc says "empty matches both", either include `""` in the enum list so `field: ""` is accepted, or rephrase to "Omit to match both" so no one writes the explicit empty form. A docstring that invites a value the API server then rejects is a worse contract than either alternative. - **Qualify cross-CRD field references with the owning kind in docs.** In a CRD reference section, write `Task.spec.podOverrides.env` rather than bare `podOverrides.env` when describing a field that lives on a sibling CRD. A reader of one CRD's reference page should be able to locate the cited field without already knowing the layout of the others. +- **CRD admission validation must match runtime checks.** When a CRD field embeds an upstream Kubernetes type (e.g. `corev1.EnvVar`, `corev1.VolumeSource`) but the controller honors only a subset of variants at reconcile time, restrict the OpenAPI surface — either via `+kubebuilder:validation:XValidation` CEL rules on the field, or by defining a narrowed Kelos-specific type that embeds only the supported sub-shapes. Letting admission accept inputs the controller later rejects (e.g. `fieldRef`/`resourceFieldRef`/`fileKeyRef` on an MCP `env` entry) turns `kubectl apply` into a misleading success and defers failure to runtime. The same applies to `value` / `valueFrom` mutual exclusion and other invariants — express them as CEL so they fail at admission, not in a reconcile error. +- **Mirror upstream Kubernetes semantics when reusing upstream types.** When a CRD field is typed as a `corev1` shape, users will reasonably assume kubelet/pod-spec behavior. For example, an `EnvVar` with `valueFrom.secretKeyRef.optional: true` referring to a missing key means kubelet **skips the variable on the container**, not sets it to `""`. If the controller cannot match upstream behavior exactly, call out the divergence explicitly in the field godoc rather than silently surfacing a different value to the consumer. +- **Breaking CRD storage-shape changes need explicit upgrade notes in `release-note`.** When a schema change makes existing in-cluster objects fail to round-trip (e.g. a field changes from `map[string]string` to a list of structs), the release note must explicitly state that operators need to re-apply existing resources after the upgrade. Describing only the new YAML shape leaves cluster owners to discover the breakage at apply time. +- **Parameterize e2e tests across all supported agent types.** When adding an e2e test for shared agent functionality (Tasks, AgentConfigs, MCP plugins, skills, credential injection, etc.), wrap the `Describe` block in a function that takes an `agentTestConfig` and invoke it from `for _, cfg := range agentConfigs { describeXxx(cfg) }` — do not hard-code a single agent type. A test that only exercises `codex` (or only `claude-code`) silently lets the other agents drift. Single-agent tests are only appropriate when the behavior under test is genuinely agent-specific (e.g., a codex-only entrypoint flag); call out that scoping in the test's `Describe` text. +- **Prefer agent-native model shorthands in docs and examples.** When documenting `spec.model`, `taskTemplate.model`, or the `model:` config field, use the agent CLI's accepted shorthand (e.g., `sonnet`, `opus` for Claude Code) instead of — or alongside — the full versioned ID (`claude-sonnet-4-6`). The value is passed through to the agent container as `KELOS_MODEL` without validation, so both forms work, but shorthands age better than versioned IDs in user-facing examples. If you must show a versioned ID, pair it with the shorthand on the same line so readers know which form to prefer. +- **Wire new agent runtime images through the full stack in one PR.** Adding an agent image directory (Dockerfile + `kelos_entrypoint.sh`) is not enough to make the agent reachable: the same PR must also extend the `+kubebuilder:validation:Enum` on `Task.spec.type` and `TaskTemplate.type` (`api/v1alpha1/task_types.go`, `api/v1alpha1/taskspawner_types.go`), add a case in `internal/controller/job_builder.go` (`Build`, default-image constants, `apiKeyEnvVar`/`oauthEnvVar` credential mapping), update `docs/agent-image-interface.md`'s `KELOS_AGENT_TYPE` enumeration, and add at least a smoke e2e test that creates a Task with the new `type:`. Otherwise `kind load` succeeds, the image sits in the registry, and `kubectl apply` rejects the manifest with `unsupported agent type` — which reviewers will read as a half-shipped feature. If you genuinely want to ship the image artifact first, state that explicitly in the PR description and link the follow-up issue tracking the CRD/controller wiring, so "No API/CRD changes" doesn't get read as "feature-complete." +- **Pin third-party CLI installs in agent-image Dockerfiles.** Match the existing `ARG XXX_VERSION=N.N.N` + pinned-URL pattern used by `gemini/`, `codex/`, `opencode/`, and `cursor/` Dockerfiles. Do not use `curl … | bash` of an unversioned vendor installer (e.g. `curl https://vendor.tld/cli/install.sh | bash`); two rebuilds of the same commit will produce different binaries and break reproducibility. When the vendor only ships a "latest" installer, prefer a pinned tarball URL with `sha256sum -c` verification (see the Go install in `gemini/Dockerfile`); if even that is not yet available, file an issue tracking the eventual pin and call out the unpinned step in the PR description so reviewers don't have to rediscover it. +- **Propagate deferred `Close()`/`Flush()` errors via a named return.** A bare `defer bw.Flush()` (or `defer f.Close()`) silently discards the error, so a transient failure on the trailing flush drops the last buffered bytes without any signal to the caller. Use a named return and `defer func() { if cerr := bw.Flush(); err == nil { err = cerr } }()` whenever the deferred call returns an error that the function's contract should surface. Same applies to deferring `Close()` on a writer (logs, on-disk artifacts, network connections) where loss matters; bare `defer f.Close()` is only acceptable on readers or where the caller can recover from late-write failure. ## Key Makefile Targets - `make verify` — run all verification checks (lint, fmt, vet, etc.). diff --git a/self-development/agentconfig.yaml b/self-development/agentconfig.yaml index e3342d2d..6863c189 100644 --- a/self-development/agentconfig.yaml +++ b/self-development/agentconfig.yaml @@ -42,6 +42,14 @@ spec: - Avoid vacuous substring assertions in printer/formatter tests: when asserting a `label: value` line is emitted, match the full `"label: value"` string (or a regex), not the bare value — bare values frequently collide with the fixture's `Name` or surrounding context and pass even when the line is missing. - Keep CRD enum docstrings consistent with the `+kubebuilder:validation:Enum` marker: if the godoc says "empty matches both", either include `""` in the enum list so `field: ""` is accepted, or rephrase to "Omit to match both" so no one writes the explicit empty form. A docstring that invites a value the API server then rejects is a worse contract than either alternative. - Qualify cross-CRD field references with the owning kind in docs: in a CRD reference section, write `Task.spec.podOverrides.env` rather than bare `podOverrides.env` when describing a field that lives on a sibling CRD. A reader of one CRD's reference page should be able to locate the cited field without already knowing the layout of the others. + - CRD admission validation must match runtime checks: when a CRD field embeds an upstream Kubernetes type (e.g. `corev1.EnvVar`, `corev1.VolumeSource`) but the controller honors only a subset of variants at reconcile time, restrict the OpenAPI surface — either via `+kubebuilder:validation:XValidation` CEL rules on the field, or by defining a narrowed Kelos-specific type that embeds only the supported sub-shapes. Letting admission accept inputs the controller later rejects (e.g. `fieldRef`/`resourceFieldRef`/`fileKeyRef` on an MCP `env` entry) turns `kubectl apply` into a misleading success and defers failure to runtime. The same applies to `value` / `valueFrom` mutual exclusion and other invariants — express them as CEL so they fail at admission, not in a reconcile error. + - Mirror upstream Kubernetes semantics when reusing upstream types: when a CRD field is typed as a `corev1` shape, users will reasonably assume kubelet/pod-spec behavior. For example, an `EnvVar` with `valueFrom.secretKeyRef.optional: true` referring to a missing key means kubelet skips the variable on the container, not sets it to `""`. If the controller cannot match upstream behavior exactly, call out the divergence explicitly in the field godoc rather than silently surfacing a different value to the consumer. + - Breaking CRD storage-shape changes need explicit upgrade notes in `release-note`: when a schema change makes existing in-cluster objects fail to round-trip (e.g. a field changes from `map[string]string` to a list of structs), the release note must explicitly state that operators need to re-apply existing resources after the upgrade. Describing only the new YAML shape leaves cluster owners to discover the breakage at apply time. + - Parameterize e2e tests across all supported agent types: when adding an e2e test for shared agent functionality (Tasks, AgentConfigs, MCP plugins, skills, credential injection, etc.), wrap the `Describe` block in a function that takes an `agentTestConfig` and invoke it from `for _, cfg := range agentConfigs { describeXxx(cfg) }` — do not hard-code a single agent type. A test that only exercises `codex` (or only `claude-code`) silently lets the other agents drift. Single-agent tests are only appropriate when the behavior under test is genuinely agent-specific (e.g., a codex-only entrypoint flag); call out that scoping in the test's `Describe` text. + - Prefer agent-native model shorthands in docs and examples: when documenting `spec.model`, `taskTemplate.model`, or the `model:` config field, use the agent CLI's accepted shorthand (e.g., `sonnet`, `opus` for Claude Code) instead of — or alongside — the full versioned ID (`claude-sonnet-4-6`). The value is passed through to the agent container as `KELOS_MODEL` without validation, so both forms work, but shorthands age better than versioned IDs in user-facing examples. If you must show a versioned ID, pair it with the shorthand on the same line so readers know which form to prefer. + - Wire new agent runtime images through the full stack in one PR: adding an agent image directory (Dockerfile + `kelos_entrypoint.sh`) is not enough to make the agent reachable. The same PR must also extend the `+kubebuilder:validation:Enum` on `Task.spec.type` and `TaskTemplate.type`, add a case in `internal/controller/job_builder.go` (Build, default-image constants, `apiKeyEnvVar`/`oauthEnvVar` credential mapping), update `docs/agent-image-interface.md`'s `KELOS_AGENT_TYPE` enumeration, and add at least a smoke e2e test that creates a Task with the new `type:`. Otherwise `kind load` succeeds, the image sits in the registry, and `kubectl apply` rejects the manifest with `unsupported agent type`. If you genuinely want to ship the image artifact first, state that explicitly in the PR description and link the follow-up issue, so "No API/CRD changes" doesn't get read as "feature-complete." + - Pin third-party CLI installs in agent-image Dockerfiles: match the existing `ARG XXX_VERSION=N.N.N` + pinned-URL pattern used by `gemini/`, `codex/`, `opencode/`, and `cursor/` Dockerfiles. Do not use `curl … | bash` of an unversioned vendor installer; two rebuilds of the same commit will produce different binaries and break reproducibility. When the vendor only ships a "latest" installer, prefer a pinned tarball URL with `sha256sum -c` verification; if even that is not yet available, file an issue tracking the eventual pin and call out the unpinned step in the PR description. + - Propagate deferred `Close()`/`Flush()` errors via a named return: a bare `defer bw.Flush()` (or `defer f.Close()`) silently discards the error, so a transient failure on the trailing flush drops the last buffered bytes without any signal to the caller. Use a named return and `defer func() { if cerr := bw.Flush(); err == nil { err = cerr } }()` whenever the deferred call returns an error that the function's contract should surface. Same applies to deferring `Close()` on a writer where loss matters; bare `defer f.Close()` is only acceptable on readers or where the caller can recover from late-write failure. - PRs that only modify files under `self-development/` are internal agent improvements: use `/kind cleanup` and write "NONE" in the `release-note` block, even when the change fixes a bug or adds a feature in agent behavior. Classify by file location, not by problem nature. - TaskSpawner conventions (for `self-development/` YAML files): - Prefer webhook-based triggers (`githubWebhook`) over poll-based (`githubPullRequests`) for real-time event-driven tasks diff --git a/self-development/kelos-workers.yaml b/self-development/kelos-workers.yaml index ba26640b..c6538a35 100644 --- a/self-development/kelos-workers.yaml +++ b/self-development/kelos-workers.yaml @@ -41,6 +41,14 @@ spec: - Avoid vacuous substring assertions in printer/formatter tests: when asserting a `label: value` line is emitted, match the full `"label: value"` string (or a regex), not the bare value — bare values frequently collide with the fixture's `Name` or surrounding context and pass even when the line is missing. - Keep CRD enum docstrings consistent with the `+kubebuilder:validation:Enum` marker: if the godoc says "empty matches both", either include `""` in the enum list so `field: ""` is accepted, or rephrase to "Omit to match both" so no one writes the explicit empty form. A docstring that invites a value the API server then rejects is a worse contract than either alternative. - Qualify cross-CRD field references with the owning kind in docs: in a CRD reference section, write `Task.spec.podOverrides.env` rather than bare `podOverrides.env` when describing a field that lives on a sibling CRD. A reader of one CRD's reference page should be able to locate the cited field without already knowing the layout of the others. + - CRD admission validation must match runtime checks: when a CRD field embeds an upstream Kubernetes type (e.g. `corev1.EnvVar`, `corev1.VolumeSource`) but the controller honors only a subset of variants at reconcile time, restrict the OpenAPI surface — either via `+kubebuilder:validation:XValidation` CEL rules on the field, or by defining a narrowed Kelos-specific type that embeds only the supported sub-shapes. Letting admission accept inputs the controller later rejects (e.g. `fieldRef`/`resourceFieldRef`/`fileKeyRef` on an MCP `env` entry) turns `kubectl apply` into a misleading success and defers failure to runtime. The same applies to `value` / `valueFrom` mutual exclusion and other invariants — express them as CEL so they fail at admission, not in a reconcile error. + - Mirror upstream Kubernetes semantics when reusing upstream types: when a CRD field is typed as a `corev1` shape, users will reasonably assume kubelet/pod-spec behavior. For example, an `EnvVar` with `valueFrom.secretKeyRef.optional: true` referring to a missing key means kubelet skips the variable on the container, not sets it to `""`. If the controller cannot match upstream behavior exactly, call out the divergence explicitly in the field godoc rather than silently surfacing a different value to the consumer. + - Breaking CRD storage-shape changes need explicit upgrade notes in `release-note`: when a schema change makes existing in-cluster objects fail to round-trip (e.g. a field changes from `map[string]string` to a list of structs), the release note must explicitly state that operators need to re-apply existing resources after the upgrade. Describing only the new YAML shape leaves cluster owners to discover the breakage at apply time. + - Parameterize e2e tests across all supported agent types: when adding an e2e test for shared agent functionality (Tasks, AgentConfigs, MCP plugins, skills, credential injection, etc.), wrap the `Describe` block in a function that takes an `agentTestConfig` and invoke it from `for _, cfg := range agentConfigs { describeXxx(cfg) }` — do not hard-code a single agent type. A test that only exercises `codex` (or only `claude-code`) silently lets the other agents drift. Single-agent tests are only appropriate when the behavior under test is genuinely agent-specific (e.g., a codex-only entrypoint flag); call out that scoping in the test's `Describe` text. + - Prefer agent-native model shorthands in docs and examples: when documenting `spec.model`, `taskTemplate.model`, or the `model:` config field, use the agent CLI's accepted shorthand (e.g., `sonnet`, `opus` for Claude Code) instead of — or alongside — the full versioned ID (`claude-sonnet-4-6`). The value is passed through to the agent container as `KELOS_MODEL` without validation, so both forms work, but shorthands age better than versioned IDs in user-facing examples. If you must show a versioned ID, pair it with the shorthand on the same line so readers know which form to prefer. + - Wire new agent runtime images through the full stack in one PR: adding an agent image directory (Dockerfile + `kelos_entrypoint.sh`) is not enough to make the agent reachable. The same PR must also extend the `+kubebuilder:validation:Enum` on `Task.spec.type` and `TaskTemplate.type`, add a case in `internal/controller/job_builder.go` (Build, default-image constants, `apiKeyEnvVar`/`oauthEnvVar` credential mapping), update `docs/agent-image-interface.md`'s `KELOS_AGENT_TYPE` enumeration, and add at least a smoke e2e test that creates a Task with the new `type:`. Otherwise `kind load` succeeds, the image sits in the registry, and `kubectl apply` rejects the manifest with `unsupported agent type`. If you genuinely want to ship the image artifact first, state that explicitly in the PR description and link the follow-up issue, so "No API/CRD changes" doesn't get read as "feature-complete." + - Pin third-party CLI installs in agent-image Dockerfiles: match the existing `ARG XXX_VERSION=N.N.N` + pinned-URL pattern used by `gemini/`, `codex/`, `opencode/`, and `cursor/` Dockerfiles. Do not use `curl … | bash` of an unversioned vendor installer; two rebuilds of the same commit will produce different binaries and break reproducibility. When the vendor only ships a "latest" installer, prefer a pinned tarball URL with `sha256sum -c` verification; if even that is not yet available, file an issue tracking the eventual pin and call out the unpinned step in the PR description. + - Propagate deferred `Close()`/`Flush()` errors via a named return: a bare `defer bw.Flush()` (or `defer f.Close()`) silently discards the error, so a transient failure on the trailing flush drops the last buffered bytes without any signal to the caller. Use a named return and `defer func() { if cerr := bw.Flush(); err == nil { err = cerr } }()` whenever the deferred call returns an error that the function's contract should surface. Same applies to deferring `Close()` on a writer where loss matters; bare `defer f.Close()` is only acceptable on readers or where the caller can recover from late-write failure. - PRs that only modify files under `self-development/` are internal agent improvements: use `/kind cleanup` and write "NONE" in the `release-note` block, even when the change fixes a bug or adds a feature in agent behavior. Classify by file location, not by problem nature. - TaskSpawner conventions (for `self-development/` YAML files): - Prefer webhook-based triggers (`githubWebhook`) over poll-based (`githubPullRequests`) for real-time event-driven tasks