self-development: distill recurring conventions from PR review feedback#786
self-development: distill recurring conventions from PR review feedback#786kelos-bot[bot] wants to merge 6 commits intomainfrom
Conversation
Add three new agent conventions from recent PR review feedback: 1. Per-TaskSpawner configuration should be CRD fields, not controller flags (PR #838 - gjkim42 review) 2. CRD API backward compatibility - never rename JSON field tags (PR #838 - P1 review finding) 3. Gate optional CRDs behind Helm values flags (PR #821 - PodMonitoring broke installs on clusters without monitoring.googleapis.com) Also includes previously proposed conventions from PR #786: - Consistent guidance across surfaces - Provider-agnostic API design - Idiomatic Helm values - Deploy-dev workflow sync - Controller-driven migration - Release note user action requirements Co-Authored-By: Claude Opus 4.6 <[email protected]>
bcdeef3 to
7ade17e
Compare
|
/kelos pick-up |
|
🤖 Kelos Task Status Task |
|
🤖 Kelos Agent @gjkim42 /kelos needs-input Rebased on origin/main (was already up to date). Added The only failing CI check is Ready for re-review. |
7ade17e to
782ce1c
Compare
0871329 to
6379545
Compare
6f6241c to
41ed5c4
Compare
41ed5c4 to
5f6baa8
Compare
5f6baa8 to
675ac75
Compare
eb7fce3 to
f472430
Compare
…eedback Adds conventions learned from recent PR reviews: 1. Fail fast on invalid configuration (PR #971): three P1 and four P2 issues flagged silent degradation when credentials or config were invalid, falling back to unauthenticated requests instead of erroring. 2. No manual PR branch checkout in TaskSpawner prompts (PR #974): Kelos already checks out the PR branch automatically; manual checkout instructions are redundant and confusing. Also carries forward the previously proposed changes from PR #786: - os.Getenv() secret-in-flag-defaults convention (PR #971) - TaskSpawner creation conventions (PR #965) - Branch template variable documentation fix (PR #965) Co-Authored-By: Claude Opus 4.6 <[email protected]>
1d88c65 to
0d360ce
Compare
Greptile SummaryThis PR distills eight recurring conventions surfaced by recent PR review feedback and propagates them across
Confidence Score: 5/5Documentation and configuration-only changes; no runtime Go code is modified. The spawner fixes are conservative and correct. All changes are YAML configuration and Markdown. The branch-fallback fix in
|
| Filename | Overview |
|---|---|
| AGENTS.md | Adds nine new engineering conventions covering secrets handling, fail-fast config, API surface, backward compatibility, docs accuracy, Gomega polling, CLI errors, happy-path testing, and substring assertions. |
| self-development/agentconfig.yaml | Propagates all new conventions from AGENTS.md into the shared AgentConfig, including the TaskSpawner convention block with the corrected branch fallback form. |
| self-development/kelos-reviewer.yaml | Branch field fixed from bare {{.Branch}} to safe fallback form; reviewer checklist expanded with new correctness, tests, security, docs-accuracy, and project-convention items. |
| self-development/kelos-squash-commits.yaml | Adds commentOn: PullRequest guard to prevent firing on plain issues; branch field still uses bare {{.Branch}} rather than the safe fallback form being documented in this PR. |
| self-development/kelos-workers.yaml | Propagates new conventions from AGENTS.md into the worker AgentConfig spec, including the TaskSpawner conventions block. |
| self-development/kelos-api-reviewer.yaml | Extends the API compatibility checklist with three new bullets covering minimal API surface, backward compatibility (kind changes, validation tightening, deprecation-first), and stale manifest sweep. |
| self-development/README.md | Corrects the {{.Branch}} description from 'Usually PR head branch' to 'PR head branch; empty for issue events' to match runtime behaviour. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[issue_comment / pull_request_review webhook] --> B{Spawner filter}
B -->|kelos-reviewer\nissue_comment or pr_review| C[Branch fallback\nwith index Branch\nelse main]
B -->|kelos-squash-commits\nissue_comment + commentOn:PullRequest\nor pull_request_review| D[branch: .Branch\nalways populated\nPR-only context]
B -->|kelos-workers\nissue_comment + commentOn:Issue| E[branch: kelos-task-N\nstatic, no fallback needed]
C --> F[Reviewer task\ngh pr review]
D --> G[Squash task\ngit push --force-with-lease]
E --> H[Worker task\ncreate/update PR]
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
self-development/kelos-squash-commits.yaml:55
The `branch` field in `kelos-squash-commits.yaml` still uses the bare `{{.Branch}}` form, which conflicts with the TaskSpawner convention being added in this very PR. While the new `commentOn: PullRequest` guard makes this functionally safe (branch is always populated from a PR), the convention text added to `agentconfig.yaml` and `kelos-workers.yaml` prescribes the `{{with index . "Branch"}}{{.}}{{else}}main{{end}}` form when the spawner listens on `issue_comment` — and the `pull_request_review` filter also exists here, making consistent defensive templating worthwhile.
```suggestion
branch: '{{with index . "Branch"}}{{.}}{{else}}main{{end}}'
```
Reviews (7): Last reviewed commit: "self-development: add CLI error message ..." | Re-trigger Greptile
7aff731 to
6a74ec3
Compare
Adds five conventions distilled from recent PR reviews and applies them across AGENTS.md, the shared and worker AgentConfigs, and the reviewer prompts. 1. Never use os.Getenv() for secrets as Go flag defaults (PR #971) — flag prints defaults in --help output, leaking secret values. 2. Fail fast on invalid configuration (PR #971) — do not silently fall back to unauthenticated/degraded behavior when credentials or config are missing. 3. Keep API surfaces minimal (PR #902) — only fields immediately needed, no speculative additions; API is hard to change once shipped. 4. Docs must match implementation, not aspiration (PRs #1035, #1056) — describe only what the code actually does; verify enforcement before documenting a contract. 5. TaskSpawner conventions (PRs #965, #974): - Prefer webhook-based triggers over poll-based. - {{.Branch}} is empty for issue-only events; use the {{if .Branch}}{{.Branch}}{{else}}main{{end}} fallback form. - issue_comment fires for both issues and PRs; design prompts to detect and handle both contexts. - Do not include manual PR branch checkout instructions — Kelos checks out the PR branch automatically. Also fixes the kelos-reviewer TaskSpawner branch field to use the safe fallback form (was using bare {{.Branch}}, which is empty for issue events; flagged P2 in PR #786 review). Co-Authored-By: Claude Opus 4.7 <[email protected]>
|
You're iterating quickly on this pull request. To help protect your rate limits, cubic has paused automatic reviews on new pushes for now—when you're ready for another review, comment |
Adds five conventions distilled from recent PR reviews and applies them across AGENTS.md, the shared and worker AgentConfigs, and the reviewer prompts. 1. Never use os.Getenv() for secrets as Go flag defaults (PR #971) — flag prints defaults in --help output, leaking secret values. 2. Fail fast on invalid configuration (PR #971) — do not silently fall back to unauthenticated/degraded behavior when credentials or config are missing. 3. Keep API surfaces minimal (PR #902) — only fields immediately needed, no speculative additions; API is hard to change once shipped. 4. Docs must match implementation, not aspiration (PRs #1035, #1056) — describe only what the code actually does; verify enforcement before documenting a contract. 5. TaskSpawner conventions (PRs #965, #974): - Prefer webhook-based triggers over poll-based. - {{.Branch}} is empty for issue-only events; use the {{if .Branch}}{{.Branch}}{{else}}main{{end}} fallback form. - issue_comment fires for both issues and PRs; design prompts to detect and handle both contexts. - Do not include manual PR branch checkout instructions — Kelos checks out the PR branch automatically. Also fixes the kelos-reviewer TaskSpawner branch field to use the safe fallback form (was using bare {{.Branch}}, which is empty for issue events; flagged P2 in PR #786 review). Co-Authored-By: Claude Opus 4.7 <[email protected]>
PR #1058 surfaced a recurring API-change pattern not covered by the existing "Keep API surfaces minimal" rule: - A scalar -> array kind change on an existing CRD field (BodyContains) was flagged P1 because existing in-cluster resources would fail structural-schema validation; the repo itself had ~9 stale YAMLs in self-development/ and examples/ still using the old scalar form. - The maintainer asked twice about backward compatibility on a newly added BodyPattern field, and required removing +kubebuilder:validation:MinLength=1 because it would reject existing YAMLs that don't yet have the field. - The agreed migration path was to add new fields and mark the old one +deprecated rather than remove it. Adds the convention to AGENTS.md (and CLAUDE.md via symlink), to the shared agentconfig.yaml and kelos-workers.yaml, and to the kelos-api-reviewer prompt as an explicit checklist item under "API compatibility and evolution". Co-Authored-By: Claude Opus 4.7 <[email protected]>
…ing convention Greptile flagged a P2 inconsistency on PR #786 itself: the convention text in agentconfig.yaml and kelos-workers.yaml prescribed {{if .Branch}}{{.Branch}}{{else}}main{{end}}, but the canonical form used in the spawner YAMLs (kelos-reviewer.yaml, kelos-api-reviewer.yaml) is {{with index . "Branch"}}{{.}}{{else}}main{{end}}. The new "Docs must match implementation" rule applies to itself — fix the documented example to match the deployed form. PR #1084 fixed an e2e flake caused by GetTaskPhase calling the global Expect(err).NotTo(HaveOccurred()) inside an Eventually polling loop; Gomega does not retry on Expect failures, so a transient API error short-circuits the poller. Both Greptile and cubic-dev-ai flagged the same anti-pattern. Capture the lesson in AGENTS.md, the shared and worker AgentConfigs, and add a reviewer checklist item so future WaitFor* helpers either return a zero-value on error or use the Eventually(func(g Gomega) { ... }) form.
2d628c4 to
bdbd438
Compare
|
🤖 Kelos Agent @gjkim42 Pushed two new updates:
Branch was rebased on |
…y PRs PR #1100 was an agent-generated PR that modified only self-development/ files but used /kind bug. Greptile flagged it P2 against the existing CLAUDE.md/AGENTS.md rule: PRs that only modify files under self-development/ are internal agent improvements — use /kind cleanup and write NONE in the release-note block. The rule lives in the project's AGENTS.md but was not echoed into the agent's own agentsMD (where most other PR conventions are duplicated) nor surfaced in the reviewer's project-conventions checklist. So the worker missed it at PR-creation time, and the kelos-reviewer pass did not catch it before merge. Echo the rule into the shared and worker AgentConfigs, and add an explicit reviewer checklist item with a P2 severity hint so a /review pass on a self-development/-only PR will flag a wrong /kind label. The framing emphasizes "classify by file location, not by problem nature" — the failure mode in PR #1100 was the agent picking the label from the change's semantic intent (recursion-prevention bug fix) rather than from the directory scope.
|
🤖 Kelos Agent @gjkim42 Pushed one new commit (cea8c90):
Branch was rebased on |
Adds commentOn: PullRequest to the issue_comment filter in
kelos-squash-commits.yaml. Without this guard, /kelos squash-commits
typed on a plain issue triggers the spawner with empty {{.Branch}};
the prompt then runs git push --force-with-lease against whatever
branch fallback Kelos uses, which is unsafe for a destructive
operation. This matches the existing pattern in
kelos-pr-responder.yaml and is the same convention captured in the
shared agentconfig.
Reported by Greptile P2 on PR #786 ("kelos-squash-commits.yaml
(outside diff) still uses bare {{.Branch}} without a PR-only filter
and should be updated in a follow-up").
|
🤖 Kelos Agent @gjkim42 Pushed one new commit (dcc3fff): Sweeps the new convention into Branch is one commit behind |
Distills three additional conventions from review feedback on recent PRs (#1101, #1106, #1096): - CLI error messages must name the failing resource (PR #1101 P2) - Tests must cover the happy path, not only early-return guards (PR #1096 P3, flagged by both kelos-bot and greptile-apps) - Avoid vacuous substring assertions in printer/formatter tests (PR #1106 P2) Applied to AGENTS.md (and via symlink CLAUDE.md), agentconfig.yaml, kelos-workers.yaml, and the reviewer prompt's Tests / Code quality checklists.
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
Distills eleven recurring conventions from recent PR review feedback and propagates them across
AGENTS.md, the shared and workerAgentConfigs, and the reviewer prompts. All changes are backed by specific PR review findings — no speculative rules.1. Never use
os.Getenv()for secrets as Goflagdefaultscmd/kelos-webhook-server/main.goandcmd/ghproxy/main.gowhereos.Getenv("GITHUB_TOKEN")was used asflag.StringVardefault, leaking the secret in--helpoutput.AGENTS.md,agentconfig.yaml,kelos-workers.yaml,kelos-reviewer.yaml(new security checklist item).2. Fail fast on invalid configuration
AGENTS.md,agentconfig.yaml,kelos-workers.yaml,kelos-reviewer.yaml(new correctness checklist item).3. Keep API surfaces minimal
AGENTS.md,agentconfig.yaml,kelos-workers.yaml,kelos-api-reviewer.yaml(new compatibility checklist item).4. API changes must preserve backward compatibility for existing manifests
bodyContains: "/kelos"(scalar) would fail Kubernetes structural-schema validation after the CRD update; the repo itself had at least 9 YAMLs inself-development/andexamples/still using the old scalar form.+kubebuilder:validation:MinLength=1allow empty BodyPattern?" and "I just want to make sure that the existing YAMLs that don't have bodyPattern (because it's a newly introduced field) can be applied. can we just remove minLength here?"+deprecatedrather than remove it ("Why don't we add BodyPattern and ExcludeBodyPatterns and mark BodyContains deprecated here so that I can remove BodyContains later?").AGENTS.md,agentconfig.yaml,kelos-workers.yaml,kelos-api-reviewer.yaml(two new bullets under "API compatibility and evolution": one on validation tightening / kind changes, one on sweeping stale YAMLs).5. Docs must match implementation, not aspiration
GenericSourcebranch ininternal/webhook/handler.gonever reads<SOURCE>_WEBHOOK_SECRET, inspectsX-Hub-Signature-256, or callsvalidateHMACSignature— giving users a false sense of security.PodOverrides.Enventries reusing reserved names are dropped so the built-in always wins, but the actual filter only drops collisions against names already populated onmainContainer.Env.<@USER_ID>mentions are stripped before matchingexcludePatterns, butstripLeadingMentionswas never implemented — anchored regexes silently failed against mention-prefixed messages.AGENTS.md,agentconfig.yaml,kelos-workers.yaml,kelos-reviewer.yaml(new "Documentation accuracy" checklist subsection).6. TaskSpawner conventions
{{.Branch}}for issue events; designissue_commentprompts for both issue and PR contexts. PR Add API contract validation example (example 12) #974 — P2: "Remove the manual PR branch checkout instruction; Kelos already handles PR branch checkout automatically."agentconfig.yaml,kelos-workers.yaml. README template-variable table corrected from "Usually PR head branch" to "PR head branch; empty for issue events".7. Don't use Gomega's global
Expect()insideEventuallypolling blocksGetTaskPhasehelper calledExpect(err).NotTo(HaveOccurred()), which propagates the panic and short-circuitsEventually's polling on the first transient API error instead of retrying. OtherWaitFor*helpers in the framework already guard against this by returning a zero-value on error. The fix was to introduceWaitForTaskPhasethat inlines the API call and returns""on error.AGENTS.md,agentconfig.yaml,kelos-workers.yaml,kelos-reviewer.yaml(new bullet under the reviewer's Tests checklist).8.
/kind cleanupforself-development/-only PRs (new in latest commit)self-development/but used/kind bugbecause the change was framed as a recursion-prevention bug fix. CLAUDE.md/AGENTS.md (line 34) already states the rule, but it was not echoed into the agent'sagentsMD(where the other PR conventions are duplicated) nor in the reviewer's project-conventions checklist, so the worker missed it at PR-creation time and the kelos-reviewer pass did not flag it before merge.agentconfig.yaml,kelos-workers.yaml(echoed under Project Conventions);kelos-reviewer.yaml(new project-conventions checklist item with explicit P2 severity hint and the framing "classify by file location, not by problem nature").9. CLI error messages must name the resource (new in latest commit)
internal/cli/run.go:371: "The error string\"task failed\"omits the task name, so when Cobra prints the error in a pipeline (Error: task failed) it is ambiguous — especially if tooling invokeskelos runfor multiple named tasks. Includingnamegives operators an actionable signal." The fixfmt.Errorf("task %s failed", name)was accepted.AGENTS.md,agentconfig.yaml,kelos-workers.yaml,kelos-reviewer.yaml(new bullet under the reviewer's Code quality checklist).10. Test the happy path, not only early-return guards (new in latest commit)
kelos-bot[bot](P3) andgreptile-apps[bot]oninternal/slack/handler_test.go:352.kelos-bot: "Both new tests exercise only the early-return branches (non-bot user, empty message). The success path — thatPostMessageContextis invoked withevt.Channelandh.joinMessagewhen the bot itself joins — has no coverage. … Injecting a small posting seam (interface/func field) or a fake HTTP transport forgoslack.Clientwould let a positive test verify the call args." Sharpens the existing "always try to add or improve tests" rule by being specific: the agent has a recurring pattern of writing only the no-op branch tests because they're trivial without dependency injection, and skipping the happy path because the production code lacks a seam. The rule now requires building the seam.AGENTS.md,agentconfig.yaml,kelos-workers.yaml,kelos-reviewer.yaml(new bullet under the reviewer's Tests checklist).11. Avoid vacuous substring assertions in printer/formatter tests (new in latest commit)
internal/cli/printer_test.go:619: "The\"sentry\"check passes vacuously because the fixture'sNameis\"sentry-fixer\", which already contains the substring. If theWebhook Sourceline were accidentally omitted, this assertion would still pass, giving a false sense of coverage. A combined check for the full label + value string is the correct way to verify the field is present with the right value." The agent maintains a wide family ofprintTaskSpawner*,printTask*,printWorkspace*tests that all usestrings.Contains(output, ...)and is at risk of the same trap (Name contains the value) recurring.AGENTS.md,agentconfig.yaml,kelos-workers.yaml,kelos-reviewer.yaml(new bullet under the reviewer's Tests checklist).Also fixes the
kelos-reviewerTaskSpawner branch field to use the safe{{with index . "Branch"}}{{.}}{{else}}main{{end}}fallback form instead of bare{{.Branch}}. Flagged P2 by Greptile in this PR's prior review (#786): the spawner listens onissue_comment, so a/kelos reviewtriggered on a plain issue would land on whatever default state Kelos falls back to rather thanmain.Aligns the documented Branch fallback form with the canonical form (addresses Greptile P2 on this PR): the convention text in
agentconfig.yaml:41andkelos-workers.yaml:40previously prescribed{{if .Branch}}{{.Branch}}{{else}}main{{end}}, but the spawner YAMLs all use{{with index . "Branch"}}{{.}}{{else}}main{{end}}. The new "Docs must match implementation" rule applies to itself — updated the documented example to match the deployed form.Sweeps the convention into
kelos-squash-commits.yaml(addresses second Greptile P2 on this PR): the prior review noted thatkelos-squash-commits.yamllistens onissue_commentwithoutcommentOn: PullRequestand uses bare{{.Branch}}, so/kelos squash-commitstyped on a plain issue would fire the spawner with empty{{.Branch}}— and the prompt then runsgit push --force-with-lease, which is unsafe to execute against an arbitrary branch fallback. Fixed by addingcommentOn: PullRequeston the issue_comment filter, matching the pattern inkelos-pr-responder.yaml.Which issue(s) this PR is related to:
N/A
Special notes for your reviewer:
This PR has been rebased onto current main. All changes are documentation/configuration; no runtime code changes.
Does this PR introduce a user-facing change?