Prompt Tuning: kelos-self-update Prompt Tuning area uses gh pr/issue list defaults that hide merged/closed evidence #1047

@kelos-bot

🤖 Kelos Self-Update Agent @gjkim42

Area: Prompt Tuning

Problem

The kelos-self-update Prompt Tuning area instructs the agent to "Compare prompts against actual agent behavior by checking recent PRs and issues created by agents" and provides two example commands:

gh pr list --label generated-by-kelos --limit 20
gh issue list --label generated-by-kelos --limit 20

Both gh pr list and gh issue list default to --state open. The whole point of comparing against past agent behavior is to see which PRs were merged (validated), which were closed without merging (rejected), and which issues ended up resolved or closed. Following the prompt verbatim hides that evidence entirely.

Evidence

1. The commands today return only open items

Running the exact commands the prompt provides, against the live repo on 2026-04-29:

$ gh pr list --label generated-by-kelos --limit 20 --json state | jq -r '[.[].state] | unique'
["OPEN"]

$ gh pr list --label generated-by-kelos --state all --limit 20 --json state | jq -r '[.[].state] | unique'
["MERGED","OPEN"]

The default form returns 20 open PRs only. Adding --state all returns a mix of MERGED and OPEN — including all the recent worker/api-reviewer/image-update PRs that represent successful agent runs (e.g., #1031, #1030, #1027, #1023, #1022, #1013, #1009, #1006, #1005, #1004, #999, #988, #984, #978, #966, #965, #961).

The same holds for issues: the default form excludes CLOSED issues, so the agent does not see #538, #848, #856, #905, #908, #935, #944, #1032 — all of which represent closed feedback loops worth learning from.
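
The uniqueness check above can be extended into a per-state count that works for both PRs and issues. This is a minimal sketch, assuming `gh` is authenticated against this repo. The helper name `count_states` is hypothetical, and it uses `gh`'s built-in `--jq` flag so that no separate `jq` binary is required:

```shell
# Hypothetical helper: count agent-created PRs or issues per state.
# $1 is "pr" or "issue"; --state all is what exposes MERGED/CLOSED items.
count_states() {
  gh "$1" list --label generated-by-kelos --state all --limit 20 \
    --json state --jq '.[].state' |
    sort | uniq -c | awk '{ print $2 "=" $1 }'
}
```

Calling `count_states pr` or `count_states issue` prints one `STATE=count` line per state found in the 20 most recent items.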

2. The intent in the prompt is "past agent runs"

The prompt's surrounding wording (lines 75 and 78) is unambiguous about wanting historical data:

- Compare prompts against actual agent behavior by checking recent PRs and issues created by agents
- Propose improved prompt wording backed by evidence from past agent runs

"Past agent runs" — the strongest signal there is whether the run produced a merged PR, a closed-without-merge PR, or an issue that ended up resolved/closed. All three are hidden by --state open.
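
The three signals named above can be written down as a small lookup. This is only a sketch: the function name `classify_outcome` and the output labels are hypothetical, chosen to mirror the wording in this issue, and the input is the `state` value that `gh` reports with `--json state`:

```shell
# Hypothetical mapping from a gh state value to the signal it carries.
# gh reports PR states as OPEN, CLOSED, or MERGED; issue states as OPEN or CLOSED.
classify_outcome() {
  kind="$1"    # "pr" or "issue"
  state="$2"   # state value from gh --json state
  case "$kind:$state" in
    pr:MERGED)    echo "validated" ;;           # merged PR
    pr:CLOSED)    echo "rejected" ;;            # closed without merge
    issue:CLOSED) echo "resolved-or-closed" ;;  # closed feedback loop
    *:OPEN)       echo "pending" ;;             # no outcome yet
    *)            echo "unknown" ;;
  esac
}
```

With the default `--state open`, every item would fall into the `pending` branch, which is exactly the evidence gap described here.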

3. Other prompts in this directory get the state filter right

Compare with peer prompts:

| File | Command | State filter |
|---|---|---|
| kelos-self-update.yaml line 76 (Prompt Tuning area) | gh pr list --label generated-by-kelos --limit 20 | default = open |
| kelos-self-update.yaml line 76 (Prompt Tuning area) | gh issue list --label generated-by-kelos --limit 20 | default = open |
| kelos-self-update.yaml line 110 (dedup pre-check) | gh issue list --label generated-by-kelos --state open --limit 50 --json number,title | explicit open ✓ (correct for dedup) |
| kelos-config-update.yaml line 51 | gh pr list --state merged --limit 20 --json number,title,labels,mergedAt | explicit merged |
| kelos-config-update.yaml line 52 | gh pr list --state all --limit 20 --json number,title,labels | explicit all |
| kelos-fake-strategist.yaml line 100 (dedup pre-check) | gh issue list --label generated-by-kelos --state open --limit 50 --json number,title | explicit open ✓ (correct for dedup) |
| kelos-fake-user.yaml line 95 (dedup pre-check) | gh issue list --label generated-by-kelos --state open --limit 50 --json number,title | explicit open ✓ (correct for dedup) |

kelos-config-update.yaml is the right pattern: it explicitly asks for --state merged and --state all precisely because it is doing the same kind of "look at past PRs to learn" analysis. kelos-self-update.yaml's Prompt Tuning area does the same kind of analysis but does not apply that lesson.
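
A comparison like the one above could be repeated mechanically with a small audit. This is a sketch under assumptions: the helper name `find_implicit_state` is hypothetical, and it treats each line of a prompt file independently, so a command wrapped across several lines with `--state` on a continuation line would be reported as a false positive:

```shell
# Hypothetical audit: print every `gh pr list` / `gh issue list` line in the
# given files that does not pass an explicit --state flag.
# /dev/null is added so grep always prefixes matches with the file name.
find_implicit_state() {
  grep -nE 'gh (pr|issue) list' "$@" /dev/null | grep -v -- '--state' || true
}
```

For example, `find_implicit_state self-development/*.yaml` would print `file:line:content` for each command that relies on the default state, and nothing when every command is explicit.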

4. Two distinct purposes, two different state filters needed in this same prompt

The prompt has two places that list issues, with different intents:

  • Line 76 (Prompt Tuning data gathering) — needs --state all (or merged/closed for PRs) so the agent can see what was accepted vs rejected.
  • Line 110 (dedup pre-check) — correctly uses --state open, since a new issue that resembles an already-closed one is not a duplicate we care about.

Today both end up querying only open items, line 76 implicitly through the default and line 110 explicitly, which conflates the two purposes.

Proposed Fix

In self-development/kelos-self-update.yaml, change the Prompt Tuning area's example commands (line 76):

   - Compare prompts against actual agent behavior by checking recent PRs and issues created by agents
     (`gh pr list --label generated-by-kelos --state all --limit 20 --json number,title,state,mergedAt`,
      `gh issue list --label generated-by-kelos --state all --limit 20 --json number,title,state,closedAt`)

Two narrow changes:

  1. Add --state all so MERGED/CLOSED items become visible.
  2. Add --json number,title,state,... so the output is structured and the agent can directly see whether each item was merged, closed, or open without re-fetching.

Leave line 110 (--state open for dedup) unchanged — it is doing the correct thing for its purpose.
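
To illustrate what the structured output buys the agent, the proposed PR command can be reduced to one line per PR. This is a sketch, not part of the proposed prompt change: the helper name `summarize_prs` is hypothetical, and the `// "-"` fallback assumes `mergedAt` is null for PRs that were never merged:

```shell
# Hypothetical helper: one tab-separated line per agent-created PR,
# showing number, state, merge timestamp (or "-"), and title.
summarize_prs() {
  gh pr list --label generated-by-kelos --state all --limit 20 \
    --json number,title,state,mergedAt \
    --jq '.[] | [.number, .state, (.mergedAt // "-"), .title] | @tsv'
}
```

Each output line already says whether the PR was merged, closed, or still open, so no follow-up fetch per PR is needed.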

Scope

  • One file: self-development/kelos-self-update.yaml
  • One line edit (line 76)
  • No prompt-flow change; the existing instruction simply runs against the data the prompt already says it wants

Not covered by existing issues

I checked open issues against kelos-self-update.yaml:

No existing open issue targets the data-gathering commands in the kelos-self-update.yaml Prompt Tuning area.
