-
Notifications
You must be signed in to change notification settings - Fork 1.2k
feat(adversarial-review): make self-collect path multi-turn so it can actually work #328
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
kingdoooo
wants to merge
19
commits into
openai:main
Choose a base branch
from
kingdoooo:feat/codex-self-collect-multiturn
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from 14 commits
Commits
Show all changes
19 commits
Select commit
Hold shift + click to select a range
9046e47
test: add queue-driven fake-codex harness for multi-turn tests
kingdoooo 0e9b8e5
test: harden fake-codex harness close() and dedupe saves
kingdoooo 3c47c59
feat(adversarial-review): add investigate + finalize prompt templates
kingdoooo 53f9e75
feat(codex): add runAppServerInvestigation for two-phase reviews
kingdoooo 2708672
test: strengthen soft-error assertion to check error message
kingdoooo fbb2bc7
feat(adversarial-review): add investigate/finalize prompt builders
kingdoooo cfa0423
fix(adversarial-review): route self-collect through two-phase investi…
kingdoooo 8b25e7b
feat(adversarial-review): show banner when investigation is truncated
kingdoooo 93609a5
feat(adversarial-review): plumb --max-investigation-turns CLI flag
kingdoooo a32d1d1
docs(adversarial-review): document --max-investigation-turns
kingdoooo 5b767ff
test(commands): allow --max-investigation-turns in argument-hint asse…
kingdoooo 950329c
fix(codex): normalize runAppServerInvestigation status to numeric
kingdoooo 0981f6d
test(investigation): cover phase-2 finalize transport error path
kingdoooo 6992596
feat(adversarial-review): point to --max-investigation-turns in banner
kingdoooo 3de1895
fix(adversarial-review): tighten --max-investigation-turns validation
kingdoooo daac52f
fix(adversarial-review): converge on any 0-command turn with an agent…
kingdoooo a7a90c7
fix(adversarial-review): retry finalize turn if the model runs commands
kingdoooo 324836e
fix(adversarial-review): narrow inline-diff path to single-file reviews
kingdoooo d222542
fix(adversarial-review): force status=1 when investigation turn errors
kingdoooo File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,53 @@ | ||
| <role> | ||
| You have just completed an investigation of a code change. Now produce the structured adversarial review. | ||
| </role> | ||
|
|
||
| <task> | ||
| Based on your investigation in the prior turns of this thread, write up your findings as a structured review. | ||
| Target: {{TARGET_LABEL}} | ||
| User focus: {{USER_FOCUS}} | ||
| </task> | ||
|
|
||
| <finding_bar> | ||
| Report only material findings. | ||
| Do not include style feedback, naming feedback, low-value cleanup, or speculative concerns without evidence. | ||
| A finding should answer: | ||
| 1. What can go wrong? | ||
| 2. Why is this code path vulnerable? | ||
| 3. What is the likely impact? | ||
| 4. What concrete change would reduce the risk? | ||
| </finding_bar> | ||
|
|
||
| <structured_output_contract> | ||
| Return only valid JSON matching the provided schema. | ||
| Keep the output compact and specific. | ||
| Use `needs-attention` if there is any material risk worth blocking on. | ||
| Use `approve` only if you cannot support any substantive adversarial finding from your investigation. | ||
| Every finding must include: | ||
| - the affected file | ||
| - `line_start` and `line_end` | ||
| - a confidence score from 0 to 1 | ||
| - a concrete recommendation | ||
| Write the summary like a terse ship/no-ship assessment, not a neutral recap. | ||
| </structured_output_contract> | ||
|
|
||
| <grounding_rules> | ||
| Be aggressive, but stay grounded. | ||
| Every finding must be defensible from what you read during the investigation. | ||
| Do not invent files, lines, code paths, incidents, attack chains, or runtime behavior you cannot support. | ||
| If a conclusion depends on an inference, state that explicitly in the finding body and keep the confidence honest. | ||
| </grounding_rules> | ||
|
|
||
| <calibration_rules> | ||
| Prefer one strong finding over several weak ones. | ||
| Do not dilute serious issues with filler. | ||
| If the change looks safe, say so directly and return no findings. | ||
| </calibration_rules> | ||
|
|
||
| <final_check> | ||
| Before finalizing, check that each finding is: | ||
| - adversarial rather than stylistic | ||
| - tied to a concrete code location | ||
| - plausible under a real failure scenario | ||
| - actionable for an engineer fixing the issue | ||
| </final_check> |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,48 @@ | ||
| <role> | ||
| You are Codex performing an adversarial software review. | ||
| Your job is to break confidence in the change, not to validate it. | ||
| This is the investigation phase: gather evidence with read-only commands before producing any structured output. | ||
| </role> | ||
|
|
||
| <task> | ||
| Investigate the change so you can later produce a confident adversarial assessment. | ||
| Target: {{TARGET_LABEL}} | ||
| User focus: {{USER_FOCUS}} | ||
| </task> | ||
|
|
||
| <operating_stance> | ||
| Default to skepticism. | ||
| Assume the change can fail in subtle, high-cost, or user-visible ways until the evidence says otherwise. | ||
| Do not give credit for good intent, partial fixes, or likely follow-up work. | ||
| If something only works on the happy path, treat that as a real weakness. | ||
| </operating_stance> | ||
|
|
||
| <attack_surface> | ||
| Prioritize the kinds of failures that are expensive, dangerous, or hard to detect: | ||
| - auth, permissions, tenant isolation, and trust boundaries | ||
| - data loss, corruption, duplication, and irreversible state changes | ||
| - rollback safety, retries, partial failure, and idempotency gaps | ||
| - race conditions, ordering assumptions, stale state, and re-entrancy | ||
| - empty-state, null, timeout, and degraded dependency behavior | ||
| - version skew, schema drift, migration hazards, and compatibility regressions | ||
| - observability gaps that would hide failure or make recovery harder | ||
| </attack_surface> | ||
|
|
||
| <investigation_method> | ||
| Use read-only shell commands to inspect the diff and the surrounding code. | ||
| Useful starting points: `git diff`, `git log`, `git show`, `git blame`, `cat`, `rg`/`grep`. | ||
| Read the changed files, follow references, and confirm or refute hypotheses with evidence from the code. | ||
| Do not modify any files. Your sandbox is read-only. | ||
| {{REVIEW_COLLECTION_GUIDANCE}} | ||
| </investigation_method> | ||
|
|
||
| <convergence> | ||
| Continue investigating until you can defend a confident adversarial assessment. | ||
| When you have seen enough, emit a brief summary message describing what you found and stop running commands. | ||
| A summary message with no further command calls signals that you are ready for the finalization phase. | ||
| Do not produce a structured review yet — that comes in the next phase. | ||
| </convergence> | ||
|
|
||
| <repository_context> | ||
| {{REVIEW_INPUT}} | ||
| </repository_context> |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new
--max-investigation-turnsvalidation usesNumber.parseInt, which silently accepts malformed values like1.5(becomes1) or10abc(becomes10) even though the flag contract says this must be a positive integer. This can hide user typos and run significantly fewer or different investigation turns than requested, degrading review depth without any error signal. Please validate the full token (for example with a strict integer regex orNumber(...)plus an integer check) before accepting it.Useful? React with 👍 / 👎.