feat(auto): persist source-tagged auto answer log by shaun0927 · Pull Request #720 · Q00/ouroboros

shaun0927 · 2026-05-07T04:44:43Z

Summary

Closes the last #639 surface gap: "Interview progress shows round count and source-tagged auto answer summaries."

Records each auto-answered interview round on AutoPipelineState.auto_answer_log as a small dict (round, AutoAnswerSource tag, truncated question/answer). ooo auto --status renders the most recent five entries so users can see why the auto pipeline made each answer choice without scraping `~/.ouroboros/data/auto_*.json`.

Changes

src/ouroboros/auto/state.py
- New field auto_answer_log: list[dict[str, Any]] = field(default_factory=list).
- from_dict backfills missing field to [] for legacy persisted sessions.
- _validate_loaded extends the existing list-of-dicts check (previously only findings) to cover auto_answer_log.
src/ouroboros/auto/interview_driver.py
- New module-level helper _record_auto_answer(state, *, round_number, source, question, answer) appends to the log right after each successful answer is selected and capped at the last 25 entries.
- _truncate(text, limit) collapses whitespace and trims long Q/A text to 200 chars.
- Constants _AUTO_ANSWER_LOG_LIMIT = 25 and _AUTO_ANSWER_LOG_TEXT_LIMIT = 200.
src/ouroboros/cli/commands/auto.py
- _print_status renders `Recent auto answers (last N):` followed by `round X \[source] Q: ...` / `A: ...` lines, when the log is non-empty. The leading [ is escaped so Rich treats [source] as literal text.
tests/unit/auto/test_answer_source_log.py (new):
- Default empty list, round-trip persistence, legacy hydration, non-list rejection.
- _record_auto_answer truncation and ring-buffer eviction.
- _truncate whitespace collapsing.
- End-to-end CLI `--status` rendering.
- CLI omits the `Recent auto answers` block when the log is empty.

Pre-flight: PR collision check

The other in-flight #675 work (#682 "add selected-driver answer metadata" and #683 "classify selected-driver answer risk") touches src/ouroboros/auto/answerer.py, src/ouroboros/auto/driver_answerer.py, src/ouroboros/auto/__init__.py, and tests/unit/auto/test_driver_answerer.py. This PR touches none of those files. PR-E intentionally records only the existing AutoAnswer.source enum tag rather than the richer AutoAnswerMetadata introduced by #682, so the two efforts are orthogonal and can land in either order. A future slice can extend the log entry shape with AutoAnswerMetadata fields once #682 / #683 land.

Validation

UV_CACHE_DIR=/tmp/uv-cache uv run ruff check src/ouroboros/auto/state.py src/ouroboros/auto/interview_driver.py src/ouroboros/cli/commands/auto.py tests/unit/auto/test_answer_source_log.py → passes.
UV_CACHE_DIR=/tmp/uv-cache uv run pytest tests/unit/auto -q → 208 passed.

Risks / Notes

Field is additive; legacy sessions hydrate cleanly with an empty log.
The bounded ring-buffer keeps the persisted JSON compact even on long-running sessions (worst case: 25 × ~500 bytes ≈ 13 KB).
The CLI rendering is intentionally conservative (last 5 entries) to avoid flooding terminal output. A future PR can add --show-answer-log for the full log if needed.

Refs #639

ouroboros-agent

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Reviewing commit 642e158 for PR #720

Review record: 12ae5c42-2226-4ccd-8b87-e7c6c94d1acf

Blocking Findings

---|-----------|----------|---------|
| 1 | src/ouroboros/cli/commands/auto.py:288 | BLOCKING | The new status output prints question and answer through console.print() without escaping Rich markup. These fields come from interview/backend text and can legitimately contain [ / ]; when that happens Rich will interpret them as markup, which can corrupt the displayed text or raise a markup parse error and make ooo auto --status fail for persisted sessions that otherwise load correctly. |

Non-blocking Suggestions

None.

Design Notes

The persistence change is small and backwards-compatible, and bounding the saved log is the right shape for a resume-oriented state file. The main issue is that the new CLI surface treats stored interview text as trusted Rich markup instead of plain content.

Reviewed by ouroboros-agent[bot] via Codex deep analysis

Addresses ouroboros-agent[bot] review on Q00#720: - _print_status now wraps source / question / answer fields with rich.markup.escape() before passing them to console.print(). The persisted text comes straight from the interview backend and may legitimately contain "[" / "]" sequences. Without the escape Rich would interpret those as markup styles, either swallowing the text from the rendered output or raising a markup parse error and making ooo auto --resume <id> --status fail for otherwise-valid sessions. - New regression test feeds "Use [bold]uv[/] toolchain" + "Use uv [from existing setup]" through the status surface and asserts both literal bracketed segments survive verbatim. - Reformat tests/unit/auto/test_answer_source_log.py per repo ruff format policy. Refs Q00#639

Q00

Reviewed. Persisting the source-tagged answer log touches state, CLI, and tests, so it carries regression risk. CI is currently failing, so I cannot approve it yet.

ouroboros-agent

Review — ouroboros-agent[bot]

Verdict: APPROVE

Reviewing commit 46276ba for PR #720

Review record: 734ce955-a417-4311-bf1c-962edf74aa97

Blocking Findings

No in-scope blocking findings remained after policy filtering.

Non-blocking Suggestions

None.

Design Notes

The change is narrowly scoped and consistent with the existing durable-session design: answer provenance is recorded at the point of state mutation, persisted with backward-compatible hydration, and rendered defensively in the CLI to avoid Rich markup breakage from backend-provided text.

Recovery Notes

First recoverable review artifact generated from codex analysis log.

Reviewed by ouroboros-agent[bot] via Codex deep analysis

shaun0927 · 2026-05-07T05:48:11Z

@Q00 re-review ping.

This PR is the final scope-cleanup follow-up for #639 — it lands the source-tagged auto answer log that the issue's acceptance criteria ("Interview progress shows round count and source-tagged auto answer summaries") explicitly call for, but in a way that is intentionally orthogonal to the in-flight #682 / #683 `AutoAnswerMetadata` work so the two efforts can land in either order without conflict.

What changed since the first round of review

The original blocking finding was that the new CLI status surface printed the persisted question / answer text directly through Rich's markup parser, so any backend response that legitimately contained `[` / `]` could either be silently rewritten as a non-existent style or raise a markup parse error and break `ooo auto --resume --status`. The fix wraps the source / question / answer fields with `rich.markup.escape()` before printing, and a new regression test feeds a mixed-bracket payload (`Use [bold]uv[/] toolchain` + `Use uv [from existing setup]`) end-to-end through the status surface and asserts both literal bracketed segments survive verbatim.

The latest deep review (commit `46276ba`) accepted the change with no remaining blocking or non-blocking findings:

Verdict: APPROVE

The change is narrowly scoped and consistent with the existing durable-session design: answer provenance is recorded at the point of state mutation, persisted with backward-compatible hydration, and rendered defensively in the CLI to avoid Rich markup breakage from backend-provided text.

Why this is safe to merge

Additive on AutoPipelineState: `auto_answer_log` defaults to `[]`, hydrates legacy persisted sessions to `[]`, and is bounded to the last 25 entries (≈ 13 KB worst case), so existing sessions and consumers are unaffected.
Single producer surface: the log is appended only inside `AutoInterviewDriver` at the existing post-answer save site, with no churn to `auto/answerer.py`, `auto/driver_answerer.py`, `auto/init.py`, or `tests/unit/auto/test_driver_answerer.py` — the exact files feat(auto): add selected-driver answer metadata #682 / feat(auto): classify selected-driver answer risk #683 are touching — so PR-682/feat(auto): classify selected-driver answer risk #683 can land in either order.
CI green across Ruff lint + format, MyPy, and pytest on Python 3.12 / 3.13 / 3.14, plus Bridge TypeScript.
CLI is robust to arbitrary backend text, so a malformed interview reply cannot break the operator-facing status surface.

Refs #639. Ready for human review.

Q00

Reviewed across OS/UserLevel/Program boundaries, auto scope, and UX complexity. Approving: the answer-source log is additive diagnostic state, bounded in status output, and helps users audit auto-filled interview answers without changing the auto decision model or adding a new command path.

Addresses ouroboros-agent[bot] review on Q00#720: - _print_status now wraps source / question / answer fields with rich.markup.escape() before passing them to console.print(). The persisted text comes straight from the interview backend and may legitimately contain "[" / "]" sequences. Without the escape Rich would interpret those as markup styles, either swallowing the text from the rendered output or raising a markup parse error and making ooo auto --resume <id> --status fail for otherwise-valid sessions. - New regression test feeds "Use [bold]uv[/] toolchain" + "Use uv [from existing setup]" through the status surface and asserts both literal bracketed segments survive verbatim. - Reformat tests/unit/auto/test_answer_source_log.py per repo ruff format policy. Refs Q00#639

Records each auto-answered interview round on AutoPipelineState as a small dict carrying the round, AutoAnswerSource tag, and truncated question/answer text. The log is bounded to the last 25 entries so the persisted state file stays compact. ooo auto --status now renders the last five entries grouped by source tag so users can see why the auto pipeline made each answer choice without scraping ~/.ouroboros/data/auto_*.json. Refs Q00#639

Addresses ouroboros-agent[bot] review on Q00#720: - _print_status now wraps source / question / answer fields with rich.markup.escape() before passing them to console.print(). The persisted text comes straight from the interview backend and may legitimately contain "[" / "]" sequences. Without the escape Rich would interpret those as markup styles, either swallowing the text from the rendered output or raising a markup parse error and making ooo auto --resume <id> --status fail for otherwise-valid sessions. - New regression test feeds "Use [bold]uv[/] toolchain" + "Use uv [from existing setup]" through the status surface and asserts both literal bracketed segments survive verbatim. - Reformat tests/unit/auto/test_answer_source_log.py per repo ruff format policy. Refs Q00#639

ouroboros-agent Bot requested changes May 7, 2026

View reviewed changes

Q00 requested changes May 7, 2026

View reviewed changes

Q00 added OS Core engine, state machine, internal pipeline, and system-level behavior Provenance Source, origin, gateway, and audit provenance Interview Interview driver/session/timeout behavior labels May 7, 2026

ouroboros-agent Bot approved these changes May 7, 2026

View reviewed changes

Q00 approved these changes May 7, 2026

View reviewed changes

shaun0927 mentioned this pull request May 7, 2026

Expose live phase and interview progress for ooo auto sessions #639

Closed

shaun0927 force-pushed the feat/639-E-answer-source-ledger branch from 46276ba to 91086c6 Compare May 7, 2026 07:31

shaun0927 added 2 commits May 7, 2026 16:37

shaun0927 force-pushed the feat/639-E-answer-source-ledger branch from 91086c6 to 14866a4 Compare May 7, 2026 07:38

shaun0927 merged commit 3dd2163 into Q00:main May 7, 2026
6 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(auto): persist source-tagged auto answer log#720

feat(auto): persist source-tagged auto answer log#720
shaun0927 merged 2 commits intoQ00:mainfrom
shaun0927:feat/639-E-answer-source-ledger

shaun0927 commented May 7, 2026

Uh oh!

ouroboros-agent Bot left a comment

Uh oh!

Q00 left a comment •

edited

Loading

Uh oh!

ouroboros-agent Bot left a comment

Uh oh!

shaun0927 commented May 7, 2026

Uh oh!

Q00 left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

shaun0927 commented May 7, 2026

Summary

Changes

Pre-flight: PR collision check

Validation

Risks / Notes

Uh oh!

ouroboros-agent Bot left a comment

Choose a reason for hiding this comment

Review — ouroboros-agent[bot]

Blocking Findings

Non-blocking Suggestions

Design Notes

Uh oh!

Q00 left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ouroboros-agent Bot left a comment

Choose a reason for hiding this comment

Review — ouroboros-agent[bot]

Blocking Findings

Non-blocking Suggestions

Design Notes

Recovery Notes

Uh oh!

shaun0927 commented May 7, 2026

What changed since the first round of review

Why this is safe to merge

Uh oh!

Q00 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Q00 left a comment •

edited

Loading