When a deterministic answer would land on CONSERVATIVE_DEFAULT or ASSUMPTION for a high-risk topic that has no defensible generic default, upgrade it to a BLOCKER instead of silently committing the auto Seed to a fabricated stance.

Targeted topics:
- regulated personal data (PII, GDPR, HIPAA, SOX, PCI-DSS)
- destructive bulk schema/table operations (truncate/purge tables/schemas)

Existing safe-allowlists keep working: product-feature questions about credentials/branches and the explicit `_blocker_for` patterns are unchanged. REPO_FACT/USER_GOAL-backed answers also pass through without gating.

Refs Q00#640
Address PR Q00#695 blocking review finding: The post-routing gate keyed off broad keywords, so meta-questions like "What acceptance criteria should the HIPAA worker satisfy?" or "Which command output verifies the GDPR export flow?" were rejected as if they asked for regulated-data handling decisions. They actually hit the `_feature_acceptance_answer` / `_verification_answer` routes which return safe templates regardless of subject keywords. Restructure `answer()` so the gate only fires after generative routes (actor/IO, runtime, product behavior, default). Meta-routes (non-goal listing, verification, feature acceptance) return early without going through the gate. Add regression coverage for HIPAA/GDPR/PII acceptance and verification phrasings to ensure they keep returning CONSERVATIVE_DEFAULT answers. Existing PII/HIPAA generative-route block tests continue to pass. Refs Q00#640
Address PR Q00#695 follow-up: a regulated runtime question without a supplied repo_fact, e.g. "Which runtime should the HIPAA worker use?", routes through `_runtime_answer` and returns `AutoAnswerSource.EXISTING_CONVENTION` with the generic "use the existing repository runtime" template. Because EXISTING_CONVENTION was not in `_RISKY_FALLBACK_SOURCES`, that fallback escaped the gate. Add EXISTING_CONVENTION to the risky-fallback set. REPO_FACT-backed runtime answers (full `runtime_context` supplied) remain unaffected, and the existing `does_not_block_regulated_topic_when_repo_fact_supplied` test continues to assert that REPO_FACT answers pass through. Add a regression test that covers the bot's exact case: HIPAA runtime question with no supplied facts now blocks with reason "regulated data handling". Refs Q00#640
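The gate described in the commits above can be sketched roughly as follows. This is a minimal illustration, not the code in `answerer.py`: the source names and `_RISKY_FALLBACK_SOURCES` come from the thread, while the function signature and the `_REGULATED_TOPIC_RE` pattern are assumptions made for the example.

```python
import re
from enum import Enum, auto
from typing import Optional


class AutoAnswerSource(Enum):
    REPO_FACT = auto()
    USER_GOAL = auto()
    CONSERVATIVE_DEFAULT = auto()
    ASSUMPTION = auto()
    EXISTING_CONVENTION = auto()


# Sources that represent a generic fallback rather than grounded evidence.
_RISKY_FALLBACK_SOURCES = frozenset(
    {
        AutoAnswerSource.CONSERVATIVE_DEFAULT,
        AutoAnswerSource.ASSUMPTION,
        AutoAnswerSource.EXISTING_CONVENTION,
    }
)

# Assumed approximation of the regulated-topic matcher.
_REGULATED_TOPIC_RE = re.compile(r"\b(?:pii|gdpr|hipaa|sox|pci[- ]dss)\b", re.IGNORECASE)


def risky_fallback_blocker_for(question: str, source: AutoAnswerSource) -> Optional[str]:
    """Return a blocker reason when a generic fallback touches a regulated topic."""
    if source not in _RISKY_FALLBACK_SOURCES:
        return None  # grounded answers (REPO_FACT / USER_GOAL) pass through
    if _REGULATED_TOPIC_RE.search(question):
        return "regulated data handling"
    return None
```

With this shape, the same HIPAA runtime question blocks when it resolves to `EXISTING_CONVENTION` but passes when it is backed by `REPO_FACT`.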
Address PR Q00#695 follow-up: the destructive-operation matcher only caught ``verb ... noun`` phrasings such as ``purge tables``, so reverse phrasings like ``Which tables should the migration truncate?`` slipped through. The verb vocabulary was also narrow.

Expand patterns:
- Verbs: ``truncate``, ``purge``, ``wipe`` plus tense variants (``truncates``/``truncating``/``truncated`` etc.).
- Nouns: ``table(s)``, ``schema(s)``, ``database(s)``, ``index/indexes/indices``, ``migration(s)``.
- Both verb-then-noun and noun-then-verb directions matched.

Note: ``drop ... database`` remains owned by ``_blocker_for`` (its existing branch fires first), and product-feature questions are still exempted by the safe-product allowlists, so this does not over-gate benign feature semantics.

Add a regression test exercising both phrasing directions across the new verb/noun vocabulary.

Refs Q00#640
Address PR Q00#695 follow-up: phrasings like "Which tables should the migration drop?" or "Should we erase these schemas before re-seeding?" still flowed through to a generic auto answer because ``drop`` and ``erase`` were missing from `_DESTRUCTIVE_BULK_VERBS`. `_blocker_for` already handles ``drop|delete|erase|wipe ... database`` at the explicit-authority layer, so this matcher and the existing allow/deny list both treat the same families consistently. The risky-fallback gate runs after `_blocker_for`, so explicit authority-style prompts continue to block via the original code path. Add ``drop`` and ``erase`` (with tense variants) and extend the regression test to assert both verb-then-noun and noun-then-verb phrasings using the new vocabulary. Refs Q00#640
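A bidirectional matcher of the kind these two commits describe might look like the sketch below. The verb and noun vocabularies are taken from the commit messages; the exact regex shapes and the helper name `looks_like_destructive_bulk_op` are assumptions, and the real patterns in `answerer.py` may differ.

```python
import re

# Verb stems with tense variants (truncate/truncates/truncated/truncating, ...).
_VERBS = (
    r"(?:truncat(?:e|es|ed|ing)|purg(?:e|es|ed|ing)|wip(?:e|es|ed|ing)"
    r"|drop(?:s|ped|ping)?|eras(?:e|es|ed|ing))"
)
_NOUNS = r"(?:tables?|schemas?|databases?|index(?:es)?|indices|migrations?)"

# One pattern per phrasing direction.
_VERB_THEN_NOUN = re.compile(rf"\b{_VERBS}\b.*\b{_NOUNS}\b", re.IGNORECASE)
_NOUN_THEN_VERB = re.compile(rf"\b{_NOUNS}\b.*\b{_VERBS}\b", re.IGNORECASE)


def looks_like_destructive_bulk_op(question: str) -> bool:
    """True when a destructive verb and a bulk data noun co-occur in either order."""
    return bool(_VERB_THEN_NOUN.search(question) or _NOUN_THEN_VERB.search(question))
```

Matching noun and verb anywhere in the sentence is deliberately broad here; the later review rounds in this thread are about narrowing exactly that breadth.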
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Reviewing commit 326563c for PR #738
Review record: dcffd5cd-c11a-4eeb-9135-908557c3bbdb
Blocking Findings
Recovery Notes
First recoverable review artifact generated from codex analysis log.
| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:716 | BLOCKING | The new risky-fallback gate blocks every fallback/product-behavior answer that merely mentions PII, HIPAA, GDPR, etc., even when the question is only asking for product semantics rather than compliance policy. For example, Should the app export PII reports? or Should users be able to download GDPR exports? now return BLOCKER instead of preserving the requested feature behavior. That is a regression in normal auto-answer coverage, and there is no allowlist/test for regulated-data product questions comparable to the credential-specific safe path. |
Follow-up Findings
src/ouroboros/auto/answerer.py:666 [warning] `_DESTRUCTIVE_BULK_NOUNS` now includes bare `migration`/`migrations` and `index`/`indexes`, and the regexes at lines 685/690 treat any pairing with `drop`/`wipe`/`erase` as a destructive bulk-data operation. That overblocks innocuous questions such as `Which migration should we drop from the release plan?` or `Which indexes should we drop from the docs?`, which are not schema/data-destruction requests. The matcher needs tighter context than noun+verb anywhere in the sentence.
Non-blocking Suggestions
None.
Design Notes
The route ordering is sensible, and gating only fallback-style answers is the right general shape. The issue is the new regex layer is broader than the intent described in the docstrings, so it now overrides legitimate product-semantics questions instead of only catching unsafe generic defaults.
Reviewed by ouroboros-agent[bot] via Codex deep analysis
…#640) Add _DESTRUCTIVE_BULK_NON_DATA_QUALIFIERS to exempt verb/noun pairs that appear with a non-data artefact qualifier (release plan, docs, roadmap, etc.) from the destructive-bulk blocker. Also extend _DESTRUCTIVE_BULK_NOUNS with record/row/audit-log/audit-trail strong data-object nouns. Addresses ouroboros-agent[bot] follow-up warning on Q00#738: bare ``migration`` + ``drop`` and ``index`` + ``drop`` questions about release plans or documentation were overblocked. Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…ate (Q00#640) Add _is_safe_product_regulated_question() allowlist that passes through bounded product-behavior questions mentioning regulated nouns (PII/GDPR/ HIPAA/SOX/PCI-DSS) when paired with a product-semantics verb (export, download, display, show, view, access, …) and NOT a compliance-policy verb (store, handle, retain, collect, encrypt, …). Questions like "Should the app export PII reports?" or "Should users be able to download GDPR exports?" are feature-level requirements and must not be blocked; questions asking how to store/handle/retain regulated data still block as before. Addresses ouroboros-agent[bot] BLOCKING on Q00#738 (answerer.py:716). Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Addressed both bot CHANGES_REQUESTED items in two follow-up commits: BLOCKING ( Confirmed: Follow-up warning ( Confirmed: New tests (all passing, 66 total):
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Reviewing commit a98eb10 for PR #738
Review record: f9cda100-5968-4e6d-8549-5d6f504b3ad0
Blocking Findings
Recovery Notes
First recoverable review artifact generated from codex analysis log.
| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:741 | BLOCKING | _is_safe_product_regulated_question() is broader than _is_product_behavior_question(), but answer() still only sends product questions to _product_behavior_answer() via the older matcher at lines 122-125. In practice, several newly “allowed” cases like Should users be able to download GDPR exports?, Should admins be able to view PII fields in the admin panel?, and Should the app allow users to access their GDPR data? no longer block, but they fall through to _default_answer() instead of preserving the requested feature semantics. That silently drops the regulated-product requirement from the ledger/acceptance criteria, which is a wrong auto-answer outcome rather than a safe pass-through. |
Non-blocking Suggestions
None.
Design Notes
The new post-routing risk gate is a sensible shape for catching unsafe generic fallbacks, but the new allowlist is not aligned with the existing routing classifier. That leaves a gap where “safe” regulated product questions are unblocked without being handled semantically.
Reviewed by ouroboros-agent[bot] via Codex deep analysis
…avior answerer (Q00#640) Extend _is_product_behavior_question() with a new arm covering the product-semantics verbs used by _is_safe_product_regulated_question() that were not previously matched (download, allow, expose, render, enable, support) and the "be able to <verb>" phrasing gap for view/access. Previously, questions allowed past the risky-fallback gate (e.g. "Should users be able to download GDPR exports?") fell through to _default_answer(), producing a generic conservative-MVP ledger entry that silently discarded the regulated-product feature semantics. With this fix the router at answerer.py:122 sends those questions to _product_behavior_answer(), which writes subject-specific constraints.behavior.* and acceptance.behavior.* ledger entries that preserve the requested feature in the Seed contract. New test: test_auto_answerer_routes_safe_regulated_product_questions_to_product_behavior_answerer asserts blocker=None, source=CONSERVATIVE_DEFAULT, subject-specific ledger keys (not conservative_mvp), and regulated noun present in answer text/ledger entries for all three bot example questions. Addresses ouroboros-agent[bot] BLOCKING on Q00#738 (answerer.py:741). Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Addressed BLOCKING on #738 ( Root cause confirmed: Fix: Extended
```python
or re.search(
    r"\b(should|must|can|will|do|does|is|are)\b.+\b(be able to\s+)?"
    r"(download|allow|expose|render|enable|support|view|access)\b",
    lowered,
)
```
This covers the product-semantics verbs used by the allowlist that were not previously matched, and fixes the
Verified end-to-end: All three questions now:
New test: 67 tests passing (66 previous + 1 new). Ruff clean.
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Reviewing commit 99c4580 for PR #738
Review record: 80809b43-2705-46e8-91f3-228ede4f7574
Blocking Findings
Recovery Notes
First recoverable review artifact generated from codex analysis log.
| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:688 | BLOCKING | _DESTRUCTIVE_BULK_NON_DATA_QUALIFIERS now exempts any destructive bulk question that merely contains the word documentation, even if the question is still asking for a real schema/data deletion. For example, Which tables should we drop according to the documentation before redeploying? matches the destructive tables ... drop pattern, but the bare documentation token suppresses the blocker and lets auto-answering proceed. That is a regression in the safety gate for destructive operations; the qualifier needs to be tied to clearly non-destructive phrasings such as from the docs/documentation, not any mention of documentation anywhere in the sentence. |
Non-blocking Suggestions
None.
Design Notes
The PR is directionally correct: it adds a second-stage safety gate for risky generative fallbacks and then restores intended routing for safe regulated product-behavior questions. The main architectural weakness is that the destructive-operation exemption is implemented as a broad keyword bypass rather than a narrowly scoped phrase check, which makes the safety boundary too easy to evade.
Reviewed by ouroboros-agent[bot] via Codex deep analysis
Previously the destructive-bulk exemption matched on bare tokens such as
``documentation`` or ``release plan`` anywhere in the sentence, which let
real destructive operations slip past the gate when the question merely
*referenced* documentation as an authority (e.g. "Which tables should we
drop according to the documentation before redeploying?").
The qualifier is now strictly phrase-scoped to ``from the …`` so the
exemption fires only when the artefact is the explicit object of the
drop/wipe — the phrasing that signals "remove an entry from a process
artefact" rather than "delete data from a system". Authority/reference
phrasings ("according to the documentation", "per the release plan",
"in the documentation example") no longer suppress the gate.
Existing pass-through tests still hold:
- "Which migration should we drop from the release plan?" → not blocked
- "Which indexes should we drop from the docs?" → not blocked
New regression test locks the safety boundary:
- "Which tables should we drop according to the documentation …" → BLOCKER
- "Which tables should we drop per the release plan?" → BLOCKER
- "Per the documentation, which audit logs should we purge?" → BLOCKER
- "According to the docs, which tables should we drop?" → BLOCKER
Ref: ouroboros-agent[bot] BLOCKING on PR Q00#738 — answerer.py:688.
68 tests passing in test_ledger_grading_answerer.py (337 in tests/unit/auto). Ruff clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Reviewing commit fc11788 for PR #738
Review record: c633fd73-4dda-43d8-80e5-b99ca2b37ee9
Blocking Findings
Recovery Notes
First recoverable review artifact generated from codex analysis log.
| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:698 | BLOCKING | The new destructive-bulk exemption only matches from the ..., so the same safe process-artifact questions phrased as in the ... still get blocked as data-destruction. For example, Which indexes should we drop in the docs? and Which migration should we drop in the roadmap? still satisfy the destructive noun/verb patterns and miss the qualifier entirely. That is a wrong-result regression in the exact false-positive area this patch is trying to fix. |
| 2 | src/ouroboros/auto/answerer.py:782 | BLOCKING | _is_safe_product_regulated_question() disables the regulated-product allowlist whenever any compliance-policy verb appears anywhere in the sentence. That over-blocks legitimate feature-semantics questions about already-existing regulated data, e.g. Should admins be able to view stored PII fields? or Should the dashboard display encrypted HIPAA files? Both are product-behavior questions, but stored/encrypted makes them fall through to the risky-fallback blocker and return regulated data handling. |
Non-blocking Suggestions
None.
Design Notes
The post-routing risky-fallback gate is a reasonable direction and the added tests improve coverage for earlier false positives. The remaining problems are both matcher-shape issues: the new exemptions are still too literal in one place and too broad in another, so the design needs slightly more context-sensitive patterning before this is safe.
Reviewed by ouroboros-agent[bot] via Codex deep analysis
…erbs (Q00#640)

Two BLOCKING regressions raised by ouroboros-agent[bot] on the previous fix commit (fc11788):

1) **answerer.py:698**: destructive-bulk exemption only matched ``from the …`` so safe process-artefact phrasings such as ``Which indexes should we drop in the docs?`` and ``Which migration should we drop in the roadmap?`` were still mis-blocked as data destruction. The qualifier now also accepts ``in the …`` for the same artefact list (``release plan``, ``docs``, ``documentation``, ``plan``, ``roadmap``, ``backlog``, ``changelog``, ``spec``). Authority/reference phrasings (``according to the docs``, ``per the release plan``) still do not match the qualifier and remain blocked, locked in by an expanded regression test.

2) **answerer.py:782**: ``_is_safe_product_regulated_question()`` rejected any question containing a compliance-policy verb (``store``, ``handle``, ``encrypt``, ``share``, …) anywhere in the sentence. That over-blocked legitimate product-behavior questions where the compliance verb appeared as a past-participle adjective modifying the noun, e.g.

    Should admins be able to view stored PII fields?
    Should the dashboard display encrypted HIPAA files?

In both, the main verb is product-semantics (``view`` / ``display``); the compliance verb is adjectival. The allowlist now requires (a) a regulated noun, (b) a product-question modal, and (c) a product-semantics verb. Pure compliance-policy phrasings (``How should the system handle GDPR data retention?``, ``What PII should the system collect?``) lack a product-semantics verb and remain blocked, as covered by the existing ``test_auto_answerer_still_blocks_compliance_policy_regulated_questions``. The previously-defined ``_COMPLIANCE_POLICY_VERBS_RE`` constant is now unused and removed to avoid dead code.

New regression coverage:
- ``test_auto_answerer_allows_in_the_artefact_drop_questions``: locks in ``in the docs/roadmap/release plan/changelog`` exemption.
- ``test_auto_answerer_allows_product_questions_with_adjectival_compliance_verbs``: locks in ``view stored PII``, ``display encrypted HIPAA files``, ``download retained GDPR exports``, etc.

Existing safety tests (compliance-policy questions, authority-reference phrasings) all continue to block.

339 unit tests passing in tests/unit/auto. Ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
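The phrase-scoped qualifier from item 1 can be illustrated with a small sketch. The constant name `_DESTRUCTIVE_BULK_NON_DATA_QUALIFIERS` appears earlier in this thread, but the regex below and the helper `is_process_artefact_edit` are assumptions for illustration. Bare ``plan`` is left out of the artefact list here because a later commit in this thread removes it as ambiguous.

```python
import re

# Exempt only when the artefact is the explicit object of "from the" / "in the".
# Authority phrasings ("according to the docs", "per the release plan") do not match.
_DESTRUCTIVE_BULK_NON_DATA_QUALIFIERS = re.compile(
    r"\b(?:from|in)\s+the\s+"
    r"(?:release\s+plan|docs|documentation|roadmap|backlog|changelog|spec)\b",
    re.IGNORECASE,
)


def is_process_artefact_edit(question: str) -> bool:
    """True when the drop/wipe targets a planning or documentation artefact."""
    return bool(_DESTRUCTIVE_BULK_NON_DATA_QUALIFIERS.search(question))
```

A caller would consult this only after the destructive noun/verb patterns have matched, and skip the blocker when it returns True.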
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Reviewing commit e846a47 for PR #738
Review record: 75181131-a43b-4cd5-9d16-a6525991c0cc
Blocking Findings
Recovery Notes
First recoverable review artifact generated from codex analysis log.
| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:750 | BLOCKING | _is_safe_product_regulated_question() is now too broad: it allows any regulated-data question that contains a modal plus one “product” verb, even when the same sentence also asks for compliance-policy decisions. For example, How should the system store and display HIPAA files? or Should we retain and export PII records? will pass this allowlist, skip _risky_fallback_blocker_for(), and be auto-answered via _product_behavior_answer() despite still requiring a decision about regulated-data storage/retention. That is exactly the class of question this gate is supposed to block. The current tests only cover pure product questions and pure compliance questions, so this mixed case regression is not caught. |
Non-blocking Suggestions
None.
Design Notes
The routing cleanup is sensible and the new tests cover the prior review threads well, but the regulated-topic allowlist currently relies on positive keyword presence alone. That makes the safety boundary fragile for mixed-intent questions; this path needs an explicit negative check or precedence rule for compliance-policy verbs.
Reviewed by ouroboros-agent[bot] via Codex deep analysis
…cedence (Q00#640)

Bot follow-up on commit e846a47: the regulated-product allowlist was too permissive. Mixed-intent questions that pair a compliance-policy verb with a product-semantics verb, e.g.

    How should the system store and display HIPAA files?
    Should we retain and export PII records?

still ask the auto pipeline to decide regulated-data handling and must remain blocked, even though they also mention a product verb.

The fix adds an explicit precedence rule: an *active*-form compliance-policy verb (``store`` / ``stores`` / ``storing``, ``retain`` / ``retains`` / ``retaining``, ``encrypt``, ``handle``, ``collect``, ``share``, ``transmit``, ``disclose``, ``process``, ``manage``, ``govern``) blocks the question even if a product-semantics verb is also present. Past-participle forms (``stored``, ``encrypted``, ``retained``, ``collected``, ``shared``, …) are intentionally excluded from the negative list because they act adjectivally on the regulated noun (``view stored PII``, ``display encrypted HIPAA files``); the main verb of those sentences is the product-semantics one and the question is product-behavior over already-existing regulated data, not a compliance-policy decision.

New regression test ``test_auto_answerer_blocks_mixed_intent_regulated_questions`` locks the precedence rule on the bot's own examples plus three more variants covering ``encrypt`` / ``share`` / ``collect``. Existing positive tests (adjectival compliance verbs, pure product semantics) and existing negative tests (pure compliance phrasings) all continue to pass.

340 unit tests passing in tests/unit/auto. Ruff clean.

Ref: ouroboros-agent[bot] BLOCKING on Q00#738, ``answerer.py:750``.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
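The precedence rule reads more easily as code. The sketch below is an approximation under stated assumptions: the verb and noun vocabularies are taken from this thread, but the individual regexes, the ``-s`` / ``-ing`` inflections for verbs other than ``store`` and ``retain``, and the helper name `is_safe_product_regulated_question` (without the leading underscore) are illustrative, not the repository's implementation.

```python
import re

_REGULATED_NOUN_RE = re.compile(r"\b(?:pii|gdpr|hipaa|sox|pci[- ]dss)\b", re.IGNORECASE)
_MODAL_RE = re.compile(r"\b(?:should|must|can|will|do|does|is|are)\b", re.IGNORECASE)
_PRODUCT_VERB_RE = re.compile(
    r"\b(?:export|download|render|display|show|expose|support|enable|allow|view|access)\b",
    re.IGNORECASE,
)
# Active forms only: "stored", "encrypted", "retained" are deliberately NOT matched,
# because as past participles they modify the noun rather than ask for a policy.
_ACTIVE_COMPLIANCE_VERB_RE = re.compile(
    r"\b(?:stor(?:e|es|ing)|retain(?:s|ing)?|encrypt(?:s|ing)?|handl(?:e|es|ing)"
    r"|collect(?:s|ing)?|shar(?:e|es|ing)|transmit(?:s|ting)?|disclos(?:e|es|ing)"
    r"|process(?:es|ing)?|manag(?:e|es|ing)|govern(?:s|ing)?)\b",
    re.IGNORECASE,
)


def is_safe_product_regulated_question(question: str) -> bool:
    """Allow bounded product-behavior questions; compliance verbs take precedence."""
    if _ACTIVE_COMPLIANCE_VERB_RE.search(question):
        return False  # negative check wins over any product verb
    return bool(
        _REGULATED_NOUN_RE.search(question)
        and _MODAL_RE.search(question)
        and _PRODUCT_VERB_RE.search(question)
    )
```

Putting the negative check first makes the precedence explicit: a mixed-intent sentence is rejected before any positive evidence is considered.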
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Reviewing commit 7ed761c for PR #738
Review record: e97aeaa5-4f22-424b-91cf-b76ecc77ef3f
Blocking Findings
Recovery Notes
First recoverable review artifact generated from codex analysis log.
| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:548 | BLOCKING | The router fix is still incomplete for two verbs that _is_safe_product_regulated_question() explicitly allows: export and show are in _PRODUCT_SEMANTICS_REGULATED_VERBS_RE but not in the new _is_product_behavior_question() branch. As a result, questions like Should the app export PII reports? or Should the app show PII reports? bypass the blocker and then silently fall through to _default_answer() instead of _product_behavior_answer(), losing the regulated-feature semantics in the ledger. This is the same class of bug the follow-up commit fixed for download/view/access, and it remains user-visible for the remaining allowed verbs. |
Non-blocking Suggestions
| 1 | tests/unit/auto/test_ledger_grading_answerer.py:1200 | Nice-to-have tests | The regulated-topic allowlist test only checks not BLOCKER. Adding route/ledger-shape assertions for export and show would have caught the remaining mismatch with _is_product_behavior_question(). |
Design Notes
The overall gating approach is reasonable: explicit hard blockers first, then a second-pass guard for risky generic fallbacks. The remaining issue is contract drift between the safe-allowlist and the router heuristics; those two verb vocabularies need to stay aligned or the pipeline preserves safety while still dropping intended product semantics.
Reviewed by ouroboros-agent[bot] via Codex deep analysis
Bot follow-up on commit 7ed761c: ``_PRODUCT_SEMANTICS_REGULATED_VERBS_RE`` allows ``export`` and ``show`` (and the rest of the safe-allowlist set), but the explicit alignment branch in ``_is_product_behavior_question()`` only listed a subset (``download/allow/expose/render/enable/support/view/access``). ``export`` / ``show`` / ``display`` were still matched by an earlier broader pattern in the same function, but the visible alignment was incomplete and prone to silent drift. The router branch added to bridge ``_is_safe_product_regulated_question()`` into ``_is_product_behavior_question()`` now lists every verb in the allowlist: export | download | render | display | show | expose | support | enable | allow | view | access This is a no-op for already-routed verbs but makes the allowlist↔router contract explicit and grep-checkable, eliminating the drift surface flagged in the bot's design note. Test changes: ``test_auto_answerer_routes_safe_regulated_product_questions_to_product_behavior_answerer`` now exercises every verb in the allowlist (export, show, display, render, expose, support, enable in addition to download/view/access). Each case asserts (a) the gate passes (``answer.blocker is None``), (b) the router takes the product-behavior path (``constraints.behavior.*`` and ``acceptance.behavior.*`` ledger keys, not the generic ``constraints.conservative_mvp`` from ``_default_answer()``), and (c) the regulated noun is preserved in the answer text or ledger value. 340 unit tests passing in tests/unit/auto. Ruff clean. Ref: ouroboros-agent[bot] BLOCKING on Q00#738 — ``answerer.py:548``. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
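One way to keep the allowlist and the router from drifting is to derive both patterns from a single verb tuple. The commit above instead lists the verbs explicitly in both places, so the sketch below is an alternative illustration rather than the PR's code; `PRODUCT_SEMANTICS_VERBS` and both helper names are hypothetical.

```python
import re

# Single source of truth for the product-semantics verb vocabulary.
PRODUCT_SEMANTICS_VERBS = (
    "export", "download", "render", "display", "show", "expose",
    "support", "enable", "allow", "view", "access",
)
_VERB_ALT = "|".join(PRODUCT_SEMANTICS_VERBS)

# Allowlist side: is any product-semantics verb present?
_ALLOWLIST_VERBS_RE = re.compile(rf"\b(?:{_VERB_ALT})\b", re.IGNORECASE)

# Router side: modal, optional "be able to", then one of the same verbs.
_ROUTER_BRANCH_RE = re.compile(
    rf"\b(?:should|must|can|will|do|does|is|are)\b.+\b(?:be able to\s+)?(?:{_VERB_ALT})\b",
    re.IGNORECASE,
)


def allowlist_verb_present(question: str) -> bool:
    return bool(_ALLOWLIST_VERBS_RE.search(question))


def routes_to_product_behavior(question: str) -> bool:
    return bool(_ROUTER_BRANCH_RE.search(question))
```

Because both regexes are built from the same tuple, adding a verb to the allowlist automatically extends the router branch.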
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Reviewing commit 52ef5ab for PR #738
Review record: a69ddee7-e88d-4d17-adee-25cd24c2af55
Blocking Findings
Recovery Notes
First recoverable review artifact generated from codex analysis log.
| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:837 | BLOCKING | _is_safe_product_regulated_question() now suppresses the regulated-data blocker for any question that happens to mention a regulated noun plus a product verb, even when the router sends that question down a different path first. Because answer() checks _is_actor_or_io_question() and _is_runtime_context_question() before _is_product_behavior_question() (lines 118-123), prompts like What inputs should the GDPR export take? or Which runtime should the GDPR export use? will bypass the new blocker and get a generic IO/runtime answer instead of a blocker or a feature-specific product answer. That reintroduces the same “safe allowlist broader than routing” bug in another form, and the new tests only cover the default/product-behavior path. |
Non-blocking Suggestions
None.
Design Notes
The risky-fallback gate is a reasonable place for this policy, but the new regulated-data allowlist is route-agnostic while the router is not. The bypass needs to be tied to the actual selected answer path, or the router needs to classify regulated product questions before the generic IO/runtime branches.
Reviewed by ouroboros-agent[bot] via Codex deep analysis
…es (Q00#640)

Bot follow-up on commit 52ef5ab: ``_is_safe_product_regulated_question()`` suppressed the risky-fallback blocker for any regulated-noun + product-verb combination, but the router checked ``_is_actor_or_io_question`` and ``_is_runtime_context_question`` *before* ``_is_product_behavior_question``, so prompts like

    What inputs should the GDPR export take?
    Which runtime should the GDPR export use?

got a generic IO/runtime answer (``ASSUMPTION`` / ``EXISTING_CONVENTION``) and then bypassed the blocker via the safe-allowlist, silently dropping the regulated-feature semantics from the ledger.

Fix: pull ``_is_safe_product_regulated_question`` to the top of the content-routing chain so any regulated-product question (IO-shaped, runtime-shaped, or product-shaped) is dispatched to ``_product_behavior_answer()``. The risky-fallback gate at the tail of ``answer()`` already consults the same predicate, so the router and the safe-allowlist now share a single answer path.

Pure compliance phrasings remain blocked unchanged: they fail the allowlist (no product-semantics verb) and fall through to the previous branches, where the risky-fallback gate fires for any ``CONSERVATIVE_DEFAULT`` / ``ASSUMPTION`` / ``EXISTING_CONVENTION`` source.

New regression test ``test_auto_answerer_routes_regulated_product_questions_before_io_or_runtime`` locks in:
- Bot's example "What inputs should the GDPR export take?"
- Bot's example "Which runtime should the GDPR export use?"
- Two adjacent IO/runtime regulated-product variants

Each case asserts (a) not blocked, (b) answer comes from ``_product_behavior_answer()`` (subject-specific ``behavior.*`` ledger keys, no IO/runtime keys), and (c) the regulated noun is preserved in the answer text or ledger value.

341 unit tests passing in tests/unit/auto. Ruff clean. The two failures in tests/unit/orchestrator/test_codex_cli_runtime.py are pre-existing and unrelated to this PR's scope (verified by stashing the patch).

Ref: ouroboros-agent[bot] BLOCKING on Q00#738, ``answerer.py:837``.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Reviewing commit 44e405f for PR #738
Review record: a6525e8d-fc68-402c-859b-00e2586a68c4
Blocking Findings
Recovery Notes
First recoverable review artifact generated from codex analysis log.
| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:126 | BLOCKING | The new _is_safe_product_regulated_question() branch runs before _is_runtime_context_question() unconditionally, so regulated runtime questions that also contain a product noun now ignore concrete repo facts. For example, answer("Which runtime should the GDPR export use?", ..., context=AutoAnswerContext(repo_facts={"runtime_context": ...})) will always return _product_behavior_answer() instead of the REPO_FACT runtime answer, dropping grounded runtime_context evidence and replacing it with a generic behavior entry. That is a behavioral regression from the existing runtime contract and should be covered by a test. |
Non-blocking Suggestions
None.
Design Notes
The risky-fallback gate and the destructive-bulk qualifier tightening are reasonable, but the router is now conflating “regulated product question” with “always answer as product behavior.” That coupling is too strong; the safe-allowlist and the answer-route choice need to preserve higher-priority grounded facts such as runtime_context.
Reviewed by ouroboros-agent[bot] via Codex deep analysis
Q00#640)

Bot follow-up on commit 44e405f: the unconditional early route to ``_product_behavior_answer()`` for any question that ``_is_safe_product_regulated_question()`` recognised broke the existing runtime contract. With a supplied ``runtime_context`` repo fact, a question like ``Which runtime should the GDPR export use?`` should return a ``REPO_FACT`` runtime answer carrying the grounded evidence; the early route replaced it with a generic product-behavior entry, dropping the evidence.

Restructure: keep the original IO/runtime/product/default order so grounded ``REPO_FACT`` answers stay on the runtime path, then re-route to ``_product_behavior_answer()`` only when the chosen route produced a non-grounded fallback (``ASSUMPTION`` / ``EXISTING_CONVENTION`` / ``CONSERVATIVE_DEFAULT``) AND the safe-allowlist recognises the question as regulated-product. Concretely:
- ``Which runtime should the GDPR export use?`` + REPO_FACT → REPO_FACT runtime answer (preserved, with evidence).
- ``Which runtime should the GDPR export use?`` without repo facts → EXISTING_CONVENTION runtime fallback re-routed through ``_product_behavior_answer()`` so the regulated-feature semantics are preserved in ``constraints.behavior.*`` / ``acceptance.behavior.*``.
- ``What inputs should the GDPR export take?`` → IO ASSUMPTION re-routed to ``_product_behavior_answer()``.
- ``Should the app export PII reports?`` → already routes through ``_product_behavior_answer()`` (CONSERVATIVE_DEFAULT) and is left untouched by the reroute.

Pure compliance phrasings still block: they fail the allowlist (no product-semantics verb), keep their CONSERVATIVE_DEFAULT/ASSUMPTION/EXISTING_CONVENTION source, and the risky-fallback gate fires for them.

New regression test ``test_auto_answerer_preserves_repo_fact_for_regulated_runtime_question`` locks the REPO_FACT preservation contract: with a runtime_context repo fact supplied, the answer must be REPO_FACT, must contain the supplied runtime text, and must carry a runtime_context ledger entry with the supplied evidence.

342 unit tests passing in tests/unit/auto. Ruff clean.

Ref: ouroboros-agent[bot] BLOCKING on Q00#738, ``answerer.py:126``.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
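The conditional reroute can be sketched as a post-routing step. Everything below is a simplified model for illustration: the `Answer` dataclass, its `route` field, and `reroute_if_needed` are hypothetical, and the two callables stand in for ``_is_safe_product_regulated_question()`` and ``_product_behavior_answer()``.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable


class Source(Enum):
    REPO_FACT = auto()
    USER_GOAL = auto()
    CONSERVATIVE_DEFAULT = auto()
    ASSUMPTION = auto()
    EXISTING_CONVENTION = auto()


@dataclass(frozen=True)
class Answer:
    text: str
    source: Source
    route: str  # hypothetical tag naming the answerer that produced it


_NON_GROUNDED = frozenset(
    {Source.ASSUMPTION, Source.EXISTING_CONVENTION, Source.CONSERVATIVE_DEFAULT}
)


def reroute_if_needed(
    question: str,
    routed: Answer,
    is_safe_regulated_product: Callable[[str], bool],
    product_behavior_answer: Callable[[str], Answer],
) -> Answer:
    """Keep grounded answers; send non-grounded regulated-product ones to product behavior."""
    if routed.source not in _NON_GROUNDED:
        return routed  # REPO_FACT / USER_GOAL evidence is preserved
    if routed.route == "product_behavior":
        return routed  # already on the right path, leave untouched
    if is_safe_regulated_product(question):
        return product_behavior_answer(question)
    return routed  # falls through to the risky-fallback gate
```

The ordering matters: the grounded check comes first so that a supplied ``runtime_context`` is never overwritten by a generic behavior entry.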
The previous commit (046ce3d) accidentally included an empty local debug artifact via ``git add -A``. Removing it; not part of the PR scope. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Reviewing commit
a13fd6c for PR #738
Review record:
d5f1de56-27c8-458e-9db9-a539c5ac7d4b
Blocking Findings
| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:793 | BLOCKING | _is_safe_product_regulated_question() now treats any regulated noun plus verbs like support, enable, or allow as a safe product-behavior question. That lets broad compliance-scope prompts such as Should the platform support HIPAA? or Should the app enable GDPR? bypass the blocker and produce a generated MVP answer, even though those are still regulated-policy decisions rather than bounded feature semantics. The allowlist needs an additional constraint so it only passes concrete artifact/feature questions, not bare “support HIPAA/GDPR” requests. |
| 2 | src/ouroboros/auto/answerer.py:718 | BLOCKING | The destructive-operation exemption regex matches `from |

Recovery Notes
First recoverable review artifact generated from codex analysis log.
Non-blocking Suggestions
None.
Design Notes
The routing change is directionally right: it separates grounded answers from risky fallbacks and preserves regulated-product semantics better than the previous default-path behavior. The remaining issue is that both new regex allowlists are still too lexical, so a few high-risk meanings now slip through despite the added tests.
Reviewed by ouroboros-agent[bot] via Codex deep analysis
Two BLOCKING items raised by ouroboros-agent[bot] on commit a13fd6c:

(1) ``answerer.py:793``: ``_is_safe_product_regulated_question()`` allowed "compliance-scope-as-feature-flag" prompts (``Should the platform support HIPAA?``, ``Should the app enable GDPR?``, ``Should the system allow PII?``) to bypass the blocker, even though those frame the entire regulatory regime as a binary toggle and are compliance-policy decisions. Fix: a new ``_BARE_COMPLIANCE_SCOPE_RE`` rejects ``support|enable|allow`` + bare regulated noun followed by no qualifying feature noun (negative lookahead ``(?!\s+[a-z])``). Concrete-feature variants ("HIPAA audit logs", "GDPR consent banners", "PII redaction in exports", "GDPR data") have a qualifying noun and still pass through.

(2) ``answerer.py:718``: the destructive-bulk artefact qualifier listed standalone ``doc`` and ``plan`` tokens. ``from the doc`` is rare phrasing (use ``docs`` / ``documentation``) and bare ``plan`` collides with database-side meanings (query plan, execution plan, db plan), so a question like "Which tables should we drop from the plan?" was being exempted as a process-artefact edit. Fix: drop ``doc`` and ``plan`` (singular) from the artefact list. The remaining unambiguous artefacts are ``release plan``, ``docs``, ``documentation``, ``roadmap``, ``backlog``, ``changelog``, ``spec``. All existing positive tests already use these unambiguous variants.

New regression coverage:
- ``test_auto_answerer_blocks_bare_compliance_scope_questions``: locks rejection of bare ``support|enable|allow + regulated noun`` for all five regulated-noun variants.
- ``test_auto_answerer_allows_qualified_compliance_scope_questions``: locks pass-through of ``support HIPAA audit logs`` / ``enable GDPR consent banners`` / ``allow PII redaction`` / etc.
- ``test_auto_answerer_blocks_destructive_bulk_with_ambiguous_singular_tokens``: locks blocker for ``from the plan`` / ``in the plan`` / ``from the doc`` destructive prompts.

345 unit tests passing in tests/unit/auto. Ruff clean.

Ref: ouroboros-agent[bot] BLOCKING on Q00#738 (``answerer.py:793`` and ``answerer.py:718``).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
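The two regex fixes described in this commit message could look something like the sketch below. Only the negative lookahead ``(?!\s+[a-z])``, the ``support|enable|allow`` verbs, the regulated nouns, and the artefact list come from the thread; the exact pattern shapes, the ``from|in the`` prefix, and both helper function names are assumptions made for illustration and may differ from what ``answerer.py`` actually compiles.

```python
import re

# Regulated nouns named in the PR description.
_REGULATED = r"(?:pii|gdpr|hipaa|sox|pci-dss)"

# Bare compliance scope: verb + regulated noun with no qualifying word after it.
# The lookahead rejects a match when whitespace plus a letter follows the noun.
BARE_COMPLIANCE_SCOPE_RE = re.compile(
    rf"\b(?:support|enable|allow)\s+{_REGULATED}\b(?!\s+[a-z])",
    re.IGNORECASE,
)

# Process-artefact qualifier after the fix: bare "doc" and "plan" are gone,
# so only the unambiguous artefacts exempt a destructive-sounding question.
# The "from|in the" prefix is an assumed shape, not confirmed by the source.
ARTEFACT_QUALIFIER_RE = re.compile(
    r"\b(?:from|in)\s+the\s+"
    r"(?:release\s+plan|docs|documentation|roadmap|backlog|changelog|spec)\b",
    re.IGNORECASE,
)


def is_bare_compliance_scope(question: str) -> bool:
    """Hypothetical helper: True for prompts like 'support HIPAA?'."""
    return bool(BARE_COMPLIANCE_SCOPE_RE.search(question))


def is_process_artefact_edit(question: str) -> bool:
    """Hypothetical helper: True when the target is a planning/docs artefact."""
    return bool(ARTEFACT_QUALIFIER_RE.search(question))
```

With these patterns, ``Should the platform support HIPAA?`` counts as bare compliance scope while ``Should the app enable GDPR consent banners?`` does not, and ``drop from the plan`` is no longer exempted even though ``drop from the release plan`` still is.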
Review — ouroboros-agent[bot]
Verdict: APPROVE
Reviewing commit
bd8cd7c for PR #738
Review record:
65d72a77-6bfa-4d58-adee-5a1d07b88eab
Blocking Findings
No in-scope blocking findings remained after policy filtering.
Non-blocking Suggestions
None.
Design Notes
The router change is coherent: it preserves grounded REPO_FACT answers, re-routes only risky fallback sources, and the added tests cover the previously-missed regulated product-semantics paths plus the destructive-bulk qualifier edge cases. I did not find a remaining diff-scoped correctness issue that warrants blocking.
Recovery Notes
First recoverable review artifact generated from codex analysis log.
Reviewed by ouroboros-agent[bot] via Codex deep analysis
@Q00, re-review ping. ouroboros-agent[bot] APPROVED the latest head (

What this PR does

A redo of #695 (which you previously approved as "narrows automation for regulated/destructive topics, reduces scope risk"). Adds a second-stage risky-fallback gate in

The gate fires only on generative answer routes (

Bot iteration timeline (all resolved)

The bot raised eight successive blocking concerns; each was addressed by an in-scope follow-up commit and re-verified by the next bot pass:

The bot's final approval explicitly acknowledges: "The router change is coherent: it preserves grounded REPO_FACT answers, re-routes only risky fallback sources, and the added tests cover the previously-missed regulated product-semantics paths plus the destructive-bulk qualifier edge cases."

Why it's safe to merge

Ready for your final pass.
Summary
Re-do of #695. @Q00 previously APPROVED the original PR ("narrows automation for regulated/destructive topics, reduces scope risk") but ouroboros-agent[bot] kept REQUEST_CHANGES on the risky-fallback gate logic. The follow-up commits in this branch address the bot's feedback:

drop, erase, ...).

Behavior

drop/truncate/delete-all/wipe/erase/ ...) → block generative fallback.

Test plan

pytest tests/unit/auto/test_ledger_grading_answerer.py (covers regulated-topic gate, destructive-bulk gate, scope-gating, reverse phrasing).

Refs #640 (closing criteria 2/2, paired with the provenance meta-only PR for the first half)