feat(auto): block risky-fallback answers for regulated topics#695
When a deterministic answer would land on CONSERVATIVE_DEFAULT or ASSUMPTION for a high-risk topic that has no defensible generic default, upgrade it to a BLOCKER instead of silently committing the auto Seed to a fabricated stance.

Targeted topics:
- regulated personal data (PII, GDPR, HIPAA, SOX, PCI-DSS)
- destructive bulk schema/table operations (truncate/purge tables/schemas)

Existing safe-allowlists keep working: product-feature questions about credentials/branches and the explicit `_blocker_for` patterns are unchanged. REPO_FACT/USER_GOAL-backed answers also pass through without gating.

Refs Q00#640
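For readers skimming the thread, the gate described above can be sketched roughly as follows. This is a minimal illustration, not the code in `answerer.py`: the `Answer` dataclass, the `gate` helper, the regexes, and the "destructive bulk operation" reason string are all assumptions made for the example. Only the source names and the "regulated data handling" reason appear in this PR.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AutoAnswerSource(Enum):
    REPO_FACT = "repo_fact"
    USER_GOAL = "user_goal"
    CONSERVATIVE_DEFAULT = "conservative_default"
    ASSUMPTION = "assumption"


@dataclass
class Answer:  # hypothetical stand-in for the real answer type
    text: str
    source: AutoAnswerSource
    blocker: Optional[str] = None


# Sources that represent a generic fallback rather than grounded evidence.
RISKY_FALLBACK_SOURCES = frozenset(
    {AutoAnswerSource.CONSERVATIVE_DEFAULT, AutoAnswerSource.ASSUMPTION}
)

# Hypothetical patterns; the real vocabulary lives in answerer.py.
RISKY_FALLBACK_PATTERNS = (
    (re.compile(r"\b(?:pii|gdpr|hipaa|sox|pci[- ]?dss)\b", re.I),
     "regulated data handling"),
    (re.compile(r"\b(?:truncate|purge)\b.*\b(?:tables?|schemas?)\b", re.I),
     "destructive bulk operation"),
)


def gate(question: str, answer: Answer) -> Answer:
    """Upgrade a generic fallback to a blocker when the topic is high-risk."""
    if answer.source not in RISKY_FALLBACK_SOURCES:
        return answer  # REPO_FACT / USER_GOAL answers pass through untouched
    for pattern, reason in RISKY_FALLBACK_PATTERNS:
        if pattern.search(question):
            return Answer(text="", source=answer.source, blocker=reason)
    return answer
```

Under these assumptions, a PII question answered from ASSUMPTION is blocked, while the same question answered from REPO_FACT is returned unchanged.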
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Reviewing commit 6944525 for PR #695
Review record: 91c7e9e2-a3f6-4466-b71c-f40cebb2e5c2
Blocking Findings
| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:127 | BLOCKING | The new post-routing blocker turns any CONSERVATIVE_DEFAULT/ASSUMPTION answer into a hard stop whenever the question merely mentions HIPAA, GDPR, PII, etc. Because this runs after `_is_feature_acceptance_question()` and `_is_verification_question()`, benign prompts like "What acceptance criteria should the HIPAA worker satisfy?" or "Which command output verifies the GDPR export flow?" now return BLOCKER instead of the existing feature/verification guidance. That is a behavioral regression: these questions do not ask the model to decide regulated-data handling, but the broad keyword match in `_RISKY_FALLBACK_PATTERNS` treats them as if they do. |

### Recovery Notes
First recoverable review artifact generated from codex analysis log.
Non-blocking Suggestions
None.
Design Notes
The change is directionally reasonable: blocking generic fallbacks for genuinely high-risk topics is safer than inventing defaults. The issue is that the current implementation keys off broad keywords after answer routing, so it catches safe meta-questions as well as the risky ones.
Reviewed by ouroboros-agent[bot] via Codex deep analysis
Address PR Q00#695 blocking review finding:

The post-routing gate keyed off broad keywords, so meta-questions like "What acceptance criteria should the HIPAA worker satisfy?" or "Which command output verifies the GDPR export flow?" were rejected as if they asked for regulated-data handling decisions. They actually hit the `_feature_acceptance_answer` / `_verification_answer` routes which return safe templates regardless of subject keywords.

Restructure `answer()` so the gate only fires after generative routes (actor/IO, runtime, product behavior, default). Meta-routes (non-goal listing, verification, feature acceptance) return early without going through the gate.

Add regression coverage for HIPAA/GDPR/PII acceptance and verification phrasings to ensure they keep returning CONSERVATIVE_DEFAULT answers. Existing PII/HIPAA generative-route block tests continue to pass.

Refs Q00#640
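The restructuring described in this commit can be pictured with a small sketch. Everything here is illustrative: the predicates use naive substring checks, the result is a plain dict, and `apply_gate` is a hypothetical helper. The real `answer()` has more routes and richer return types.

```python
import re
from typing import Dict, Optional

# Hypothetical keyword list standing in for the real regulated-topic patterns.
REGULATED_RE = re.compile(r"\b(?:pii|gdpr|hipaa)\b", re.I)


def is_feature_acceptance_question(question: str) -> bool:
    return "acceptance criteria" in question.lower()


def is_verification_question(question: str) -> bool:
    return "verif" in question.lower()


def apply_gate(question: str, result: Dict[str, Optional[str]]) -> Dict[str, Optional[str]]:
    """Block a generic fallback when the question touches a regulated topic."""
    if result["source"] == "CONSERVATIVE_DEFAULT" and REGULATED_RE.search(question):
        return {**result, "blocker": "regulated data handling"}
    return result


def answer(question: str) -> Dict[str, Optional[str]]:
    # Meta-routes return safe templates early and never reach the gate.
    if is_feature_acceptance_question(question):
        return {"route": "feature_acceptance", "source": "CONSERVATIVE_DEFAULT", "blocker": None}
    if is_verification_question(question):
        return {"route": "verification", "source": "CONSERVATIVE_DEFAULT", "blocker": None}
    # Generative routes (collapsed to one default route here) go through the gate.
    result = {"route": "default", "source": "CONSERVATIVE_DEFAULT", "blocker": None}
    return apply_gate(question, result)
```

With this shape, the two prompts from the review finding keep their template answers, while a question that asks how regulated data should be treated still blocks.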
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Reviewing commit 0dc7134 for PR #695
Review record: e629351a-e803-4b69-9b75-183cf6155272
Blocking Findings
| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:127 | BLOCKING | The new risky-fallback gate does not actually cover generic runtime answers. `answer()` routes "Which runtime should the HIPAA worker use?" into `_runtime_answer()`, and when no `runtime_context` fact is supplied that path returns `AutoAnswerSource.EXISTING_CONVENTION`, not one of `_RISKY_FALLBACK_SOURCES`. As a result, regulated runtime questions still get the generic “use the existing repository runtime...” answer instead of blocking, which contradicts the new policy and leaves a high-risk fallback path open. |

### Recovery Notes
First recoverable review artifact generated from codex analysis log.
Non-blocking Suggestions
None.
Design Notes
The change is directionally sound: it separates explicit hard blockers from softer generic-answer routes. The main issue is that the enforcement key (answer.source) does not align with the documented route-level policy, so one fallback class still escapes the gate.
Reviewed by ouroboros-agent[bot] via Codex deep analysis
Address PR Q00#695 follow-up: a regulated runtime question without a supplied repo_fact, e.g. "Which runtime should the HIPAA worker use?", routes through `_runtime_answer` and returns `AutoAnswerSource.EXISTING_CONVENTION` with the generic "use the existing repository runtime" template. Because EXISTING_CONVENTION was not in `_RISKY_FALLBACK_SOURCES`, that fallback escaped the gate.

Add EXISTING_CONVENTION to the risky-fallback set. REPO_FACT-backed runtime answers (full `runtime_context` supplied) remain unaffected, and the existing `does_not_block_regulated_topic_when_repo_fact_supplied` test continues to assert that REPO_FACT answers pass through.

Add a regression test that covers the bot's exact case: HIPAA runtime question with no supplied facts now blocks with reason "regulated data handling".

Refs Q00#640
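A rough sketch of the behaviour this commit targets, under simplifying assumptions: `runtime_answer` and `blocker_for_runtime_question` are hypothetical helpers, the regulated-topic check is reduced to a single keyword, and the source set is spelled out by hand. Only the enum member names and the reason string come from the commit message.

```python
from enum import Enum
from typing import Optional, Tuple


class AutoAnswerSource(Enum):
    REPO_FACT = "repo_fact"
    EXISTING_CONVENTION = "existing_convention"
    CONSERVATIVE_DEFAULT = "conservative_default"
    ASSUMPTION = "assumption"


# EXISTING_CONVENTION now counts as a risky (non-grounded) fallback.
RISKY_FALLBACK_SOURCES = frozenset({
    AutoAnswerSource.CONSERVATIVE_DEFAULT,
    AutoAnswerSource.ASSUMPTION,
    AutoAnswerSource.EXISTING_CONVENTION,
})


def runtime_answer(runtime_context: Optional[str]) -> Tuple[str, AutoAnswerSource]:
    """Return grounded text when a repo fact is supplied, else the generic template."""
    if runtime_context:
        return runtime_context, AutoAnswerSource.REPO_FACT
    return "use the existing repository runtime", AutoAnswerSource.EXISTING_CONVENTION


def blocker_for_runtime_question(question: str, runtime_context: Optional[str] = None) -> Optional[str]:
    _, source = runtime_answer(runtime_context)
    # Simplified topic check; the real gate uses regex patterns.
    if source in RISKY_FALLBACK_SOURCES and "hipaa" in question.lower():
        return "regulated data handling"
    return None
```

The point of the sketch is the asymmetry: the same question blocks without facts and passes through once a `runtime_context` value grounds the answer.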
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Reviewing commit bee8570 for PR #695
Review record: 4487f146-f834-41ad-91ef-de76a8de6793
Blocking Findings
| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:659 | BLOCKING | The new destructive-operation gate is too narrow and order-dependent. It only matches `truncate |

### Recovery Notes
First recoverable review artifact generated from codex analysis log.
Non-blocking Suggestions
None.
Design Notes
The routing change is localized and the added tests cover the intended happy paths, but the safety gate currently relies on brittle keyword regexes. For high-risk topics, that matcher needs broader verb coverage and symmetric phrasing support to be dependable.
Reviewed by ouroboros-agent[bot] via Codex deep analysis
Address PR Q00#695 follow-up: the destructive-operation matcher only caught ``verb ... noun`` phrasings such as ``purge tables``, so reverse phrasings like ``Which tables should the migration truncate?`` slipped through. The verb vocabulary was also narrow.

Expand patterns:
- Verbs: ``truncate``, ``purge``, ``wipe`` plus tense variants (``truncates``/``truncating``/``truncated`` etc.).
- Nouns: ``table(s)``, ``schema(s)``, ``database(s)``, ``index/indexes/indices``, ``migration(s)``.
- Both verb-then-noun and noun-then-verb directions matched.

Note: ``drop ... database`` remains owned by ``_blocker_for`` (its existing branch fires first), and product-feature questions are still exempted by the safe-product allowlists, so this does not over-gate benign feature semantics.

Add a regression test exercising both phrasing directions across the new verb/noun vocabulary.

Refs Q00#640
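One way to express the expanded matcher is a single regex built from a verb fragment and a noun fragment, tried in both orders. This is a sketch only: the constant names and the exact regex are assumptions, and the pattern in `answerer.py` may be structured differently.

```python
import re

# Hypothetical fragments covering the verbs and nouns listed in the commit message.
VERBS = r"(?:truncat(?:e|es|ed|ing)|purg(?:e|es|ed|ing)|wip(?:e|es|ed|ing))"
NOUNS = r"(?:tables?|schemas?|databases?|index(?:es)?|indices|migrations?)"

# Match verb-then-noun OR noun-then-verb anywhere in the question.
DESTRUCTIVE_BULK_RE = re.compile(
    rf"\b{VERBS}\b.*\b{NOUNS}\b|\b{NOUNS}\b.*\b{VERBS}\b",
    re.IGNORECASE,
)


def is_destructive_bulk_question(question: str) -> bool:
    return DESTRUCTIVE_BULK_RE.search(question) is not None
```

Both `is_destructive_bulk_question("Should we purge tables nightly?")` and `is_destructive_bulk_question("Which tables should the migration truncate?")` return `True` with this pattern. A keyword regex like this stays brittle by nature, which is the concern the review's design note raises.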
Q00 left a comment:
Reviewed across OS/UserLevel/Program boundaries, auto scope, and UX complexity. Approving: the risky-fallback gate narrows automation for regulated/destructive topics, which reduces scope risk rather than expanding it.
Review — ouroboros-agent[bot]
Verdict: REQUEST_CHANGES
Reviewing commit 007e254 for PR #695
Review record: b2b2bb66-208d-4029-8a96-385ad5a6ea68
Blocking Findings
| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:659 | BLOCKING | The new destructive-bulk fallback gate still misses common destructive verbs such as `drop` and `erase`. After this change, questions like "Which tables should the migration drop?" or "Should we erase these schemas?" will still flow through to a generic auto answer instead of blocking, even though the older `_blocker_for` policy already treated those verbs as destructive in related contexts. The added test coverage at tests/unit/auto/test_ledger_grading_answerer.py:1069 only exercises truncate/purge/wipe, so this regression path remains untested. |

### Recovery Notes
First recoverable review artifact generated from codex analysis log.
Non-blocking Suggestions
None.
Design Notes
The routing change is directionally sound: it preserves explicit repo-fact answers while preventing generic fallbacks on higher-risk topics. The main weakness is that the new regex gate is now a separate policy surface from _blocker_for, so vocabulary drift between the two lists can leave obvious holes unless they are kept aligned.
Reviewed by ouroboros-agent[bot] via Codex deep analysis
Cross-linking from #689 (closed as out-of-scope for core). Surfacing a boundary concern, not asking for an immediate decision: this PR places the regulated-topic vocabulary ( The surface — block instead of silently fabricating a
Possible reframings (any of these would address the concern without losing the safety property):
Happy to take any of those directions; flagging now so the boundary call is explicit before merge rather than litigated later.
Following up on #725 ("Design a UserLevel plugin manager for operational workflows"): recommend HOLD on this PR until #725 v0 contract lands. Reasons:
The gate (block instead of fabricate when the answerer would otherwise return Concrete suggested redesign (after #725 v0):
Happy to split this PR into the core gate + the default skill once the v0 manifest lands. Until then, holding so the contract isn't pre-committed to.
Address PR Q00#695 follow-up: phrasings like "Which tables should the migration drop?" or "Should we erase these schemas before re-seeding?" still flowed through to a generic auto answer because ``drop`` and ``erase`` were missing from `_DESTRUCTIVE_BULK_VERBS`.

`_blocker_for` already handles ``drop|delete|erase|wipe ... database`` at the explicit-authority layer, so this matcher and the existing allow/deny list both treat the same families consistently. The risky-fallback gate runs after `_blocker_for`, so explicit authority-style prompts continue to block via the original code path.

Add ``drop`` and ``erase`` (with tense variants) and extend the regression test to assert both verb-then-noun and noun-then-verb phrasings using the new vocabulary.

Refs Q00#640
Closing this PR per the boundary established in #689 / #725 v0 discussion. The intent is correct (block instead of fabricate for risky topics) and will return after #725 v0 lands as two cleaner pieces:
That split keeps the safety property (block instead of silently fabricate) without committing core to a specific regional/regulatory vocabulary. Tests in this PR ( Refs #725.
…pics (#640) (#738)

* feat(auto): block risky-fallback answers for regulated topics

When a deterministic answer would land on CONSERVATIVE_DEFAULT or ASSUMPTION for a high-risk topic that has no defensible generic default, upgrade it to a BLOCKER instead of silently committing the auto Seed to a fabricated stance.

Targeted topics:
- regulated personal data (PII, GDPR, HIPAA, SOX, PCI-DSS)
- destructive bulk schema/table operations (truncate/purge tables/schemas)

Existing safe-allowlists keep working: product-feature questions about credentials/branches and the explicit `_blocker_for` patterns are unchanged. REPO_FACT/USER_GOAL-backed answers also pass through without gating.

Refs #640

* fix(auto): scope risky-fallback gate to generative answer routes only

Address PR #695 blocking review finding:

The post-routing gate keyed off broad keywords, so meta-questions like "What acceptance criteria should the HIPAA worker satisfy?" or "Which command output verifies the GDPR export flow?" were rejected as if they asked for regulated-data handling decisions. They actually hit the `_feature_acceptance_answer` / `_verification_answer` routes which return safe templates regardless of subject keywords.

Restructure `answer()` so the gate only fires after generative routes (actor/IO, runtime, product behavior, default). Meta-routes (non-goal listing, verification, feature acceptance) return early without going through the gate.

Add regression coverage for HIPAA/GDPR/PII acceptance and verification phrasings to ensure they keep returning CONSERVATIVE_DEFAULT answers. Existing PII/HIPAA generative-route block tests continue to pass.

Refs #640

* fix(auto): include EXISTING_CONVENTION runtime fallback in risky gate

Address PR #695 follow-up: a regulated runtime question without a supplied repo_fact, e.g. "Which runtime should the HIPAA worker use?", routes through `_runtime_answer` and returns `AutoAnswerSource.EXISTING_CONVENTION` with the generic "use the existing repository runtime" template. Because EXISTING_CONVENTION was not in `_RISKY_FALLBACK_SOURCES`, that fallback escaped the gate.

Add EXISTING_CONVENTION to the risky-fallback set. REPO_FACT-backed runtime answers (full `runtime_context` supplied) remain unaffected, and the existing `does_not_block_regulated_topic_when_repo_fact_supplied` test continues to assert that REPO_FACT answers pass through.

Add a regression test that covers the bot's exact case: HIPAA runtime question with no supplied facts now blocks with reason "regulated data handling".

Refs #640

* fix(auto): broaden destructive-bulk patterns and cover reverse phrasing

Address PR #695 follow-up: the destructive-operation matcher only caught ``verb ... noun`` phrasings such as ``purge tables``, so reverse phrasings like ``Which tables should the migration truncate?`` slipped through. The verb vocabulary was also narrow.

Expand patterns:
- Verbs: ``truncate``, ``purge``, ``wipe`` plus tense variants (``truncates``/``truncating``/``truncated`` etc.).
- Nouns: ``table(s)``, ``schema(s)``, ``database(s)``, ``index/indexes/indices``, ``migration(s)``.
- Both verb-then-noun and noun-then-verb directions matched.

Note: ``drop ... database`` remains owned by ``_blocker_for`` (its existing branch fires first), and product-feature questions are still exempted by the safe-product allowlists, so this does not over-gate benign feature semantics.

Add a regression test exercising both phrasing directions across the new verb/noun vocabulary.

Refs #640

* chore: drop stray local debug artifact accidentally committed in previous fix

* fix(auto): add drop and erase to destructive-bulk verb vocabulary

Address PR #695 follow-up: phrasings like "Which tables should the migration drop?" or "Should we erase these schemas before re-seeding?"
still flowed through to a generic auto answer because ``drop`` and ``erase`` were missing from `_DESTRUCTIVE_BULK_VERBS`.

`_blocker_for` already handles ``drop|delete|erase|wipe ... database`` at the explicit-authority layer, so this matcher and the existing allow/deny list both treat the same families consistently. The risky-fallback gate runs after `_blocker_for`, so explicit authority-style prompts continue to block via the original code path.

Add ``drop`` and ``erase`` (with tense variants) and extend the regression test to assert both verb-then-noun and noun-then-verb phrasings using the new vocabulary.

Refs #640

* fix(auto): require schema/data context for destructive-bulk gate (#640)

Add _DESTRUCTIVE_BULK_NON_DATA_QUALIFIERS to exempt verb/noun pairs that appear with a non-data artefact qualifier (release plan, docs, roadmap, etc.) from the destructive-bulk blocker. Also extend _DESTRUCTIVE_BULK_NOUNS with record/row/audit-log/audit-trail strong data-object nouns.

Addresses ouroboros-agent[bot] follow-up warning on #738: bare ``migration`` + ``drop`` and ``index`` + ``drop`` questions about release plans or documentation were overblocked.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

* fix(auto): allow product-semantics questions through risky-fallback gate (#640)

Add _is_safe_product_regulated_question() allowlist that passes through bounded product-behavior questions mentioning regulated nouns (PII/GDPR/HIPAA/SOX/PCI-DSS) when paired with a product-semantics verb (export, download, display, show, view, access, …) and NOT a compliance-policy verb (store, handle, retain, collect, encrypt, …).

Questions like "Should the app export PII reports?" or "Should users be able to download GDPR exports?" are feature-level requirements and must not be blocked; questions asking how to store/handle/retain regulated data still block as before.

Addresses ouroboros-agent[bot] BLOCKING on #738 (answerer.py:716).
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

* fix(auto): route safe regulated-product questions through product-behavior answerer (#640)

Extend _is_product_behavior_question() with a new arm covering the product-semantics verbs used by _is_safe_product_regulated_question() that were not previously matched (download, allow, expose, render, enable, support) and the "be able to <verb>" phrasing gap for view/access.

Previously, questions allowed past the risky-fallback gate (e.g. "Should users be able to download GDPR exports?") fell through to _default_answer(), producing a generic conservative-MVP ledger entry that silently discarded the regulated-product feature semantics. With this fix the router at answerer.py:122 sends those questions to _product_behavior_answer(), which writes subject-specific constraints.behavior.* and acceptance.behavior.* ledger entries that preserve the requested feature in the Seed contract.

New test: test_auto_answerer_routes_safe_regulated_product_questions_to_product_behavior_answerer asserts blocker=None, source=CONSERVATIVE_DEFAULT, subject-specific ledger keys (not conservative_mvp), and regulated noun present in answer text/ledger entries for all three bot example questions.

Addresses ouroboros-agent[bot] BLOCKING on #738 (answerer.py:741).

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>

* fix(auto): phrase-scope destructive-bulk non-data qualifier (#640)

Previously the destructive-bulk exemption matched on bare tokens such as ``documentation`` or ``release plan`` anywhere in the sentence, which let real destructive operations slip past the gate when the question merely *referenced* documentation as an authority (e.g. "Which tables should we drop according to the documentation before redeploying?").
The qualifier is now strictly phrase-scoped to ``from the …`` so the exemption fires only when the artefact is the explicit object of the drop/wipe, the phrasing that signals "remove an entry from a process artefact" rather than "delete data from a system". Authority/reference phrasings ("according to the documentation", "per the release plan", "in the documentation example") no longer suppress the gate.

Existing pass-through tests still hold:
- "Which migration should we drop from the release plan?" → not blocked
- "Which indexes should we drop from the docs?" → not blocked

New regression test locks the safety boundary:
- "Which tables should we drop according to the documentation …" → BLOCKER
- "Which tables should we drop per the release plan?" → BLOCKER
- "Per the documentation, which audit logs should we purge?" → BLOCKER
- "According to the docs, which tables should we drop?" → BLOCKER

Ref: ouroboros-agent[bot] BLOCKING on PR #738 (answerer.py:688).

68 tests passing in test_ledger_grading_answerer.py (337 in tests/unit/auto). Ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* fix(auto): widen artefact qualifier and trust adjectival compliance verbs (#640)

Two BLOCKING regressions raised by ouroboros-agent[bot] on the previous fix commit (fc11788):

1) **answerer.py:698**: destructive-bulk exemption only matched ``from the …`` so safe process-artefact phrasings such as ``Which indexes should we drop in the docs?`` and ``Which migration should we drop in the roadmap?`` were still mis-blocked as data destruction. The qualifier now also accepts ``in the …`` for the same artefact list (``release plan``, ``docs``, ``documentation``, ``plan``, ``roadmap``, ``backlog``, ``changelog``, ``spec``). Authority/reference phrasings (``according to the docs``, ``per the release plan``) still do not match the qualifier and remain blocked, locked in by an expanded regression test.
2) **answerer.py:782**: ``_is_safe_product_regulated_question()`` rejected any question containing a compliance-policy verb (``store``, ``handle``, ``encrypt``, ``share``, …) anywhere in the sentence. That over-blocked legitimate product-behavior questions where the compliance verb appeared as a past-participle adjective modifying the noun, e.g.

    Should admins be able to view stored PII fields?
    Should the dashboard display encrypted HIPAA files?

In both, the main verb is product-semantics (``view`` / ``display``); the compliance verb is adjectival. The allowlist now requires (a) a regulated noun, (b) a product-question modal, and (c) a product-semantics verb. Pure compliance-policy phrasings (``How should the system handle GDPR data retention?``, ``What PII should the system collect?``) lack a product-semantics verb and remain blocked, covered by the existing ``test_auto_answerer_still_blocks_compliance_policy_regulated_questions``.

The previously-defined ``_COMPLIANCE_POLICY_VERBS_RE`` constant is now unused and removed to avoid dead code.

New regression coverage:
- ``test_auto_answerer_allows_in_the_artefact_drop_questions``: locks in ``in the docs/roadmap/release plan/changelog`` exemption.
- ``test_auto_answerer_allows_product_questions_with_adjectival_compliance_verbs``: locks in ``view stored PII``, ``display encrypted HIPAA files``, ``download retained GDPR exports``, etc.

Existing safety tests (compliance-policy questions, authority-reference phrasings) all continue to block.

339 unit tests passing in tests/unit/auto. Ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* fix(auto): block mixed-intent regulated questions via active-verb precedence (#640)

Bot follow-up on commit e846a47: the regulated-product allowlist was too permissive. Mixed-intent questions that pair a compliance-policy verb with a product-semantics verb, e.g.

    How should the system store and display HIPAA files?
    Should we retain and export PII records?
still ask the auto pipeline to decide regulated-data handling and must remain blocked, even though they also mention a product verb.

The fix adds an explicit precedence rule: an *active*-form compliance-policy verb (``store`` / ``stores`` / ``storing``, ``retain`` / ``retains`` / ``retaining``, ``encrypt``, ``handle``, ``collect``, ``share``, ``transmit``, ``disclose``, ``process``, ``manage``, ``govern``) blocks the question even if a product-semantics verb is also present.

Past-participle forms (``stored``, ``encrypted``, ``retained``, ``collected``, ``shared``, …) are intentionally excluded from the negative list because they act adjectivally on the regulated noun (``view stored PII``, ``display encrypted HIPAA files``); the main verb of those sentences is the product-semantics one and the question is product-behavior over already-existing regulated data, not a compliance-policy decision.

New regression test ``test_auto_answerer_blocks_mixed_intent_regulated_questions`` locks the precedence rule on the bot's own examples plus three more variants covering ``encrypt`` / ``share`` / ``collect``. Existing positive tests (adjectival compliance verbs, pure product semantics) and existing negative tests (pure compliance phrasings) all continue to pass.

340 unit tests passing in tests/unit/auto. Ruff clean.

Ref: ouroboros-agent[bot] BLOCKING on #738 (``answerer.py:750``).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* fix(auto): align router verb list with regulated allowlist (#640)

Bot follow-up on commit 7ed761c: ``_PRODUCT_SEMANTICS_REGULATED_VERBS_RE`` allows ``export`` and ``show`` (and the rest of the safe-allowlist set), but the explicit alignment branch in ``_is_product_behavior_question()`` only listed a subset (``download/allow/expose/render/enable/support/view/access``).
``export`` / ``show`` / ``display`` were still matched by an earlier broader pattern in the same function, but the visible alignment was incomplete and prone to silent drift.

The router branch added to bridge ``_is_safe_product_regulated_question()`` into ``_is_product_behavior_question()`` now lists every verb in the allowlist:

    export | download | render | display | show | expose | support | enable | allow | view | access

This is a no-op for already-routed verbs but makes the allowlist↔router contract explicit and grep-checkable, eliminating the drift surface flagged in the bot's design note.

Test changes: ``test_auto_answerer_routes_safe_regulated_product_questions_to_product_behavior_answerer`` now exercises every verb in the allowlist (export, show, display, render, expose, support, enable in addition to download/view/access). Each case asserts (a) the gate passes (``answer.blocker is None``), (b) the router takes the product-behavior path (``constraints.behavior.*`` and ``acceptance.behavior.*`` ledger keys, not the generic ``constraints.conservative_mvp`` from ``_default_answer()``), and (c) the regulated noun is preserved in the answer text or ledger value.

340 unit tests passing in tests/unit/auto. Ruff clean.

Ref: ouroboros-agent[bot] BLOCKING on #738 (``answerer.py:548``).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* fix(auto): route regulated-product questions before IO/runtime branches (#640)

Bot follow-up on commit 52ef5ab: ``_is_safe_product_regulated_question()`` suppressed the risky-fallback blocker for any regulated-noun + product-verb combination, but the router checked ``_is_actor_or_io_question`` and ``_is_runtime_context_question`` *before* ``_is_product_behavior_question``, so prompts like

    What inputs should the GDPR export take?
    Which runtime should the GDPR export use?
got a generic IO/runtime answer (``ASSUMPTION`` / ``EXISTING_CONVENTION``) and then bypassed the blocker via the safe-allowlist, silently dropping the regulated-feature semantics from the ledger.

Fix: pull ``_is_safe_product_regulated_question`` to the top of the content-routing chain so any regulated-product question (IO-shaped, runtime-shaped, or product-shaped) is dispatched to ``_product_behavior_answer()``. The risky-fallback gate at the tail of ``answer()`` already consults the same predicate, so the router and the safe-allowlist now share a single answer path.

Pure compliance phrasings remain blocked unchanged: they fail the allowlist (no product-semantics verb) and fall through to the previous branches, where the risky-fallback gate fires for any ``CONSERVATIVE_DEFAULT`` / ``ASSUMPTION`` / ``EXISTING_CONVENTION`` source.

New regression test ``test_auto_answerer_routes_regulated_product_questions_before_io_or_runtime`` locks in:
- Bot's example "What inputs should the GDPR export take?"
- Bot's example "Which runtime should the GDPR export use?"
- Two adjacent IO/runtime regulated-product variants

Each case asserts (a) not blocked, (b) answer comes from ``_product_behavior_answer()`` (subject-specific ``behavior.*`` ledger keys, no IO/runtime keys), and (c) the regulated noun is preserved in the answer text or ledger value.

341 unit tests passing in tests/unit/auto. Ruff clean. The two failures in tests/unit/orchestrator/test_codex_cli_runtime.py are pre-existing and unrelated to this PR's scope (verified by stashing the patch).

Ref: ouroboros-agent[bot] BLOCKING on #738 (``answerer.py:837``).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* fix(auto): preserve grounded REPO_FACT for regulated-runtime questions (#640)

Bot follow-up on commit 44e405f: the unconditional early route to ``_product_behavior_answer()`` for any question that ``_is_safe_product_regulated_question()`` recognised broke the existing runtime contract.
With a supplied ``runtime_context`` repo fact, a question like ``Which runtime should the GDPR export use?`` should return a ``REPO_FACT`` runtime answer carrying the grounded evidence; the early route replaced it with a generic product-behavior entry, dropping the evidence.

Restructure: keep the original IO/runtime/product/default order so grounded ``REPO_FACT`` answers stay on the runtime path, then re-route to ``_product_behavior_answer()`` only when the chosen route produced a non-grounded fallback (``ASSUMPTION`` / ``EXISTING_CONVENTION`` / ``CONSERVATIVE_DEFAULT``) AND the safe-allowlist recognises the question as regulated-product.

Concretely:
- ``Which runtime should the GDPR export use?`` + REPO_FACT → REPO_FACT runtime answer (preserved, with evidence).
- ``Which runtime should the GDPR export use?`` without repo facts → EXISTING_CONVENTION runtime fallback re-routed through ``_product_behavior_answer()`` so the regulated-feature semantics are preserved in ``constraints.behavior.*`` / ``acceptance.behavior.*``.
- ``What inputs should the GDPR export take?`` → IO ASSUMPTION re-routed to ``_product_behavior_answer()``.
- ``Should the app export PII reports?`` → already routes through ``_product_behavior_answer()`` (CONSERVATIVE_DEFAULT) and is left untouched by the reroute.

Pure compliance phrasings still block: they fail the allowlist (no product-semantics verb), keep their CONSERVATIVE_DEFAULT/ASSUMPTION/EXISTING_CONVENTION source, and the risky-fallback gate fires for them.

New regression test ``test_auto_answerer_preserves_repo_fact_for_regulated_runtime_question`` locks the REPO_FACT preservation contract: with a runtime_context repo fact supplied, the answer must be REPO_FACT, must contain the supplied runtime text, and must carry a runtime_context ledger entry with the supplied evidence.

342 unit tests passing in tests/unit/auto. Ruff clean.

Ref: ouroboros-agent[bot] BLOCKING on #738 (``answerer.py:126``).
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* chore: drop stray empty .ouroboros_eval_artifact.md committed by mistake

The previous commit (046ce3d) accidentally included an empty local debug artifact via ``git add -A``. Removing it; not part of the PR scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* fix(auto): tighten bare-scope and ambiguous-artefact regex (#640)

Two BLOCKING items raised by ouroboros-agent[bot] on commit a13fd6c:

(1) ``answerer.py:793``: ``_is_safe_product_regulated_question()`` allowed "compliance-scope-as-feature-flag" prompts (``Should the platform support HIPAA?``, ``Should the app enable GDPR?``, ``Should the system allow PII?``) to bypass the blocker, even though those frame the entire regulatory regime as a binary toggle and are compliance-policy decisions. Fix: a new ``_BARE_COMPLIANCE_SCOPE_RE`` rejects ``support|enable|allow`` + bare regulated noun followed by no qualifying feature noun (negative lookahead ``(?!\s+[a-z])``). Concrete-feature variants ("HIPAA audit logs", "GDPR consent banners", "PII redaction in exports", "GDPR data") have a qualifying noun and still pass through.

(2) ``answerer.py:718``: the destructive-bulk artefact qualifier listed standalone ``doc`` and ``plan`` tokens. ``from the doc`` is rare phrasing (use ``docs`` / ``documentation``) and bare ``plan`` collides with database-side meanings (query plan, execution plan, db plan), so a question like "Which tables should we drop from the plan?" was being exempted as a process-artefact edit. Fix: drop ``doc`` and ``plan`` (singular) from the artefact list. The remaining unambiguous artefacts are ``release plan``, ``docs``, ``documentation``, ``roadmap``, ``backlog``, ``changelog``, ``spec``. All existing positive tests already use these unambiguous variants.
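The bare-scope rejection from item (1) can be sketched roughly as follows. This is an illustrative reconstruction from the commit message alone, not the code in ``answerer.py``: the noun list and the helper name ``is_bare_compliance_scope`` are assumptions, and only the verb alternation and the negative lookahead come from the text above.

```python
import re

# Assumed noun list, taken from the regulated terms named in this PR.
_REGULATED_NOUN = r"(?:pii|gdpr|hipaa|sox|pci[- ]dss)"

# Matches "support|enable|allow <regulated noun>" only when no further word
# follows the noun, i.e. the whole regime is being treated as a toggle.
BARE_COMPLIANCE_SCOPE_RE = re.compile(
    rf"\b(?:support|enable|allow)\s+{_REGULATED_NOUN}\b(?!\s+[a-z])",
    re.IGNORECASE,
)


def is_bare_compliance_scope(question: str) -> bool:
    """Return True when the question toggles a bare regulatory regime."""
    return BARE_COMPLIANCE_SCOPE_RE.search(question) is not None
```

Under these assumptions, ``Should the platform support HIPAA?`` is rejected, while ``Should the app support HIPAA audit logs?`` passes because a qualifying word follows the noun. With ``re.IGNORECASE`` the ``[a-z]`` class also matches uppercase letters, so a capitalised qualifier is treated the same way.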
New regression coverage:
- ``test_auto_answerer_blocks_bare_compliance_scope_questions``: locks rejection of bare ``support|enable|allow + regulated noun`` for all five regulated-noun variants.
- ``test_auto_answerer_allows_qualified_compliance_scope_questions``: locks pass-through of ``support HIPAA audit logs`` / ``enable GDPR consent banners`` / ``allow PII redaction`` / etc.
- ``test_auto_answerer_blocks_destructive_bulk_with_ambiguous_singular_tokens``: locks blocker for ``from the plan`` / ``in the plan`` / ``from the doc`` destructive prompts.

345 unit tests passing in tests/unit/auto. Ruff clean.

Ref: ouroboros-agent[bot] BLOCKING on #738, ``answerer.py:793`` and ``answerer.py:718``.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

---------

Co-authored-by: Claude Sonnet 4.6 <[email protected]>
Summary
Add a narrow risky-fallback gate to the deterministic auto answerer so high-risk topics surface for human review instead of being silently filled with a generic default.
When the deterministic answerer would otherwise return `CONSERVATIVE_DEFAULT` or `ASSUMPTION` for a question whose topic has no defensible generic default, it now returns a `BLOCKER` with a concrete reason.

Targeted topics (intentionally narrow):
- Regulated personal data: `PII`, `personally identifiable information`, `GDPR`, `HIPAA`, `SOX`, `PCI-DSS`.
- Destructive bulk schema/table operations: `truncate` / `purge` of `table(s)` or `schema(s)`.

Excluded by design (already covered or would over-trigger):
- `_blocker_for` for the explicit verb+target pairs (deploy/release/publish to/against/on production/prod/live/external).
- `_blocker_for`.
- `drop ... database`: already covered by `_blocker_for`.

Why
This implements the fourth piece of #640's acceptance criteria:
The provenance taxonomy is in place after #646/#666, but the auto answerer was still willing to fabricate a generic answer for topics where no generic answer is safe. PR-B keeps the change deliberately narrow — only the topics where a fallback is unambiguously wrong — so the existing safe-allowlists for product-feature questions about credentials/branches keep working.
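To make the narrowness concrete, the two targeted topic families could be matched with patterns along these lines. This is a minimal sketch assuming simple keyword regexes. The names `RISKY_TOPIC_PATTERNS` and `risky_topic` are hypothetical, and the patterns in `answerer.py` may well differ.

```python
import re
from typing import Optional

# Hypothetical (reason, pattern) pairs for the two targeted topic families.
RISKY_TOPIC_PATTERNS = (
    (
        "regulated personal data",
        re.compile(
            r"\b(?:pii|personally identifiable information|gdpr|hipaa|sox|pci[- ]dss)\b",
            re.IGNORECASE,
        ),
    ),
    (
        "destructive bulk schema/table operation",
        re.compile(
            r"\b(?:truncate|purge)\b.*\b(?:tables?|schemas?)\b",
            re.IGNORECASE,
        ),
    ),
)


def risky_topic(question: str) -> Optional[str]:
    """Return the reason for the first matching risky topic, or None."""
    for reason, pattern in RISKY_TOPIC_PATTERNS:
        if pattern.search(question):
            return reason
    return None
```

A keyword match on its own is deliberately not enough to block: the result only matters when the routed answer is an ungrounded fallback and no safe-product allowlist exempts the question.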
Behavior
- `_blocker_for` runs first: explicit sensitive-authority questions still block at their original reason and message.
- When the answer's source is `CONSERVATIVE_DEFAULT` or `ASSUMPTION`, the question matches a risky-fallback pattern, and the safe-product allowlists do not exempt it, the answer is replaced with a `BLOCKER` carrying the matched reason.
- A `REPO_FACT` / `USER_GOAL` / `EXISTING_CONVENTION` / `INFERENCE` answer is never gated (callers can still ground the answer with bounded repo facts via `AutoAnswerContext`).

Out of scope
Tests
Validation
Refs #640
Independent of #693 (provenance surface). Either PR can land first.
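For reviewers, the gating order described under Behavior can be sketched as below. This is an assumption-laden illustration, not the real `answer()` implementation: `Source`, `Answer`, and `apply_risky_fallback_gate` are hypothetical names, and the three callables stand in for `_blocker_for`, the risky-fallback pattern match, and the safe-product allowlists.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Source(Enum):
    REPO_FACT = "repo_fact"
    USER_GOAL = "user_goal"
    EXISTING_CONVENTION = "existing_convention"
    INFERENCE = "inference"
    CONSERVATIVE_DEFAULT = "conservative_default"
    ASSUMPTION = "assumption"
    BLOCKER = "blocker"


# Only these ungrounded fallback sources are gated in this description.
GATED_SOURCES = frozenset({Source.CONSERVATIVE_DEFAULT, Source.ASSUMPTION})


@dataclass(frozen=True)
class Answer:
    text: str
    source: Source


def apply_risky_fallback_gate(
    question: str,
    routed: Answer,
    explicit_blocker: Callable[[str], Optional[str]],
    risky_reason: Callable[[str], Optional[str]],
    is_safe_product: Callable[[str], bool],
) -> Answer:
    # 1. Explicit sensitive-authority blockers win, with their own reason.
    #    (The routed answer is passed in precomputed to keep the sketch short.)
    reason = explicit_blocker(question)
    if reason is not None:
        return Answer(reason, Source.BLOCKER)
    # 2. Grounded or inferred answers are never gated.
    if routed.source not in GATED_SOURCES:
        return routed
    # 3. Ungrounded fallback: block only for a risky topic that no
    #    safe-product allowlist exempts.
    reason = risky_reason(question)
    if reason is None or is_safe_product(question):
        return routed
    return Answer(reason, Source.BLOCKER)
```

Passing the three checks in as callables keeps the ordering visible and lets each one be tested on its own.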