feat(auto): block risky-fallback answers for regulated topics by shaun0927 · Pull Request #695 · Q00/ouroboros

shaun0927 · 2026-05-07T04:27:27Z

Summary

Add a narrow risky-fallback gate to the deterministic auto answerer so high-risk topics surface for human review instead of being silently filled with a generic default.

When the deterministic answerer would otherwise return CONSERVATIVE_DEFAULT or ASSUMPTION for a question whose topic has no defensible generic default, it now returns a BLOCKER with a concrete reason.

Targeted topics (intentionally narrow):

Regulated personal data: PII, personally identifiable information, GDPR, HIPAA, SOX, PCI-DSS.
Destructive bulk schema/table operations: truncate/purge of table(s) or schema(s).

Excluded by design (already covered or would over-trigger):

Production deployment authority — already covered by _blocker_for for the explicit verb+target pairs (deploy/release/publish to/against/on production/prod/live/external).
Real-money authority for credit cards / billing accounts — already covered by _blocker_for.
Generic 'destructive' keywords like drop ... database — already covered by _blocker_for.

Why

This implements the fourth piece of #640's acceptance criteria:

block or ask the user when a required answer would otherwise be a risky fallback

The provenance taxonomy is in place after #646/#666, but the auto answerer was still willing to fabricate a generic answer for topics where no generic answer is safe. PR-B keeps the change deliberately narrow — only the topics where a fallback is unambiguously wrong — so the existing safe-allowlists for product-feature questions about credentials/branches keep working.

Behavior

_blocker_for runs first — explicit sensitive-authority questions still block at their original reason and message.
Deterministic routing then picks a category answerer (verification / acceptance / actor-IO / runtime / product behavior / default).
After routing, if the answer source is CONSERVATIVE_DEFAULT or ASSUMPTION and the question matches a risky-fallback pattern and the safe-product allowlists do not exempt it, the answer is replaced with a BLOCKER carrying the matched reason.
A REPO_FACT / USER_GOAL / EXISTING_CONVENTION / INFERENCE answer is never gated (callers can still ground the answer with bounded repo facts via AutoAnswerContext).
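
A minimal sketch of the gate described above, not the code in `answerer.py`. The names `apply_risky_fallback_gate`, `RISKY_FALLBACK_SOURCES`, `RISKY_FALLBACK_PATTERNS`, the regexes, the `"destructive bulk operation"` reason string, and modelling BLOCKER as a source value are all assumptions for illustration. The safe-product allowlist check is omitted.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AutoAnswerSource(Enum):
    REPO_FACT = "repo_fact"
    USER_GOAL = "user_goal"
    EXISTING_CONVENTION = "existing_convention"
    INFERENCE = "inference"
    CONSERVATIVE_DEFAULT = "conservative_default"
    ASSUMPTION = "assumption"
    BLOCKER = "blocker"  # assumption: modelled here as a source value


@dataclass(frozen=True)
class AutoAnswer:
    text: str
    source: AutoAnswerSource
    blocker: Optional[str] = None  # matched reason when the answer is blocked


# Sources the PR description treats as ungrounded fallbacks.
RISKY_FALLBACK_SOURCES = frozenset(
    {AutoAnswerSource.CONSERVATIVE_DEFAULT, AutoAnswerSource.ASSUMPTION}
)

# Hypothetical patterns; the real ones are not shown in this thread.
RISKY_FALLBACK_PATTERNS = (
    (
        re.compile(
            r"\b(?:pii|personally identifiable information|gdpr|hipaa|sox|pci-dss)\b",
            re.IGNORECASE,
        ),
        "regulated data handling",
    ),
    (
        re.compile(r"\b(?:truncate|purge)\b.*\b(?:tables?|schemas?)\b", re.IGNORECASE),
        "destructive bulk operation",
    ),
)


def apply_risky_fallback_gate(question: str, answer: AutoAnswer) -> AutoAnswer:
    """Replace an ungrounded fallback with a BLOCKER when the topic is risky."""
    if answer.source not in RISKY_FALLBACK_SOURCES:
        return answer  # grounded answers are never gated
    for pattern, reason in RISKY_FALLBACK_PATTERNS:
        if pattern.search(question):
            return AutoAnswer(
                text=f"Needs human review: {reason}.",
                source=AutoAnswerSource.BLOCKER,
                blocker=reason,
            )
    return answer
```

In this sketch a REPO_FACT answer to a PII question is returned unchanged, while a CONSERVATIVE_DEFAULT answer to the same question comes back as a BLOCKER carrying the matched reason.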

Out of scope

Selected-driver post-response risk classification (handled by #675 "Add post-response risk and confidence tagging for selected-driver auto answers", #682 "feat(auto): add selected-driver answer metadata", and #683 "feat(auto): classify selected-driver answer risk").
Expanding the regulated-topic vocabulary beyond PII/GDPR/HIPAA/SOX/PCI-DSS (can grow incrementally).
Adding a manual-approval flow as the default path (kept off the autopilot critical path).

Tests

New `test_auto_answerer_blocks_regulated_data_questions_instead_of_falling_back` — PII / HIPAA / 'purge tables' all block with their matched reason.
New `test_auto_answerer_does_not_block_regulated_topic_when_repo_fact_supplied` — a HIPAA-adjacent runtime question with supplied repo facts still returns `REPO_FACT`.
New `test_auto_answerer_skips_risky_fallback_for_safe_product_credential_questions` — feature questions about credentials remain unblocked through the existing safe-product allowlists.
Existing `test_auto_answerer_allows_safe_production_and_project_feature_questions` and the production-credential authority test continue to pass.

Validation

`UV_CACHE_DIR=/tmp/uv-cache uv run pytest tests/unit/auto/ tests/unit/mcp/ -q` → 998 passed
`UV_CACHE_DIR=/tmp/uv-cache uv run ruff check src/ouroboros/auto/answerer.py tests/unit/auto/test_ledger_grading_answerer.py` → passed
`UV_CACHE_DIR=/tmp/uv-cache uv run ruff format --check` (same files) → passed

Refs #640
Independent of #693 (provenance surface). Either PR can land first.

When a deterministic answer would land on CONSERVATIVE_DEFAULT or ASSUMPTION for a high-risk topic that has no defensible generic default, upgrade it to a BLOCKER instead of silently committing the auto Seed to a fabricated stance.

Targeted topics:
- regulated personal data (PII, GDPR, HIPAA, SOX, PCI-DSS)
- destructive bulk schema/table operations (truncate/purge tables/schemas)

Existing safe-allowlists keep working: product-feature questions about credentials/branches and the explicit `_blocker_for` patterns are unchanged. REPO_FACT/USER_GOAL-backed answers also pass through without gating.

Refs Q00#640

ouroboros-agent

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Reviewing commit 6944525 for PR #695

Review record: 91c7e9e2-a3f6-4466-b71c-f40cebb2e5c2

Blocking Findings

| # | Location | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:127 | BLOCKING | The new post-routing blocker turns any CONSERVATIVE_DEFAULT/ASSUMPTION answer into a hard stop whenever the question merely mentions HIPAA, GDPR, PII, etc. Because this runs after _is_feature_acceptance_question() and _is_verification_question(), benign prompts like What acceptance criteria should the HIPAA worker satisfy? or Which command output verifies the GDPR export flow? now return BLOCKER instead of the existing feature/verification guidance. That is a behavioral regression: these questions do not ask the model to decide regulated-data handling, but the broad keyword match in _RISKY_FALLBACK_PATTERNS treats them as if they do. |

Non-blocking Suggestions

None.

Design Notes

The change is directionally reasonable: blocking generic fallbacks for genuinely high-risk topics is safer than inventing defaults. The issue is that the current implementation keys off broad keywords after answer routing, so it catches safe meta-questions as well as the risky ones.

Reviewed by ouroboros-agent[bot] via Codex deep analysis

Address PR Q00#695 blocking review finding:

The post-routing gate keyed off broad keywords, so meta-questions like "What acceptance criteria should the HIPAA worker satisfy?" or "Which command output verifies the GDPR export flow?" were rejected as if they asked for regulated-data handling decisions. They actually hit the `_feature_acceptance_answer` / `_verification_answer` routes which return safe templates regardless of subject keywords.

Restructure `answer()` so the gate only fires after generative routes (actor/IO, runtime, product behavior, default). Meta-routes (non-goal listing, verification, feature acceptance) return early without going through the gate.

Add regression coverage for HIPAA/GDPR/PII acceptance and verification phrasings to ensure they keep returning CONSERVATIVE_DEFAULT answers. Existing PII/HIPAA generative-route block tests continue to pass.

Refs Q00#640
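
The restructuring in this commit can be sketched roughly as follows. This is an illustration under assumptions, not the real `answer()`: the route predicates, answer texts, and the single "default" generative route are hypothetical stand-ins, and only the early-return ordering is the point.

```python
import re
from typing import NamedTuple, Optional


class Answer(NamedTuple):
    text: str
    source: str
    blocker: Optional[str] = None


# Hypothetical keyword pattern standing in for the regulated-topic matcher.
REGULATED = re.compile(r"\b(?:pii|gdpr|hipaa|sox|pci-dss)\b", re.IGNORECASE)
VERIFICATION = re.compile(r"\bverif(?:y|ies|ied)\b|\bcommand output\b", re.IGNORECASE)


def is_feature_acceptance_question(question: str) -> bool:
    return "acceptance criteria" in question.lower()


def is_verification_question(question: str) -> bool:
    return bool(VERIFICATION.search(question))


def answer(question: str) -> Answer:
    # Meta-routes return safe templates and exit before the gate is reached.
    if is_feature_acceptance_question(question):
        return Answer("List observable acceptance checks.", "CONSERVATIVE_DEFAULT")
    if is_verification_question(question):
        return Answer("Run the project test command.", "CONSERVATIVE_DEFAULT")

    # Generative route (only the default route is shown here).
    candidate = Answer("Assume the smallest conservative behavior.", "CONSERVATIVE_DEFAULT")

    # The gate applies only to answers produced by generative routes.
    if candidate.source in {"CONSERVATIVE_DEFAULT", "ASSUMPTION"} and REGULATED.search(question):
        return Answer("Needs human review.", "BLOCKER", "regulated data handling")
    return candidate
```

With this ordering, "What acceptance criteria should the HIPAA worker satisfy?" keeps its template answer, while a generative question that mentions HIPAA still blocks.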

Q00

Reviewed. The regulated-topic risky fallback gate is a policy boundary, so it needs fully green tests. Checks are still running, so I cannot approve yet.

ouroboros-agent

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Reviewing commit 0dc7134 for PR #695

Review record: e629351a-e803-4b69-9b75-183cf6155272

Blocking Findings

| # | Location | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:127 | BLOCKING | The new risky-fallback gate does not actually cover generic runtime answers. answer() routes Which runtime should the HIPAA worker use? into _runtime_answer(), and when no runtime_context fact is supplied that path returns AutoAnswerSource.EXISTING_CONVENTION, not one of _RISKY_FALLBACK_SOURCES. As a result, regulated runtime questions still get the generic “use the existing repository runtime...” answer instead of blocking, which contradicts the new policy and leaves a high-risk fallback path open. |

Non-blocking Suggestions

None.

Design Notes

The change is directionally sound: it separates explicit hard blockers from softer generic-answer routes. The main issue is that the enforcement key (answer.source) does not align with the documented route-level policy, so one fallback class still escapes the gate.

Reviewed by ouroboros-agent[bot] via Codex deep analysis

Address PR Q00#695 follow-up: a regulated runtime question without a supplied repo_fact, e.g. "Which runtime should the HIPAA worker use?", routes through `_runtime_answer` and returns `AutoAnswerSource.EXISTING_CONVENTION` with the generic "use the existing repository runtime" template. Because EXISTING_CONVENTION was not in `_RISKY_FALLBACK_SOURCES`, that fallback escaped the gate.

Add EXISTING_CONVENTION to the risky-fallback set. REPO_FACT-backed runtime answers (full `runtime_context` supplied) remain unaffected, and the existing `does_not_block_regulated_topic_when_repo_fact_supplied` test continues to assert that REPO_FACT answers pass through.

Add a regression test that covers the bot's exact case: HIPAA runtime question with no supplied facts now blocks with reason "regulated data handling".

Refs Q00#640
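
To make the escape path concrete, here is a small sketch comparing the risky-source set before and after this commit. `runtime_answer_source` and `is_gated` are hypothetical helpers, and the enum is a trimmed stand-in for `AutoAnswerSource`.

```python
from enum import Enum, auto
from typing import FrozenSet, Optional


class AutoAnswerSource(Enum):
    REPO_FACT = auto()
    EXISTING_CONVENTION = auto()
    CONSERVATIVE_DEFAULT = auto()
    ASSUMPTION = auto()


RISKY_BEFORE = frozenset(
    {AutoAnswerSource.CONSERVATIVE_DEFAULT, AutoAnswerSource.ASSUMPTION}
)
# After the fix, the generic runtime template also counts as a fallback.
RISKY_AFTER = RISKY_BEFORE | {AutoAnswerSource.EXISTING_CONVENTION}


def runtime_answer_source(runtime_context: Optional[str]) -> AutoAnswerSource:
    """Assumed behavior: a supplied runtime fact grounds the answer."""
    if runtime_context:
        return AutoAnswerSource.REPO_FACT
    return AutoAnswerSource.EXISTING_CONVENTION


def is_gated(
    source: AutoAnswerSource,
    risky_sources: FrozenSet[AutoAnswerSource],
    question_is_regulated: bool,
) -> bool:
    return question_is_regulated and source in risky_sources
```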

ouroboros-agent

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Reviewing commit bee8570 for PR #695

Review record: 4487f146-f834-41ad-91ef-de76a8de6793

Blocking Findings

| # | Location | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:659 | BLOCKING | The new destructive-operation gate is too narrow and order-dependent. It only matches `truncate |

Non-blocking Suggestions

None.

Design Notes

The routing change is localized and the added tests cover the intended happy paths, but the safety gate currently relies on brittle keyword regexes. For high-risk topics, that matcher needs broader verb coverage and symmetric phrasing support to be dependable.

Reviewed by ouroboros-agent[bot] via Codex deep analysis

Address PR Q00#695 follow-up: the destructive-operation matcher only caught ``verb ... noun`` phrasings such as ``purge tables``, so reverse phrasings like ``Which tables should the migration truncate?`` slipped through. The verb vocabulary was also narrow.

Expand patterns:
- Verbs: ``truncate``, ``purge``, ``wipe`` plus tense variants (``truncates``/``truncating``/``truncated`` etc.).
- Nouns: ``table(s)``, ``schema(s)``, ``database(s)``, ``index/indexes/indices``, ``migration(s)``.
- Both verb-then-noun and noun-then-verb directions matched.

Note: ``drop ... database`` remains owned by ``_blocker_for`` (its existing branch fires first), and product-feature questions are still exempted by the safe-product allowlists, so this does not over-gate benign feature semantics.

Add a regression test exercising both phrasing directions across the new verb/noun vocabulary.

Refs Q00#640
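
One way to express the expanded matcher is a single regex with both orderings as alternatives. This is a sketch only: `DESTRUCTIVE_BULK_RE` and `is_destructive_bulk` are hypothetical names, and the stem-plus-suffix encoding of tense variants is an assumption about how the vocabulary could be written.

```python
import re

# Verb stems plus the tense suffixes named in the commit message
# (truncate/truncates/truncated/truncating, and likewise for purge, wipe).
VERBS = r"(?:truncat|purg|wip)(?:e|es|ed|ing)"
NOUNS = r"(?:tables?|schemas?|databases?|index(?:es)?|indices|migrations?)"

# Match verb-then-noun or noun-then-verb anywhere on one line.
DESTRUCTIVE_BULK_RE = re.compile(
    rf"\b{VERBS}\b.*\b{NOUNS}\b|\b{NOUNS}\b.*\b{VERBS}\b",
    re.IGNORECASE,
)


def is_destructive_bulk(question: str) -> bool:
    return bool(DESTRUCTIVE_BULK_RE.search(question))
```

A question needs both a verb and a noun from the lists to match, so a sentence that only mentions a table does not trigger the gate in this sketch.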


Q00

Reviewed across OS/UserLevel/Program boundaries, auto scope, and UX complexity. Approving: the risky-fallback gate narrows automation for regulated/destructive topics, which reduces scope risk rather than expanding it.

ouroboros-agent

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Reviewing commit 007e254 for PR #695

Review record: b2b2bb66-208d-4029-8a96-385ad5a6ea68

Blocking Findings

| # | Location | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:659 | BLOCKING | The new destructive-bulk fallback gate still misses common destructive verbs such as drop and erase. After this change, questions like Which tables should the migration drop? or Should we erase these schemas? will still flow through to a generic auto answer instead of blocking, even though the older _blocker_for policy already treated those verbs as destructive in related contexts. The added test coverage at tests/unit/auto/test_ledger_grading_answerer.py:1069 only exercises truncate/purge/wipe, so this regression path remains untested. |

Non-blocking Suggestions

None.

Design Notes

The routing change is directionally sound: it preserves explicit repo-fact answers while preventing generic fallbacks on higher-risk topics. The main weakness is that the new regex gate is now a separate policy surface from _blocker_for, so vocabulary drift between the two lists can leave obvious holes unless they are kept aligned.

Reviewed by ouroboros-agent[bot] via Codex deep analysis

shaun0927 · 2026-05-07T06:33:51Z

Cross-linking from #689 (closed as out-of-scope for core).

Surfacing a boundary concern, not asking for an immediate decision: this PR places the regulated-topic vocabulary (PII / GDPR / HIPAA / SOX / PCI-DSS) and destructive bulk-schema verbs (truncate / purge of tables / schemas) directly inside src/ouroboros/auto/answerer.py.

The surface — block instead of silently fabricating a CONSERVATIVE_DEFAULT for risky topics — fits core (safety boundary). What I am less sure about is the vocabulary living in core:

The set is team- and jurisdiction-specific (e.g. some teams care about KYC/AML; others care about FERPA; others care about export-control). A core PR that hard-codes one regional/regulatory subset will keep accreting siblings.
It is exactly the same pattern for which @Q00 closed the #689 PRs ("Add direct operational-task path for concrete PR and merge goals in ooo auto"): a typed classifier with a hard-coded vocabulary in core, instead of "stable primitive + policy lookup."

Possible reframings (any of these would address the concern without losing the safety property):

Keep the gate in core, move the vocabulary to a config/policy file the gate consults. Default config can ship the current PII/GDPR/HIPAA/SOX/PCI-DSS list so behavior is unchanged; teams can extend without patching core.
Or: keep core's job as "any CONSERVATIVE_DEFAULT / ASSUMPTION for an answer marked sensitive becomes a BLOCKER" and let an external classifier decide what is sensitive. Core then has zero domain words.
Or: explicitly accept this as a narrow temporary primitive (with a comment + a tracker for plugin-extraction) and freeze the vocabulary at exactly the current 5 + bulk-schema verbs.
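
As a rough illustration of the first reframing (gate in core, vocabulary in a policy file), the sketch below loads the term list from JSON. The config shape, `RiskPolicy`, `load_policy`, and `matched_reason` are all hypothetical; nothing in this thread specifies a config format.

```python
import json
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical default policy carrying the current vocabulary.
DEFAULT_POLICY_JSON = """
{
  "regulated_terms": ["PII", "GDPR", "HIPAA", "SOX", "PCI-DSS"],
  "reason": "regulated data handling"
}
"""


@dataclass(frozen=True)
class RiskPolicy:
    pattern: re.Pattern
    reason: str


def load_policy(raw_json: str) -> RiskPolicy:
    """Compile a term list from config into one case-insensitive pattern."""
    data = json.loads(raw_json)
    # Longest terms first so longer alternatives win over shared prefixes.
    terms = sorted(data["regulated_terms"], key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in terms)
    return RiskPolicy(
        pattern=re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE),
        reason=data["reason"],
    )


def matched_reason(policy: RiskPolicy, question: str) -> Optional[str]:
    return policy.reason if policy.pattern.search(question) else None
```

A team that cares about FERPA or KYC would then ship a different JSON document rather than patch core, which is the property the reframing is after.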

Happy to take any of those directions — flagging now so the boundary call is explicit before merge rather than litigated later.

shaun0927 · 2026-05-07T06:42:07Z

Following up on #725 ("Design a UserLevel plugin manager for operational workflows"): recommend HOLD on this PR until #725 v0 contract lands.

Reasons:

The current diff places regulated-topic vocabulary (PII / GDPR / HIPAA / SOX / PCI-DSS) and destructive bulk-schema verbs directly inside src/ouroboros/auto/answerer.py. Per the #725 boundary, vocabulary belongs in a UserLevel skill, not in core auto.
Merging now would create the same anti-pattern that got #689 closed: core slowly accumulates domain words. After #725 v0 (RiskAssessor protocol + skill-owned config), this PR's intent can land as (a) a one-liner core change introducing RiskAssessor, plus (b) a default-shipping skill that owns the current vocabulary and can be replaced/extended without patching core.

The gate (block instead of fabricate when the answerer would otherwise return CONSERVATIVE_DEFAULT for a topic with no defensible default) is correct and should land in core. Only the vocabulary list and matching logic move out.

Concrete suggested redesign (after #725 v0):

core: RiskAssessor protocol with one method assess(question, candidate_answer) -> RiskVerdict.
core: existing _blocker_for keeps its current behavior; new gate runs RiskAssessor after the deterministic answerer returns and converts a non-empty verdict to a BLOCKER.
skill: regulated-topics-default ships with the same PII/GDPR/HIPAA/SOX/PCI-DSS list as today — behavior unchanged for users who install nothing extra.
users: can install a different RiskAssessor-providing skill (e.g. regulated-topics-fintech, regulated-topics-healthcare) without forking core.
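
A possible shape for that split, sketched under assumptions: the `assess(question, candidate_answer) -> RiskVerdict` signature comes from the proposal above, but the `RiskVerdict` fields, the `RegulatedTopicsDefault` class, and the `gate` helper are hypothetical, and the #725 v0 contract may look different.

```python
import re
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class RiskVerdict:
    reason: Optional[str] = None  # None means "no risk found"


class RiskAssessor(Protocol):
    """Core-side contract: core knows the method, not the vocabulary."""

    def assess(self, question: str, candidate_answer: str) -> RiskVerdict: ...


class RegulatedTopicsDefault:
    """Stand-in for the regulated-topics-default skill (hypothetical)."""

    _pattern = re.compile(r"\b(?:pii|gdpr|hipaa|sox|pci-dss)\b", re.IGNORECASE)

    def assess(self, question: str, candidate_answer: str) -> RiskVerdict:
        if self._pattern.search(question):
            return RiskVerdict(reason="regulated data handling")
        return RiskVerdict()


def gate(question: str, candidate_answer: str, assessor: RiskAssessor) -> str:
    """Convert a non-empty verdict into a blocker; otherwise pass through."""
    verdict = assessor.assess(question, candidate_answer)
    if verdict.reason is not None:
        return f"BLOCKER: {verdict.reason}"
    return candidate_answer
```

Because `RiskAssessor` is a structural protocol, a skill only has to provide a matching `assess` method; it does not need to import or subclass anything from core.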

Happy to split this PR into the core gate + the default skill once the v0 manifest lands. Until then, holding so the contract isn't pre-committed to.

Address PR Q00#695 follow-up: phrasings like "Which tables should the migration drop?" or "Should we erase these schemas before re-seeding?" still flowed through to a generic auto answer because ``drop`` and ``erase`` were missing from `_DESTRUCTIVE_BULK_VERBS`.

`_blocker_for` already handles ``drop|delete|erase|wipe ... database`` at the explicit-authority layer, so this matcher and the existing allow/deny list both treat the same families consistently. The risky-fallback gate runs after `_blocker_for`, so explicit authority-style prompts continue to block via the original code path.

Add ``drop`` and ``erase`` (with tense variants) and extend the regression test to assert both verb-then-noun and noun-then-verb phrasings using the new vocabulary.

Refs Q00#640

shaun0927 · 2026-05-07T06:43:43Z

Closing this PR per the boundary established in #689 / #725 v0 discussion.

The intent is correct (block instead of fabricate for risky topics) and will return after #725 v0 lands as two cleaner pieces:

core RiskAssessor protocol + the gate semantics (small, generic),
a default-shipping regulated-topics-default skill carrying the current PII / GDPR / HIPAA / SOX / PCI-DSS vocabulary.

That split keeps the safety property (block instead of silently fabricate) without committing core to a specific regional/regulatory vocabulary.

Tests in this PR (test_auto_answerer_blocks_regulated_data_questions_instead_of_falling_back, ..._does_not_block_regulated_topic_when_repo_fact_supplied, ..._skips_risky_fallback_for_safe_product_credential_questions) will be ported to the post-v0 implementation.

Refs #725.

…pics (#640) (#738) * feat(auto): block risky-fallback answers for regulated topics When a deterministic answer would land on CONSERVATIVE_DEFAULT or ASSUMPTION for a high-risk topic that has no defensible generic default, upgrade it to a BLOCKER instead of silently committing the auto Seed to a fabricated stance. Targeted topics: - regulated personal data (PII, GDPR, HIPAA, SOX, PCI-DSS) - destructive bulk schema/table operations (truncate/purge tables/schemas) Existing safe-allowlists keep working: product-feature questions about credentials/branches and the explicit `_blocker_for` patterns are unchanged. REPO_FACT/USER_GOAL-backed answers also pass through without gating. Refs #640 * fix(auto): scope risky-fallback gate to generative answer routes only Address PR #695 blocking review finding: The post-routing gate keyed off broad keywords, so meta-questions like "What acceptance criteria should the HIPAA worker satisfy?" or "Which command output verifies the GDPR export flow?" were rejected as if they asked for regulated-data handling decisions. They actually hit the `_feature_acceptance_answer` / `_verification_answer` routes which return safe templates regardless of subject keywords. Restructure `answer()` so the gate only fires after generative routes (actor/IO, runtime, product behavior, default). Meta-routes (non-goal listing, verification, feature acceptance) return early without going through the gate. Add regression coverage for HIPAA/GDPR/PII acceptance and verification phrasings to ensure they keep returning CONSERVATIVE_DEFAULT answers. Existing PII/HIPAA generative-route block tests continue to pass. Refs #640 * fix(auto): include EXISTING_CONVENTION runtime fallback in risky gate Address PR #695 follow-up: a regulated runtime question without a supplied repo_fact, e.g. 
"Which runtime should the HIPAA worker use?", routes through `_runtime_answer` and returns `AutoAnswerSource.EXISTING_CONVENTION` with the generic "use the existing repository runtime" template. Because EXISTING_CONVENTION was not in `_RISKY_FALLBACK_SOURCES`, that fallback escaped the gate. Add EXISTING_CONVENTION to the risky-fallback set. REPO_FACT-backed runtime answers (full `runtime_context` supplied) remain unaffected, and the existing `does_not_block_regulated_topic_when_repo_fact_supplied` test continues to assert that REPO_FACT answers pass through. Add a regression test that covers the bot's exact case: HIPAA runtime question with no supplied facts now blocks with reason "regulated data handling". Refs #640 * fix(auto): broaden destructive-bulk patterns and cover reverse phrasing Address PR #695 follow-up: the destructive-operation matcher only caught ``verb ... noun`` phrasings such as ``purge tables``, so reverse phrasings like ``Which tables should the migration truncate?`` slipped through. The verb vocabulary was also narrow. Expand patterns: - Verbs: ``truncate``, ``purge``, ``wipe`` plus tense variants (``truncates``/``truncating``/``truncated`` etc.). - Nouns: ``table(s)``, ``schema(s)``, ``database(s)``, ``index/indexes/indices``, ``migration(s)``. - Both verb-then-noun and noun-then-verb directions matched. Note: ``drop ... database`` remains owned by ``_blocker_for`` (its existing branch fires first), and product-feature questions are still exempted by the safe-product allowlists, so this does not over-gate benign feature semantics. Add a regression test exercising both phrasing directions across the new verb/noun vocabulary. Refs #640 * chore: drop stray local debug artifact accidentally committed in previous fix * fix(auto): add drop and erase to destructive-bulk verb vocabulary Address PR #695 follow-up: phrasings like "Which tables should the migration drop?" or "Should we erase these schemas before re-seeding?" 
still flowed through to a generic auto answer because ``drop`` and ``erase`` were missing from `_DESTRUCTIVE_BULK_VERBS`. `_blocker_for` already handles ``drop|delete|erase|wipe ... database`` at the explicit-authority layer, so this matcher and the existing allow/deny list both treat the same families consistently. The risky-fallback gate runs after `_blocker_for`, so explicit authority-style prompts continue to block via the original code path. Add ``drop`` and ``erase`` (with tense variants) and extend the regression test to assert both verb-then-noun and noun-then-verb phrasings using the new vocabulary. Refs #640 * fix(auto): require schema/data context for destructive-bulk gate (#640) Add _DESTRUCTIVE_BULK_NON_DATA_QUALIFIERS to exempt verb/noun pairs that appear with a non-data artefact qualifier (release plan, docs, roadmap, etc.) from the destructive-bulk blocker. Also extend _DESTRUCTIVE_BULK_NOUNS with record/row/audit-log/audit-trail strong data-object nouns. Addresses ouroboros-agent[bot] follow-up warning on #738: bare ``migration`` + ``drop`` and ``index`` + ``drop`` questions about release plans or documentation were overblocked. Co-Authored-By: Claude Sonnet 4.6 <[email protected]> * fix(auto): allow product-semantics questions through risky-fallback gate (#640) Add _is_safe_product_regulated_question() allowlist that passes through bounded product-behavior questions mentioning regulated nouns (PII/GDPR/ HIPAA/SOX/PCI-DSS) when paired with a product-semantics verb (export, download, display, show, view, access, …) and NOT a compliance-policy verb (store, handle, retain, collect, encrypt, …). Questions like "Should the app export PII reports?" or "Should users be able to download GDPR exports?" are feature-level requirements and must not be blocked; questions asking how to store/handle/retain regulated data still block as before. Addresses ouroboros-agent[bot] BLOCKING on #738 (answerer.py:716). 
Co-Authored-By: Claude Sonnet 4.6 <[email protected]> * fix(auto): route safe regulated-product questions through product-behavior answerer (#640) Extend _is_product_behavior_question() with a new arm covering the product-semantics verbs used by _is_safe_product_regulated_question() that were not previously matched (download, allow, expose, render, enable, support) and the "be able to <verb>" phrasing gap for view/access. Previously, questions allowed past the risky-fallback gate (e.g. "Should users be able to download GDPR exports?") fell through to _default_answer(), producing a generic conservative-MVP ledger entry that silently discarded the regulated-product feature semantics. With this fix the router at answerer.py:122 sends those questions to _product_behavior_answer(), which writes subject-specific constraints.behavior.* and acceptance.behavior.* ledger entries that preserve the requested feature in the Seed contract. New test: test_auto_answerer_routes_safe_regulated_product_questions_to_product_behavior_answerer asserts blocker=None, source=CONSERVATIVE_DEFAULT, subject-specific ledger keys (not conservative_mvp), and regulated noun present in answer text/ledger entries for all three bot example questions. Addresses ouroboros-agent[bot] BLOCKING on #738 (answerer.py:741). Co-Authored-By: Claude Sonnet 4.6 <[email protected]> * fix(auto): phrase-scope destructive-bulk non-data qualifier (#640) Previously the destructive-bulk exemption matched on bare tokens such as ``documentation`` or ``release plan`` anywhere in the sentence, which let real destructive operations slip past the gate when the question merely *referenced* documentation as an authority (e.g. "Which tables should we drop according to the documentation before redeploying?"). 
The qualifier is now strictly phrase-scoped to ``from the …`` so the exemption fires only when the artefact is the explicit object of the drop/wipe — the phrasing that signals "remove an entry from a process artefact" rather than "delete data from a system". Authority/reference phrasings ("according to the documentation", "per the release plan", "in the documentation example") no longer suppress the gate. Existing pass-through tests still hold: - "Which migration should we drop from the release plan?" → not blocked - "Which indexes should we drop from the docs?" → not blocked New regression test locks the safety boundary: - "Which tables should we drop according to the documentation …" → BLOCKER - "Which tables should we drop per the release plan?" → BLOCKER - "Per the documentation, which audit logs should we purge?" → BLOCKER - "According to the docs, which tables should we drop?" → BLOCKER Ref: ouroboros-agent[bot] BLOCKING on PR #738 — answerer.py:688. 68 tests passing in test_ledger_grading_answerer.py (337 in tests/unit/auto). Ruff clean. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> * fix(auto): widen artefact qualifier and trust adjectival compliance verbs (#640) Two BLOCKING regressions raised by ouroboros-agent[bot] on the previous fix commit (fc11788): 1) **answerer.py:698** — destructive-bulk exemption only matched ``from the …`` so safe process-artefact phrasings such as ``Which indexes should we drop in the docs?`` and ``Which migration should we drop in the roadmap?`` were still mis-blocked as data destruction. The qualifier now also accepts ``in the …`` for the same artefact list (``release plan``, ``docs``, ``documentation``, ``plan``, ``roadmap``, ``backlog``, ``changelog``, ``spec``). Authority/reference phrasings (``according to the docs``, ``per the release plan``) still do not match the qualifier and remain blocked, locked in by an expanded regression test. 
2) **answerer.py:782**: ``_is_safe_product_regulated_question()`` rejected any question containing a compliance-policy verb (``store``, ``handle``, ``encrypt``, ``share``, …) anywhere in the sentence. That over-blocked legitimate product-behavior questions where the compliance verb appeared as a past-participle adjective modifying the noun, e.g.

    Should admins be able to view stored PII fields?
    Should the dashboard display encrypted HIPAA files?

In both, the main verb is product-semantics (``view`` / ``display``); the compliance verb is adjectival. The allowlist now requires (a) a regulated noun, (b) a product-question modal, and (c) a product-semantics verb. Pure compliance-policy phrasings (``How should the system handle GDPR data retention?``, ``What PII should the system collect?``) lack a product-semantics verb and remain blocked, as covered by the existing ``test_auto_answerer_still_blocks_compliance_policy_regulated_questions``. The previously-defined ``_COMPLIANCE_POLICY_VERBS_RE`` constant is now unused and removed to avoid dead code.

New regression coverage:
- ``test_auto_answerer_allows_in_the_artefact_drop_questions``: locks in the ``in the docs/roadmap/release plan/changelog`` exemption.
- ``test_auto_answerer_allows_product_questions_with_adjectival_compliance_verbs``: locks in ``view stored PII``, ``display encrypted HIPAA files``, ``download retained GDPR exports``, etc.

Existing safety tests (compliance-policy questions, authority-reference phrasings) all continue to block.

339 unit tests passing in tests/unit/auto. Ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* fix(auto): block mixed-intent regulated questions via active-verb precedence (#640)

Bot follow-up on commit e846a47: the regulated-product allowlist was too permissive. Mixed-intent questions that pair a compliance-policy verb with a product-semantics verb, e.g.

    How should the system store and display HIPAA files?
    Should we retain and export PII records?

still ask the auto pipeline to decide regulated-data handling and must remain blocked, even though they also mention a product verb.

The fix adds an explicit precedence rule: an *active*-form compliance-policy verb (``store`` / ``stores`` / ``storing``, ``retain`` / ``retains`` / ``retaining``, ``encrypt``, ``handle``, ``collect``, ``share``, ``transmit``, ``disclose``, ``process``, ``manage``, ``govern``) blocks the question even if a product-semantics verb is also present. Past-participle forms (``stored``, ``encrypted``, ``retained``, ``collected``, ``shared``, …) are intentionally excluded from the negative list because they act adjectivally on the regulated noun (``view stored PII``, ``display encrypted HIPAA files``); the main verb of those sentences is the product-semantics one and the question is product-behavior over already-existing regulated data, not a compliance-policy decision.

New regression test ``test_auto_answerer_blocks_mixed_intent_regulated_questions`` locks the precedence rule on the bot's own examples plus three more variants covering ``encrypt`` / ``share`` / ``collect``. Existing positive tests (adjectival compliance verbs, pure product semantics) and existing negative tests (pure compliance phrasings) all continue to pass.

340 unit tests passing in tests/unit/auto. Ruff clean.

Ref: ouroboros-agent[bot] BLOCKING on #738, ``answerer.py:750``.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* fix(auto): align router verb list with regulated allowlist (#640)

Bot follow-up on commit 7ed761c: ``_PRODUCT_SEMANTICS_REGULATED_VERBS_RE`` allows ``export`` and ``show`` (and the rest of the safe-allowlist set), but the explicit alignment branch in ``_is_product_behavior_question()`` only listed a subset (``download/allow/expose/render/enable/support/view/access``).
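The active-verb precedence rule from the mixed-intent fix can be sketched as below. This is a minimal illustration, not the code in ``answerer.py``: the pattern names, the modal list, and the regulated-noun list are assumptions made for the example, while the two verb lists are copied from the commit messages in this thread.

```python
import re

# Hypothetical constants for illustration; the real patterns in answerer.py
# are not shown in this thread.
REGULATED_NOUN_RE = re.compile(r"\b(?:pii|gdpr|hipaa|sox|pci-dss)\b", re.IGNORECASE)
PRODUCT_MODAL_RE = re.compile(r"\b(?:should|can|could|would|will)\b", re.IGNORECASE)
PRODUCT_VERB_RE = re.compile(
    r"\b(?:export|download|render|display|show|expose|support|enable|allow|view|access)\b",
    re.IGNORECASE,
)
# Active forms only. Past participles ("stored", "encrypted") do not match
# because the trailing \b requires the word to end right after the verb stem.
ACTIVE_COMPLIANCE_VERB_RE = re.compile(
    r"\b(?:store|stores|storing|retain|retains|retaining|encrypt|handle|collect"
    r"|share|transmit|disclose|process|manage|govern)\b",
    re.IGNORECASE,
)


def is_safe_product_regulated_question(question: str) -> bool:
    """Return True when the question may skip the risky-fallback blocker."""
    # Precedence rule: an active compliance verb blocks, even when a
    # product-semantics verb is also present in the same question.
    if ACTIVE_COMPLIANCE_VERB_RE.search(question):
        return False
    # Otherwise require a regulated noun, a modal, and a product verb.
    return bool(
        REGULATED_NOUN_RE.search(question)
        and PRODUCT_MODAL_RE.search(question)
        and PRODUCT_VERB_RE.search(question)
    )
```

One limitation of a purely lexical check like this: ``process`` used as a noun ("the export process") would also trip the negative list, so the sketch errs toward blocking.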
``export`` / ``show`` / ``display`` were still matched by an earlier broader pattern in the same function, but the visible alignment was incomplete and prone to silent drift.

The router branch added to bridge ``_is_safe_product_regulated_question()`` into ``_is_product_behavior_question()`` now lists every verb in the allowlist:

    export | download | render | display | show | expose | support | enable | allow | view | access

This is a no-op for already-routed verbs but makes the allowlist↔router contract explicit and grep-checkable, eliminating the drift surface flagged in the bot's design note.

Test changes: ``test_auto_answerer_routes_safe_regulated_product_questions_to_product_behavior_answerer`` now exercises every verb in the allowlist (export, show, display, render, expose, support, enable in addition to download/view/access). Each case asserts (a) the gate passes (``answer.blocker is None``), (b) the router takes the product-behavior path (``constraints.behavior.*`` and ``acceptance.behavior.*`` ledger keys, not the generic ``constraints.conservative_mvp`` from ``_default_answer()``), and (c) the regulated noun is preserved in the answer text or ledger value.

340 unit tests passing in tests/unit/auto. Ruff clean.

Ref: ouroboros-agent[bot] BLOCKING on #738, ``answerer.py:548``.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* fix(auto): route regulated-product questions before IO/runtime branches (#640)

Bot follow-up on commit 52ef5ab: ``_is_safe_product_regulated_question()`` suppressed the risky-fallback blocker for any regulated-noun + product-verb combination, but the router checked ``_is_actor_or_io_question`` and ``_is_runtime_context_question`` *before* ``_is_product_behavior_question``, so prompts like

    What inputs should the GDPR export take?
    Which runtime should the GDPR export use?

got a generic IO/runtime answer (``ASSUMPTION`` / ``EXISTING_CONVENTION``) and then bypassed the blocker via the safe-allowlist, silently dropping the regulated-feature semantics from the ledger.

Fix: pull ``_is_safe_product_regulated_question`` to the top of the content-routing chain so any regulated-product question (IO-shaped, runtime-shaped, or product-shaped) is dispatched to ``_product_behavior_answer()``. The risky-fallback gate at the tail of ``answer()`` already consults the same predicate, so the router and the safe-allowlist now share a single answer path.

Pure compliance phrasings remain blocked unchanged: they fail the allowlist (no product-semantics verb) and fall through to the previous branches, where the risky-fallback gate fires for any ``CONSERVATIVE_DEFAULT`` / ``ASSUMPTION`` / ``EXISTING_CONVENTION`` source.

New regression test ``test_auto_answerer_routes_regulated_product_questions_before_io_or_runtime`` locks in:
- Bot's example "What inputs should the GDPR export take?"
- Bot's example "Which runtime should the GDPR export use?"
- Two adjacent IO/runtime regulated-product variants

Each case asserts (a) not blocked, (b) answer comes from ``_product_behavior_answer()`` (subject-specific ``behavior.*`` ledger keys, no IO/runtime keys), and (c) the regulated noun is preserved in the answer text or ledger value.

341 unit tests passing in tests/unit/auto. Ruff clean. The two failures in tests/unit/orchestrator/test_codex_cli_runtime.py are pre-existing and unrelated to this PR's scope (verified by stashing the patch).

Ref: ouroboros-agent[bot] BLOCKING on #738, ``answerer.py:837``.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* fix(auto): preserve grounded REPO_FACT for regulated-runtime questions (#640)

Bot follow-up on commit 44e405f: the unconditional early route to ``_product_behavior_answer()`` for any question that ``_is_safe_product_regulated_question()`` recognised broke the existing runtime contract.
With a supplied ``runtime_context`` repo fact, a question like ``Which runtime should the GDPR export use?`` should return a ``REPO_FACT`` runtime answer carrying the grounded evidence; the early route replaced it with a generic product-behavior entry, dropping the evidence.

Restructure: keep the original IO/runtime/product/default order so grounded ``REPO_FACT`` answers stay on the runtime path, then re-route to ``_product_behavior_answer()`` only when the chosen route produced a non-grounded fallback (``ASSUMPTION`` / ``EXISTING_CONVENTION`` / ``CONSERVATIVE_DEFAULT``) AND the safe-allowlist recognises the question as regulated-product. Concretely:

- ``Which runtime should the GDPR export use?`` + REPO_FACT → REPO_FACT runtime answer (preserved, with evidence).
- ``Which runtime should the GDPR export use?`` without repo facts → EXISTING_CONVENTION runtime fallback re-routed through ``_product_behavior_answer()`` so the regulated-feature semantics are preserved in ``constraints.behavior.*`` / ``acceptance.behavior.*``.
- ``What inputs should the GDPR export take?`` → IO ASSUMPTION re-routed to ``_product_behavior_answer()``.
- ``Should the app export PII reports?`` → already routes through ``_product_behavior_answer()`` (CONSERVATIVE_DEFAULT) and is left untouched by the reroute.

Pure compliance phrasings still block: they fail the allowlist (no product-semantics verb), keep their CONSERVATIVE_DEFAULT/ASSUMPTION/EXISTING_CONVENTION source, and the risky-fallback gate fires for them.

New regression test ``test_auto_answerer_preserves_repo_fact_for_regulated_runtime_question`` locks the REPO_FACT preservation contract: with a runtime_context repo fact supplied, the answer must be REPO_FACT, must contain the supplied runtime text, and must carry a runtime_context ledger entry with the supplied evidence.

342 unit tests passing in tests/unit/auto. Ruff clean.

Ref: ouroboros-agent[bot] BLOCKING on #738, ``answerer.py:126``.
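The restructured order (route first, keep grounded answers, re-route only non-grounded fallbacks, then apply the gate) can be sketched for the runtime path as follows. This is a simplified model under stated assumptions: ``Source``, ``Answer``, the helper functions, and the boolean flags are all hypothetical stand-ins, and the real ``answer()`` in ``answerer.py`` may differ in every detail.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Source(Enum):
    REPO_FACT = "repo_fact"
    ASSUMPTION = "assumption"
    EXISTING_CONVENTION = "existing_convention"
    CONSERVATIVE_DEFAULT = "conservative_default"
    BLOCKER = "blocker"


# Fallback sources that are not backed by repository evidence.
NON_GROUNDED = frozenset(
    {Source.ASSUMPTION, Source.EXISTING_CONVENTION, Source.CONSERVATIVE_DEFAULT}
)


@dataclass
class Answer:
    text: str
    source: Source
    route: str


def runtime_answer(runtime_fact: Optional[str]) -> Answer:
    # Grounded when a repo fact is supplied, convention fallback otherwise.
    if runtime_fact is not None:
        return Answer(f"Use {runtime_fact}.", Source.REPO_FACT, "runtime")
    return Answer("Use the existing runtime.", Source.EXISTING_CONVENTION, "runtime")


def product_behavior_answer(question: str) -> Answer:
    return Answer(f"Product behavior: {question}", Source.CONSERVATIVE_DEFAULT, "product_behavior")


def answer(question: str, *, runtime_fact: Optional[str] = None,
           safe_regulated_product: bool = False, risky_topic: bool = False) -> Answer:
    chosen = runtime_answer(runtime_fact)
    if chosen.source not in NON_GROUNDED:
        return chosen  # grounded REPO_FACT stays on the runtime path
    if safe_regulated_product:
        return product_behavior_answer(question)  # re-route the fallback
    if risky_topic:
        return Answer("Needs human review.", Source.BLOCKER, "blocked")
    return chosen
```

The point of the ordering is that the allowlist check never runs ahead of a grounded answer, so supplied evidence cannot be replaced by a generic product-behavior entry.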
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* chore: drop stray empty .ouroboros_eval_artifact.md committed by mistake

The previous commit (046ce3d) accidentally included an empty local debug artifact via ``git add -A``. Removing it; not part of the PR scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* fix(auto): tighten bare-scope and ambiguous-artefact regex (#640)

Two BLOCKING items raised by ouroboros-agent[bot] on commit a13fd6c:

(1) ``answerer.py:793``: ``_is_safe_product_regulated_question()`` allowed "compliance-scope-as-feature-flag" prompts (``Should the platform support HIPAA?``, ``Should the app enable GDPR?``, ``Should the system allow PII?``) to bypass the blocker, even though those frame the entire regulatory regime as a binary toggle and are compliance-policy decisions.

Fix: a new ``_BARE_COMPLIANCE_SCOPE_RE`` rejects ``support|enable|allow`` + bare regulated noun followed by no qualifying feature noun (negative lookahead ``(?!\s+[a-z])``). Concrete-feature variants ("HIPAA audit logs", "GDPR consent banners", "PII redaction in exports", "GDPR data") have a qualifying noun and still pass through.

(2) ``answerer.py:718``: the destructive-bulk artefact qualifier listed standalone ``doc`` and ``plan`` tokens. ``from the doc`` is rare phrasing (use ``docs`` / ``documentation``) and bare ``plan`` collides with database-side meanings (query plan, execution plan, db plan), so a question like "Which tables should we drop from the plan?" was being exempted as a process-artefact edit.

Fix: drop ``doc`` and ``plan`` (singular) from the artefact list. The remaining unambiguous artefacts are ``release plan``, ``docs``, ``documentation``, ``roadmap``, ``backlog``, ``changelog``, ``spec``. All existing positive tests already use these unambiguous variants.
New regression coverage:
- ``test_auto_answerer_blocks_bare_compliance_scope_questions``: locks rejection of bare ``support|enable|allow + regulated noun`` for all five regulated-noun variants.
- ``test_auto_answerer_allows_qualified_compliance_scope_questions``: locks pass-through of ``support HIPAA audit logs`` / ``enable GDPR consent banners`` / ``allow PII redaction`` / etc.
- ``test_auto_answerer_blocks_destructive_bulk_with_ambiguous_singular_tokens``: locks blocker for ``from the plan`` / ``in the plan`` / ``from the doc`` destructive prompts.

345 unit tests passing in tests/unit/auto. Ruff clean.

Ref: ouroboros-agent[bot] BLOCKING on #738, ``answerer.py:793`` and ``answerer.py:718``.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

---------

Co-authored-by: Claude Sonnet 4.6 <[email protected]>
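The two regex tightenings in the final commit can be illustrated with a small sketch. The negative lookahead ``(?!\s+[a-z])`` and the artefact list come from the commit message; everything else (pattern names without the leading underscore, the five-noun alternation, the helper functions) is an assumption for the example and may not match ``answerer.py``.

```python
import re

# Assumed regulated-noun alternation; the commit mentions "five
# regulated-noun variants" without listing them.
_REGULATED = r"(?:pii|gdpr|hipaa|sox|pci-dss)"

# support/enable/allow + regulated noun with no qualifying word after it.
# With IGNORECASE, [a-z] also matches uppercase letters.
BARE_COMPLIANCE_SCOPE_RE = re.compile(
    rf"\b(?:support|enable|allow)\s+{_REGULATED}\b(?!\s+[a-z])",
    re.IGNORECASE,
)

# Unambiguous process artefacts only: standalone "doc" and "plan" are
# deliberately absent, and only "from the" / "in the" qualify.
ARTEFACT_QUALIFIER_RE = re.compile(
    r"\b(?:from|in)\s+the\s+"
    r"(?:release\s+plan|docs|documentation|roadmap|backlog|changelog|spec)\b",
    re.IGNORECASE,
)


def is_bare_compliance_scope(question: str) -> bool:
    """True when a whole regulatory regime is framed as a feature toggle."""
    return BARE_COMPLIANCE_SCOPE_RE.search(question) is not None


def is_process_artefact_edit(question: str) -> bool:
    """True when the drop/purge targets an entry in a process artefact."""
    return ARTEFACT_QUALIFIER_RE.search(question) is not None
```

In this sketch any following word counts as a qualifier, so a phrasing such as "support HIPAA and GDPR" would not be treated as bare; whether the real pattern shares that gap is not stated in the thread.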

ouroboros-agent Bot requested changes May 7, 2026

Q00 requested changes May 7, 2026

Q00 added the OS (core engine, state machine, internal pipeline, and system-level behavior) and Safety (risk, guardrail, policy, and regulated-topic behavior) labels May 7, 2026

ouroboros-agent Bot requested changes May 7, 2026

shaun0927 added 2 commits May 7, 2026 15:08

chore: drop stray local debug artifact accidentally committed in previous fix (007e254)

Q00 approved these changes May 7, 2026

shaun0927 mentioned this pull request May 7, 2026

Add provenance and repository grounding to ooo auto interview answers #640

Open

6 tasks

ouroboros-agent Bot requested changes May 7, 2026

shaun0927 mentioned this pull request May 7, 2026

Add direct operational-task path for concrete PR and merge goals in ooo auto #689

Closed

shaun0927 mentioned this pull request May 7, 2026

Design a UserLevel plugin manager for operational workflows #725

Open

This was referenced May 7, 2026

feat(auto): add selected-driver answer metadata #682

Merged

feat(auto): classify selected-driver answer risk #683

Closed

shaun0927 closed this May 7, 2026

This was referenced May 7, 2026

Add selected driver and brake mode for ooo auto interviews #665

Closed

feat(auto): wire driver selection through capabilities #672

Merged

shaun0927 mentioned this pull request May 7, 2026

feat(auto): block risky-fallback answers for regulated/destructive topics (#640) #738

Merged

shaun0927 wants to merge 6 commits into Q00:main from shaun0927:feat/640-risky-fallback-gate
