
feat(auto): block risky-fallback answers for regulated/destructive topics (#640) #738

Merged
shaun0927 merged 17 commits into Q00:main from shaun0927:feat/640-risky-fallback-gate on May 7, 2026

@shaun0927
Collaborator

Summary

Re-do of #695. @Q00 previously APPROVED the original PR ("narrows automation for regulated/destructive topics, reduces scope risk"), but ouroboros-agent[bot] kept its REQUEST_CHANGES verdict on the risky-fallback gate logic. The follow-up commits in this branch address the bot's feedback:

  • Scope risky-fallback gate to generative answer routes only (no false-trigger on user-supplied evidence).
  • Include EXISTING_CONVENTION runtime fallback in the gate.
  • Broaden destructive-bulk pattern coverage (incl. reverse phrasing).
  • Extend destructive-bulk verb vocabulary (drop, erase, ...).

Behavior

  • Regulated topics (PII / GDPR / HIPAA / SOX / PCI-DSS) → block risky-fallback, surface explicit "needs human input" instead of silently filling.
  • Destructive bulk verbs (drop / truncate / delete-all / wipe / erase / ...) → block generative fallback.
  • Gate applies only to generative routes; user-supplied evidence still passes through.

Test plan

  • pytest tests/unit/auto/test_ledger_grading_answerer.py (covers regulated-topic gate, destructive-bulk gate, scope-gating, reverse phrasing).

Refs #640 (closing criteria 2/2 — paired with the provenance meta-only PR for the first half)

shaun0927 added 6 commits May 7, 2026 22:33
When a deterministic answer would land on CONSERVATIVE_DEFAULT or
ASSUMPTION for a high-risk topic that has no defensible generic
default, upgrade it to a BLOCKER instead of silently committing the
auto Seed to a fabricated stance.

Targeted topics:
- regulated personal data (PII, GDPR, HIPAA, SOX, PCI-DSS)
- destructive bulk schema/table operations (truncate/purge tables/schemas)

Existing safe-allowlists keep working: product-feature questions about
credentials/branches and the explicit `_blocker_for` patterns are
unchanged. REPO_FACT/USER_GOAL-backed answers also pass through
without gating.

Refs Q00#640
Address PR Q00#695 blocking review finding:

The post-routing gate keyed off broad keywords, so meta-questions like
"What acceptance criteria should the HIPAA worker satisfy?" or
"Which command output verifies the GDPR export flow?" were rejected
as if they asked for regulated-data handling decisions. They actually
hit the `_feature_acceptance_answer` / `_verification_answer` routes
which return safe templates regardless of subject keywords.

Restructure `answer()` so the gate only fires after generative routes
(actor/IO, runtime, product behavior, default).  Meta-routes
(non-goal listing, verification, feature acceptance) return early
without going through the gate.

Add regression coverage for HIPAA/GDPR/PII acceptance and verification
phrasings to ensure they keep returning CONSERVATIVE_DEFAULT answers.
Existing PII/HIPAA generative-route block tests continue to pass.

Refs Q00#640
Address PR Q00#695 follow-up: a regulated runtime question without a
supplied repo_fact, e.g. "Which runtime should the HIPAA worker use?",
routes through `_runtime_answer` and returns
`AutoAnswerSource.EXISTING_CONVENTION` with the generic
"use the existing repository runtime" template.  Because
EXISTING_CONVENTION was not in `_RISKY_FALLBACK_SOURCES`, that fallback
escaped the gate.

Add EXISTING_CONVENTION to the risky-fallback set.  REPO_FACT-backed
runtime answers (full `runtime_context` supplied) remain unaffected,
and the existing `does_not_block_regulated_topic_when_repo_fact_supplied`
test continues to assert that REPO_FACT answers pass through.

Add a regression test that covers the bot's exact case: HIPAA runtime
question with no supplied facts now blocks with reason
"regulated data handling".

Refs Q00#640
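
A minimal sketch of the gate shape these commits describe, assuming a source-set check followed by a topic check. The enum below is a stand-in that lists only the members named in this PR, and `risky_fallback_blocker` and `REGULATED_TOPIC` are hypothetical names, not the code in `answerer.py`.

```python
import re
from enum import Enum
from typing import Optional


class AutoAnswerSource(Enum):
    # Stand-in for the project's enum; only the members named in this PR.
    CONSERVATIVE_DEFAULT = "conservative_default"
    ASSUMPTION = "assumption"
    EXISTING_CONVENTION = "existing_convention"
    REPO_FACT = "repo_fact"
    USER_GOAL = "user_goal"


# Sources that represent a generic fallback rather than supplied evidence.
RISKY_FALLBACK_SOURCES = frozenset(
    {
        AutoAnswerSource.CONSERVATIVE_DEFAULT,
        AutoAnswerSource.ASSUMPTION,
        AutoAnswerSource.EXISTING_CONVENTION,
    }
)

# Hypothetical topic matcher built from the list in the PR description.
REGULATED_TOPIC = re.compile(r"\b(pii|gdpr|hipaa|sox|pci-dss)\b")


def risky_fallback_blocker(question: str, source: AutoAnswerSource) -> Optional[str]:
    """Return a blocker reason for a risky fallback on a regulated topic."""
    if source not in RISKY_FALLBACK_SOURCES:
        # REPO_FACT / USER_GOAL answers are evidence-backed and pass through.
        return None
    if REGULATED_TOPIC.search(question.lower()):
        return "regulated data handling"
    return None
```

Under these assumptions, the HIPAA runtime question blocks when the answer source is `EXISTING_CONVENTION` and passes through when it is `REPO_FACT`.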
Address PR Q00#695 follow-up: the destructive-operation matcher only
caught ``verb ... noun`` phrasings such as ``purge tables``, so
reverse phrasings like ``Which tables should the migration truncate?``
slipped through. The verb vocabulary was also narrow.

Expand patterns:
- Verbs: ``truncate``, ``purge``, ``wipe`` plus tense variants
  (``truncates``/``truncating``/``truncated`` etc.).
- Nouns: ``table(s)``, ``schema(s)``, ``database(s)``, ``index/indexes/indices``,
  ``migration(s)``.
- Both verb-then-noun and noun-then-verb directions matched.

Note: ``drop ... database`` remains owned by ``_blocker_for`` (its
existing branch fires first), and product-feature questions are still
exempted by the safe-product allowlists, so this does not over-gate
benign feature semantics.

Add a regression test exercising both phrasing directions across the
new verb/noun vocabulary.

Refs Q00#640
Address PR Q00#695 follow-up: phrasings like "Which tables should the
migration drop?" or "Should we erase these schemas before re-seeding?"
still flowed through to a generic auto answer because ``drop`` and
``erase`` were missing from `_DESTRUCTIVE_BULK_VERBS`.

`_blocker_for` already handles ``drop|delete|erase|wipe ... database``
at the explicit-authority layer, so this matcher and the existing
allow/deny list both treat the same families consistently. The
risky-fallback gate runs after `_blocker_for`, so explicit
authority-style prompts continue to block via the original code path.

Add ``drop`` and ``erase`` (with tense variants) and extend the
regression test to assert both verb-then-noun and noun-then-verb
phrasings using the new vocabulary.

Refs Q00#640
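
The two-direction matching described in the last two commits could look roughly like the sketch below. The vocabulary is copied from the commit messages; the regex shapes and the names `VERB_THEN_NOUN`, `NOUN_THEN_VERB`, and `mentions_destructive_bulk` are assumptions for illustration and may differ from `_DESTRUCTIVE_BULK_VERBS` / `_DESTRUCTIVE_BULK_NOUNS` in the branch.

```python
import re

# Verbs with tense variants (truncate/truncates/truncated/truncating, ...).
_VERBS = (
    r"(?:truncat(?:e|es|ed|ing)|purg(?:e|es|ed|ing)|wip(?:e|es|ed|ing)"
    r"|eras(?:e|es|ed|ing)|drop(?:s|ped|ping)?)"
)
# Bulk data-object nouns named in the commit message.
_NOUNS = r"(?:tables?|schemas?|databases?|index(?:es)?|indices|migrations?)"

# "purge tables" style phrasing.
VERB_THEN_NOUN = re.compile(rf"\b{_VERBS}\b.*\b{_NOUNS}\b")
# "Which tables should the migration truncate?" style phrasing.
NOUN_THEN_VERB = re.compile(rf"\b{_NOUNS}\b.*\b{_VERBS}\b")


def mentions_destructive_bulk(question: str) -> bool:
    """True when a destructive verb and a bulk noun co-occur in either order."""
    lowered = question.lower()
    return bool(VERB_THEN_NOUN.search(lowered) or NOUN_THEN_VERB.search(lowered))
```

Note that a matcher this simple pairs any listed verb with any listed noun anywhere in the sentence, so it needs additional context checks before it can serve as a gate.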
Contributor

@ouroboros-agent ouroboros-agent Bot left a comment

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Reviewing commit 326563c for PR #738

Review record: dcffd5cd-c11a-4eeb-9135-908557c3bbdb

Blocking Findings

| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:716 | BLOCKING | The new risky-fallback gate blocks every fallback/product-behavior answer that merely mentions PII, HIPAA, GDPR, etc., even when the question is only asking for product semantics rather than compliance policy. For example, `Should the app export PII reports?` or `Should users be able to download GDPR exports?` now return BLOCKER instead of preserving the requested feature behavior. That is a regression in normal auto-answer coverage, and there is no allowlist/test for regulated-data product questions comparable to the credential-specific safe path. |

### Recovery Notes
First recoverable review artifact generated from codex analysis log.

Follow-up Findings

  • src/ouroboros/auto/answerer.py:666 [warning] `_DESTRUCTIVE_BULK_NOUNS` now includes bare `migration`/`migrations` and `index`/`indexes`, and the regexes at lines 685/690 treat any pairing with `drop`/`wipe`/`erase` as a destructive bulk-data operation. That overblocks innocuous questions such as `Which migration should we drop from the release plan?` or `Which indexes should we drop from the docs?`, which are not schema/data-destruction requests. The matcher needs tighter context than noun+verb anywhere in the sentence.

Non-blocking Suggestions

None.

Design Notes

The route ordering is sensible, and gating only fallback-style answers is the right general shape. The issue is the new regex layer is broader than the intent described in the docstrings, so it now overrides legitimate product-semantics questions instead of only catching unsafe generic defaults.


Reviewed by ouroboros-agent[bot] via Codex deep analysis

shaun0927 and others added 2 commits May 7, 2026 22:54
…#640)

Add _DESTRUCTIVE_BULK_NON_DATA_QUALIFIERS to exempt verb/noun pairs that
appear with a non-data artefact qualifier (release plan, docs, roadmap,
etc.) from the destructive-bulk blocker.  Also extend _DESTRUCTIVE_BULK_NOUNS
with record/row/audit-log/audit-trail strong data-object nouns.

Addresses ouroboros-agent[bot] follow-up warning on Q00#738: bare
``migration`` + ``drop`` and ``index`` + ``drop`` questions about release
plans or documentation were overblocked.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
…ate (Q00#640)

Add _is_safe_product_regulated_question() allowlist that passes through
bounded product-behavior questions mentioning regulated nouns (PII/GDPR/
HIPAA/SOX/PCI-DSS) when paired with a product-semantics verb (export,
download, display, show, view, access, …) and NOT a compliance-policy verb
(store, handle, retain, collect, encrypt, …).

Questions like "Should the app export PII reports?" or "Should users be
able to download GDPR exports?" are feature-level requirements and must
not be blocked; questions asking how to store/handle/retain regulated data
still block as before.

Addresses ouroboros-agent[bot] BLOCKING on Q00#738 (answerer.py:716).

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
@shaun0927
Collaborator Author

Addressed both bot CHANGES_REQUESTED items in two follow-up commits:

BLOCKING (answerer.py:716) — regulated-topic product-semantics overblocking
Commit a98eb10d: Add _is_safe_product_regulated_question() allowlist. Questions with a product-semantics verb (export, download, display, show, view, access, …) paired with a regulated noun (PII/GDPR/HIPAA/SOX/PCI-DSS) and no compliance-policy verb (store, handle, retain, collect, encrypt, …) now pass through to generative product answers instead of BLOCKER. Compliance-policy phrasings still block as before.

Confirmed: `Should the app export PII reports?` → NOT blocked. `What PII should the system collect?` → still BLOCKER.

Follow-up warning (answerer.py:666) — destructive-bulk regex overblocking process artefacts
Commit 2feb4664: Add _DESTRUCTIVE_BULK_NON_DATA_QUALIFIERS that exempts verb/noun pairs when the sentence also contains a non-data context qualifier (release plan, from the docs, from the documentation, from the plan, from the roadmap, etc.).

Confirmed: `Which migration should we drop from the release plan?` → NOT blocked. `Which migrations should we wipe before redeploying?` → still BLOCKER.

New tests (all passing, 66 total):

  • test_auto_answerer_allows_product_semantics_questions_for_regulated_topics
  • test_auto_answerer_still_blocks_compliance_policy_regulated_questions
  • test_auto_answerer_allows_release_plan_drop_question
  • test_auto_answerer_allows_docs_index_drop_question

Contributor

@ouroboros-agent ouroboros-agent Bot left a comment

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Reviewing commit a98eb10 for PR #738

Review record: f9cda100-5968-4e6d-8549-5d6f504b3ad0

Blocking Findings

| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:741 | BLOCKING | `_is_safe_product_regulated_question()` is broader than `_is_product_behavior_question()`, but `answer()` still only sends product questions to `_product_behavior_answer()` via the older matcher at lines 122-125. In practice, several newly “allowed” cases like `Should users be able to download GDPR exports?`, `Should admins be able to view PII fields in the admin panel?`, and `Should the app allow users to access their GDPR data?` no longer block, but they fall through to `_default_answer()` instead of preserving the requested feature semantics. That silently drops the regulated-product requirement from the ledger/acceptance criteria, which is a wrong auto-answer outcome rather than a safe pass-through. |

### Recovery Notes
First recoverable review artifact generated from codex analysis log.

Non-blocking Suggestions

None.

Design Notes

The new post-routing risk gate is a sensible shape for catching unsafe generic fallbacks, but the new allowlist is not aligned with the existing routing classifier. That leaves a gap where “safe” regulated product questions are unblocked without being handled semantically.


Reviewed by ouroboros-agent[bot] via Codex deep analysis

…avior answerer (Q00#640)

Extend _is_product_behavior_question() with a new arm covering the
product-semantics verbs used by _is_safe_product_regulated_question()
that were not previously matched (download, allow, expose, render, enable,
support) and the "be able to <verb>" phrasing gap for view/access.

Previously, questions allowed past the risky-fallback gate (e.g.
"Should users be able to download GDPR exports?") fell through to
_default_answer(), producing a generic conservative-MVP ledger entry
that silently discarded the regulated-product feature semantics.  With
this fix the router at answerer.py:122 sends those questions to
_product_behavior_answer(), which writes subject-specific
constraints.behavior.* and acceptance.behavior.* ledger entries that
preserve the requested feature in the Seed contract.

New test: test_auto_answerer_routes_safe_regulated_product_questions_to_product_behavior_answerer
asserts blocker=None, source=CONSERVATIVE_DEFAULT, subject-specific
ledger keys (not conservative_mvp), and regulated noun present in
answer text/ledger entries for all three bot example questions.

Addresses ouroboros-agent[bot] BLOCKING on Q00#738 (answerer.py:741).

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
@shaun0927
Collaborator Author

Addressed BLOCKING on #738 (answerer.py:741) in commit 99c45805.

Root cause confirmed: _is_safe_product_regulated_question() is broader than _is_product_behavior_question(). The three bot example questions (Should users be able to download GDPR exports?, Should admins be able to view PII fields in the admin panel?, Should the app allow users to access their GDPR data?) all had safe_regulated=True but product_behavior=False, so they fell to _default_answer() — discarding the regulated-product feature semantics in the ledger.

Fix: Extended _is_product_behavior_question() with one additional re.search arm:

or re.search(
    r"\b(should|must|can|will|do|does|is|are)\b.+\b(be able to\s+)?"
    r"(download|allow|expose|render|enable|support|view|access)\b",
    lowered,
)

This covers the product-semantics verbs used by the allowlist that were not previously matched, and fixes the "be able to <verb>" phrasing gap for view/access.
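
As a quick check of the arm quoted above, the pattern can be compiled on its own and run against the three example questions. Only the regex text comes from this comment; the wrapper name `matches_product_arm` is hypothetical, and the real function combines this arm with other branches not shown here.

```python
import re

# The regex arm quoted above, compiled standalone for illustration.
PRODUCT_ARM = re.compile(
    r"\b(should|must|can|will|do|does|is|are)\b.+\b(be able to\s+)?"
    r"(download|allow|expose|render|enable|support|view|access)\b"
)


def matches_product_arm(question: str) -> bool:
    """Apply the arm to the lowercased question, as the router does with `lowered`."""
    return PRODUCT_ARM.search(question.lower()) is not None


for q in (
    "Should users be able to download GDPR exports?",
    "Should admins be able to view PII fields in the admin panel?",
    "Should the app allow users to access their GDPR data?",
    "What PII should the system collect?",
):
    print(f"{matches_product_arm(q)!s:5}  {q}")
```

The first three questions match; the last one contains a modal but none of the listed verbs, so this arm does not route it.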

Verified end-to-end: All three questions now:

  • blocker = None (gate passes)
  • source = CONSERVATIVE_DEFAULT (routes to _product_behavior_answer())
  • Ledger updates contain constraints.behavior.* and acceptance.behavior.* keys (subject-specific, not generic conservative_mvp)
  • Answer text and ledger values contain the regulated noun (gdpr/pii)

New test: test_auto_answerer_routes_safe_regulated_product_questions_to_product_behavior_answerer (test file line 1195) locks all four assertions above for all three bot example questions.

67 tests passing (66 previous + 1 new). Ruff clean.

Contributor

@ouroboros-agent ouroboros-agent Bot left a comment

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Reviewing commit 99c4580 for PR #738

Review record: 80809b43-2705-46e8-91f3-228ede4f7574

Blocking Findings

| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:688 | BLOCKING | `_DESTRUCTIVE_BULK_NON_DATA_QUALIFIERS` now exempts any destructive bulk question that merely contains the word `documentation`, even if the question is still asking for a real schema/data deletion. For example, `Which tables should we drop according to the documentation before redeploying?` matches the destructive `tables ... drop` pattern, but the bare `documentation` token suppresses the blocker and lets auto-answering proceed. That is a regression in the safety gate for destructive operations; the qualifier needs to be tied to clearly non-destructive phrasings such as `from the docs/documentation`, not any mention of documentation anywhere in the sentence. |

### Recovery Notes
First recoverable review artifact generated from codex analysis log.

Non-blocking Suggestions

None.

Design Notes

The PR is directionally correct: it adds a second-stage safety gate for risky generative fallbacks and then restores intended routing for safe regulated product-behavior questions. The main architectural weakness is that the destructive-operation exemption is implemented as a broad keyword bypass rather than a narrowly scoped phrase check, which makes the safety boundary too easy to evade.


Reviewed by ouroboros-agent[bot] via Codex deep analysis

Previously the destructive-bulk exemption matched on bare tokens such as
``documentation`` or ``release plan`` anywhere in the sentence, which let
real destructive operations slip past the gate when the question merely
*referenced* documentation as an authority (e.g. "Which tables should we
drop according to the documentation before redeploying?").

The qualifier is now strictly phrase-scoped to ``from the …`` so the
exemption fires only when the artefact is the explicit object of the
drop/wipe — the phrasing that signals "remove an entry from a process
artefact" rather than "delete data from a system". Authority/reference
phrasings ("according to the documentation", "per the release plan",
"in the documentation example") no longer suppress the gate.

Existing pass-through tests still hold:
- "Which migration should we drop from the release plan?" → not blocked
- "Which indexes should we drop from the docs?" → not blocked

New regression test locks the safety boundary:
- "Which tables should we drop according to the documentation …" → BLOCKER
- "Which tables should we drop per the release plan?" → BLOCKER
- "Per the documentation, which audit logs should we purge?" → BLOCKER
- "According to the docs, which tables should we drop?" → BLOCKER

Ref: ouroboros-agent[bot] BLOCKING on PR Q00#738 — answerer.py:688.
68 tests passing in test_ledger_grading_answerer.py (337 in tests/unit/auto). Ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Contributor

@ouroboros-agent ouroboros-agent Bot left a comment

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Reviewing commit fc11788 for PR #738

Review record: c633fd73-4dda-43d8-80e5-b99ca2b37ee9

Blocking Findings

| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:698 | BLOCKING | The new destructive-bulk exemption only matches `from the ...`, so the same safe process-artifact questions phrased as `in the ...` still get blocked as data-destruction. For example, `Which indexes should we drop in the docs?` and `Which migration should we drop in the roadmap?` still satisfy the destructive noun/verb patterns and miss the qualifier entirely. That is a wrong-result regression in the exact false-positive area this patch is trying to fix. |
| 2 | src/ouroboros/auto/answerer.py:782 | BLOCKING | `_is_safe_product_regulated_question()` disables the regulated-product allowlist whenever any compliance-policy verb appears anywhere in the sentence. That over-blocks legitimate feature-semantics questions about already-existing regulated data, e.g. `Should admins be able to view stored PII fields?` or `Should the dashboard display encrypted HIPAA files?` Both are product-behavior questions, but `stored`/`encrypted` makes them fall through to the risky-fallback blocker and return `regulated data handling`. |

### Recovery Notes
First recoverable review artifact generated from codex analysis log.

Non-blocking Suggestions

None.

Design Notes

The post-routing risky-fallback gate is a reasonable direction and the added tests improve coverage for earlier false positives. The remaining problems are both matcher-shape issues: the new exemptions are still too literal in one place and too broad in another, so the design needs slightly more context-sensitive patterning before this is safe.


Reviewed by ouroboros-agent[bot] via Codex deep analysis

…erbs (Q00#640)

Two BLOCKING regressions raised by ouroboros-agent[bot] on the previous fix
commit (fc11788):

1) **answerer.py:698** — destructive-bulk exemption only matched ``from the …``
   so safe process-artefact phrasings such as ``Which indexes should we drop
   in the docs?`` and ``Which migration should we drop in the roadmap?`` were
   still mis-blocked as data destruction. The qualifier now also accepts
   ``in the …`` for the same artefact list (``release plan``, ``docs``,
   ``documentation``, ``plan``, ``roadmap``, ``backlog``, ``changelog``,
   ``spec``).  Authority/reference phrasings (``according to the docs``,
   ``per the release plan``) still do not match the qualifier and remain
   blocked, locked in by an expanded regression test.

2) **answerer.py:782** — ``_is_safe_product_regulated_question()`` rejected any
   question containing a compliance-policy verb (``store``, ``handle``,
   ``encrypt``, ``share``, …) anywhere in the sentence. That over-blocked
   legitimate product-behavior questions where the compliance verb appeared as
   a past-participle adjective modifying the noun, e.g.

     Should admins be able to view stored PII fields?
     Should the dashboard display encrypted HIPAA files?

   In both, the main verb is product-semantics (``view`` / ``display``); the
   compliance verb is adjectival. The allowlist now requires (a) a regulated
   noun, (b) a product-question modal, and (c) a product-semantics verb. Pure
   compliance-policy phrasings (``How should the system handle GDPR data
   retention?``, ``What PII should the system collect?``) lack a
   product-semantics verb and remain blocked — covered by the existing
   ``test_auto_answerer_still_blocks_compliance_policy_regulated_questions``.

   The previously-defined ``_COMPLIANCE_POLICY_VERBS_RE`` constant is now
   unused and removed to avoid dead code.

New regression coverage:
- ``test_auto_answerer_allows_in_the_artefact_drop_questions`` — locks in
  ``in the docs/roadmap/release plan/changelog`` exemption.
- ``test_auto_answerer_allows_product_questions_with_adjectival_compliance_verbs``
  — locks in ``view stored PII``, ``display encrypted HIPAA files``,
  ``download retained GDPR exports``, etc.

Existing safety tests (compliance-policy questions, authority-reference
phrasings) all continue to block.

339 unit tests passing in tests/unit/auto. Ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
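
A sketch of the phrase-scoped qualifier from item 1, assuming it is a single regex over `from the …` / `in the …` followed by one of the listed artefacts. The names `NON_DATA_QUALIFIER` and `is_process_artefact_edit` are hypothetical; the actual `_DESTRUCTIVE_BULK_NON_DATA_QUALIFIERS` definition is not shown in this thread.

```python
import re

# Artefact list copied from the commit message above.
_ARTEFACTS = (
    r"(?:release plan|docs|documentation|plan|roadmap|backlog|changelog|spec)"
)

# Only "from the <artefact>" and "in the <artefact>" count as an exemption;
# a bare mention of an artefact elsewhere in the sentence does not.
NON_DATA_QUALIFIER = re.compile(rf"\b(?:from|in) the {_ARTEFACTS}\b")


def is_process_artefact_edit(question: str) -> bool:
    """True when the question edits a process artefact rather than data."""
    return NON_DATA_QUALIFIER.search(question.lower()) is not None
```

Authority phrasings such as "according to the documentation" or "per the release plan" contain an artefact name but not the `from the` / `in the` preposition, so they do not match and the destructive-bulk blocker still applies.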
Contributor

@ouroboros-agent ouroboros-agent Bot left a comment

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Reviewing commit e846a47 for PR #738

Review record: 75181131-a43b-4cd5-9d16-a6525991c0cc

Blocking Findings

| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:750 | BLOCKING | `_is_safe_product_regulated_question()` is now too broad: it allows any regulated-data question that contains a modal plus one “product” verb, even when the same sentence also asks for compliance-policy decisions. For example, `How should the system store and display HIPAA files?` or `Should we retain and export PII records?` will pass this allowlist, skip `_risky_fallback_blocker_for()`, and be auto-answered via `_product_behavior_answer()` despite still requiring a decision about regulated-data storage/retention. That is exactly the class of question this gate is supposed to block. The current tests only cover pure product questions and pure compliance questions, so this mixed case regression is not caught. |

### Recovery Notes
First recoverable review artifact generated from codex analysis log.

Non-blocking Suggestions

None.

Design Notes

The routing cleanup is sensible and the new tests cover the prior review threads well, but the regulated-topic allowlist currently relies on positive keyword presence alone. That makes the safety boundary fragile for mixed-intent questions; this path needs an explicit negative check or precedence rule for compliance-policy verbs.


Reviewed by ouroboros-agent[bot] via Codex deep analysis

…cedence (Q00#640)

Bot follow-up on commit e846a47: the regulated-product allowlist was too
permissive. Mixed-intent questions that pair a compliance-policy verb with a
product-semantics verb — e.g.

    How should the system store and display HIPAA files?
    Should we retain and export PII records?

still ask the auto pipeline to decide regulated-data handling and must remain
blocked, even though they also mention a product verb.

The fix adds an explicit precedence rule: an *active*-form compliance-policy
verb (``store`` / ``stores`` / ``storing``, ``retain`` / ``retains`` /
``retaining``, ``encrypt``, ``handle``, ``collect``, ``share``, ``transmit``,
``disclose``, ``process``, ``manage``, ``govern``) blocks the question even if
a product-semantics verb is also present.

Past-participle forms (``stored``, ``encrypted``, ``retained``, ``collected``,
``shared``, …) are intentionally excluded from the negative list because they
act adjectivally on the regulated noun (``view stored PII``, ``display
encrypted HIPAA files``); the main verb of those sentences is the
product-semantics one and the question is product-behavior over already-
existing regulated data, not a compliance-policy decision.

New regression test ``test_auto_answerer_blocks_mixed_intent_regulated_questions``
locks the precedence rule on the bot's own examples plus three more variants
covering ``encrypt`` / ``share`` / ``collect``. Existing positive tests
(adjectival compliance verbs, pure product semantics) and existing negative
tests (pure compliance phrasings) all continue to pass.

340 unit tests passing in tests/unit/auto. Ruff clean.

Ref: ouroboros-agent[bot] BLOCKING on Q00#738 — ``answerer.py:750``.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
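
The precedence rule in this commit can be sketched as a negative check that runs before the positive one. The verb lists come from the commit messages in this thread; tense variants beyond `store`/`retain`, the regex shapes, and the function name `is_safe_product_regulated_question` (without the leading underscore) are assumptions for illustration.

```python
import re

REGULATED_NOUN = re.compile(r"\b(pii|gdpr|hipaa|sox|pci-dss)\b")
PRODUCT_MODAL = re.compile(r"\b(should|must|can|will|do|does|is|are)\b")
PRODUCT_VERB = re.compile(
    r"\b(export|download|render|display|show|expose|support|enable|allow"
    r"|view|access)\b"
)
# Active forms only. Past participles (stored, encrypted, retained, ...)
# are deliberately absent so adjectival uses do not trigger the block.
ACTIVE_COMPLIANCE_VERB = re.compile(
    r"\b(?:stor(?:e|es|ing)|retain(?:s|ing)?|encrypt(?:s|ing)?"
    r"|handl(?:e|es|ing)|collect(?:s|ing)?|shar(?:e|es|ing)"
    r"|transmit(?:s|ting)?|disclos(?:e|es|ing)|process(?:es|ing)?"
    r"|manag(?:e|es|ing)|govern(?:s|ing)?)\b"
)


def is_safe_product_regulated_question(question: str) -> bool:
    lowered = question.lower()
    if not REGULATED_NOUN.search(lowered):
        return False
    # Precedence: an active compliance verb blocks even when a product
    # verb is also present (mixed-intent questions).
    if ACTIVE_COMPLIANCE_VERB.search(lowered):
        return False
    return bool(PRODUCT_MODAL.search(lowered) and PRODUCT_VERB.search(lowered))
```

With this shape, `view stored PII` stays allowed because `stored` is not in the negative list, while `store and display HIPAA files` is rejected before the product verb is considered.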
Contributor

@ouroboros-agent ouroboros-agent Bot left a comment

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Reviewing commit 7ed761c for PR #738

Review record: e97aeaa5-4f22-424b-91cf-b76ecc77ef3f

Blocking Findings

| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:548 | BLOCKING | The router fix is still incomplete for two verbs that `_is_safe_product_regulated_question()` explicitly allows: `export` and `show` are in `_PRODUCT_SEMANTICS_REGULATED_VERBS_RE` but not in the new `_is_product_behavior_question()` branch. As a result, questions like `Should the app export PII reports?` or `Should the app show PII reports?` bypass the blocker and then silently fall through to `_default_answer()` instead of `_product_behavior_answer()`, losing the regulated-feature semantics in the ledger. This is the same class of bug the follow-up commit fixed for `download`/`view`/`access`, and it remains user-visible for the remaining allowed verbs. |

### Recovery Notes
First recoverable review artifact generated from codex analysis log.

Non-blocking Suggestions

| 1 | tests/unit/auto/test_ledger_grading_answerer.py:1200 | Nice-to-have tests | The regulated-topic allowlist test only checks not BLOCKER. Adding route/ledger-shape assertions for export and show would have caught the remaining mismatch with _is_product_behavior_question(). |

Design Notes

The overall gating approach is reasonable: explicit hard blockers first, then a second-pass guard for risky generic fallbacks. The remaining issue is contract drift between the safe-allowlist and the router heuristics; those two verb vocabularies need to stay aligned or the pipeline preserves safety while still dropping intended product semantics.


Reviewed by ouroboros-agent[bot] via Codex deep analysis

Bot follow-up on commit 7ed761c: ``_PRODUCT_SEMANTICS_REGULATED_VERBS_RE``
allows ``export`` and ``show`` (and the rest of the safe-allowlist set), but
the explicit alignment branch in ``_is_product_behavior_question()`` only
listed a subset (``download/allow/expose/render/enable/support/view/access``).
``export`` / ``show`` / ``display`` were still matched by an earlier broader
pattern in the same function, but the visible alignment was incomplete and
prone to silent drift.

The router branch added to bridge ``_is_safe_product_regulated_question()``
into ``_is_product_behavior_question()`` now lists every verb in the
allowlist:

    export | download | render | display | show | expose | support |
    enable | allow | view | access

This is a no-op for already-routed verbs but makes the allowlist↔router
contract explicit and grep-checkable, eliminating the drift surface flagged in
the bot's design note.

Test changes: ``test_auto_answerer_routes_safe_regulated_product_questions_to_product_behavior_answerer``
now exercises every verb in the allowlist (export, show, display, render,
expose, support, enable in addition to download/view/access). Each case
asserts (a) the gate passes (``answer.blocker is None``), (b) the router
takes the product-behavior path (``constraints.behavior.*`` and
``acceptance.behavior.*`` ledger keys, not the generic
``constraints.conservative_mvp`` from ``_default_answer()``), and (c) the
regulated noun is preserved in the answer text or ledger value.

340 unit tests passing in tests/unit/auto. Ruff clean.

Ref: ouroboros-agent[bot] BLOCKING on Q00#738 — ``answerer.py:548``.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
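
One way to keep the allowlist and router vocabularies aligned is to derive both from a single tuple and check every verb against the router pattern. This is a sketch of that idea, not the approach taken in the commit (which lists the verbs explicitly in the router branch); `ALLOWLIST_VERBS`, `ROUTER_ARM`, and `unrouted_verbs` are hypothetical names.

```python
import re

# Verb list copied from the commit message above.
ALLOWLIST_VERBS = (
    "export", "download", "render", "display", "show", "expose",
    "support", "enable", "allow", "view", "access",
)

# Router arm built from the same tuple, so the two cannot drift apart.
ROUTER_ARM = re.compile(
    r"\b(should|must|can|will|do|does|is|are)\b.+\b(be able to\s+)?("
    + "|".join(ALLOWLIST_VERBS)
    + r")\b"
)


def unrouted_verbs() -> list[str]:
    """Return allowlist verbs that the router arm fails to match."""
    return [
        verb
        for verb in ALLOWLIST_VERBS
        if not ROUTER_ARM.search(f"should the app {verb} pii reports?")
    ]
```

A unit test can then assert that `unrouted_verbs()` is empty, which fails as soon as a verb is added to one side only.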
Contributor

@ouroboros-agent ouroboros-agent Bot left a comment

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Reviewing commit 52ef5ab for PR #738

Review record: a69ddee7-e88d-4d17-adee-25cd24c2af55

Blocking Findings

| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:837 | BLOCKING | `_is_safe_product_regulated_question()` now suppresses the regulated-data blocker for any question that happens to mention a regulated noun plus a product verb, even when the router sends that question down a different path first. Because `answer()` checks `_is_actor_or_io_question()` and `_is_runtime_context_question()` before `_is_product_behavior_question()` (lines 118-123), prompts like `What inputs should the GDPR export take?` or `Which runtime should the GDPR export use?` will bypass the new blocker and get a generic IO/runtime answer instead of a blocker or a feature-specific product answer. That reintroduces the same “safe allowlist broader than routing” bug in another form, and the new tests only cover the default/product-behavior path. |

### Recovery Notes
First recoverable review artifact generated from codex analysis log.

Non-blocking Suggestions

None.

Design Notes

The risky-fallback gate is a reasonable place for this policy, but the new regulated-data allowlist is route-agnostic while the router is not. The bypass needs to be tied to the actual selected answer path, or the router needs to classify regulated product questions before the generic IO/runtime branches.


Reviewed by ouroboros-agent[bot] via Codex deep analysis

…es (Q00#640)

Bot follow-up on commit 52ef5ab: ``_is_safe_product_regulated_question()``
suppressed the risky-fallback blocker for any regulated-noun + product-verb
combination, but the router checked ``_is_actor_or_io_question`` and
``_is_runtime_context_question`` *before* ``_is_product_behavior_question``,
so prompts like

    What inputs should the GDPR export take?
    Which runtime should the GDPR export use?

got a generic IO/runtime answer (``ASSUMPTION`` / ``EXISTING_CONVENTION``)
and then bypassed the blocker via the safe-allowlist — silently dropping the
regulated-feature semantics from the ledger.

Fix: pull ``_is_safe_product_regulated_question`` to the top of the
content-routing chain so any regulated-product question — IO-shaped,
runtime-shaped, or product-shaped — is dispatched to
``_product_behavior_answer()``. The risky-fallback gate at the tail of
``answer()`` already consults the same predicate, so the router and the
safe-allowlist now share a single answer path.

Pure compliance phrasings remain blocked unchanged: they fail the
allowlist (no product-semantics verb) and fall through to the previous
branches, where the risky-fallback gate fires for any
``CONSERVATIVE_DEFAULT`` / ``ASSUMPTION`` / ``EXISTING_CONVENTION`` source.

New regression test ``test_auto_answerer_routes_regulated_product_questions_before_io_or_runtime``
locks in:
- Bot's example "What inputs should the GDPR export take?"
- Bot's example "Which runtime should the GDPR export use?"
- Two adjacent IO/runtime regulated-product variants

Each case asserts (a) not blocked, (b) answer comes from
``_product_behavior_answer()`` (subject-specific ``behavior.*`` ledger keys,
no IO/runtime keys), and (c) the regulated noun is preserved in the answer
text or ledger value.

341 unit tests passing in tests/unit/auto. Ruff clean. The two failures in
tests/unit/orchestrator/test_codex_cli_runtime.py are pre-existing and
unrelated to this PR's scope (verified by stashing the patch).

Ref: ouroboros-agent[bot] BLOCKING on Q00#738 — ``answerer.py:837``.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Contributor

@ouroboros-agent ouroboros-agent Bot left a comment

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Reviewing commit 44e405f for PR #738

Review record: a6525e8d-fc68-402c-859b-00e2586a68c4

Blocking Findings

| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:126 | BLOCKING | The new `_is_safe_product_regulated_question()` branch runs before `_is_runtime_context_question()` unconditionally, so regulated runtime questions that also contain a product noun now ignore concrete repo facts. For example, `answer("Which runtime should the GDPR export use?", ..., context=AutoAnswerContext(repo_facts={"runtime_context": ...}))` will always return `_product_behavior_answer()` instead of the REPO_FACT runtime answer, dropping grounded `runtime_context` evidence and replacing it with a generic behavior entry. That is a behavioral regression from the existing runtime contract and should be covered by a test. |

Recovery Notes

First recoverable review artifact generated from codex analysis log.

Non-blocking Suggestions

None.

Design Notes

The risky-fallback gate and the destructive-bulk qualifier tightening are reasonable, but the router is now conflating “regulated product question” with “always answer as product behavior.” That coupling is too strong; the safe-allowlist and the answer-route choice need to preserve higher-priority grounded facts such as runtime_context.


Reviewed by ouroboros-agent[bot] via Codex deep analysis

shaun0927 and others added 2 commits May 8, 2026 01:26
Q00#640)

Bot follow-up on commit 44e405f: the unconditional early route to
``_product_behavior_answer()`` for any question that
``_is_safe_product_regulated_question()`` recognised broke the existing
runtime contract. With a supplied ``runtime_context`` repo fact, a question
like ``Which runtime should the GDPR export use?`` should return a
``REPO_FACT`` runtime answer carrying the grounded evidence; the early
route replaced it with a generic product-behavior entry, dropping the
evidence.

Restructure: keep the original IO/runtime/product/default order so grounded
``REPO_FACT`` answers stay on the runtime path, then re-route to
``_product_behavior_answer()`` only when the chosen route produced a
non-grounded fallback (``ASSUMPTION`` / ``EXISTING_CONVENTION`` /
``CONSERVATIVE_DEFAULT``) AND the safe-allowlist recognises the question
as regulated-product. Concretely:

- ``Which runtime should the GDPR export use?`` + REPO_FACT → REPO_FACT
  runtime answer (preserved, with evidence).
- ``Which runtime should the GDPR export use?`` without repo facts →
  EXISTING_CONVENTION runtime fallback re-routed through
  ``_product_behavior_answer()`` so the regulated-feature semantics are
  preserved in ``constraints.behavior.*`` / ``acceptance.behavior.*``.
- ``What inputs should the GDPR export take?`` → IO ASSUMPTION
  re-routed to ``_product_behavior_answer()``.
- ``Should the app export PII reports?`` → already routes through
  ``_product_behavior_answer()`` (CONSERVATIVE_DEFAULT) and is left
  untouched by the reroute.

Pure compliance phrasings still block: they fail the allowlist (no
product-semantics verb), keep their CONSERVATIVE_DEFAULT/ASSUMPTION/
EXISTING_CONVENTION source, and the risky-fallback gate fires for them.
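
The restructured flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in ``answerer.py``: the ``Source`` enum, the ``Answer`` dataclass, the keyword checks, and the ``route`` / ``answer`` helpers are hypothetical stand-ins. Only the source names and the re-route condition (non-grounded fallback AND allowlist match) are taken from this commit message.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Source(Enum):
    REPO_FACT = "repo_fact"
    ASSUMPTION = "assumption"
    EXISTING_CONVENTION = "existing_convention"
    CONSERVATIVE_DEFAULT = "conservative_default"


# Non-grounded sources that are eligible for the re-route.
RISKY_FALLBACK_SOURCES = {
    Source.ASSUMPTION,
    Source.EXISTING_CONVENTION,
    Source.CONSERVATIVE_DEFAULT,
}


@dataclass
class Answer:
    route: str
    source: Source
    text: str


def is_safe_product_regulated(question: str) -> bool:
    """Hypothetical stand-in for the safe allowlist (regulated noun + product verb)."""
    q = question.lower()
    return "gdpr" in q and "export" in q


def route(question: str, repo_facts: dict) -> Answer:
    """Hypothetical first-pass router: IO/runtime shapes are checked before product."""
    q = question.lower()
    if "runtime" in q:
        fact = repo_facts.get("runtime_context")
        if fact is not None:
            return Answer("runtime", Source.REPO_FACT, fact)
        return Answer("runtime", Source.EXISTING_CONVENTION, "use the existing runtime")
    if "inputs" in q:
        return Answer("io", Source.ASSUMPTION, "generic inputs")
    return Answer("product", Source.CONSERVATIVE_DEFAULT, "feature-specific behavior")


def answer(question: str, repo_facts: Optional[dict] = None) -> Answer:
    first = route(question, repo_facts or {})
    # Re-route only when the chosen route produced a non-grounded fallback
    # and the allowlist recognises a regulated-product question.
    if (
        first.route != "product"
        and first.source in RISKY_FALLBACK_SOURCES
        and is_safe_product_regulated(question)
    ):
        return Answer("product", Source.CONSERVATIVE_DEFAULT, f"behavior for: {question}")
    return first
```

In this sketch a supplied ``runtime_context`` fact short-circuits the re-route because ``REPO_FACT`` is not in ``RISKY_FALLBACK_SOURCES``, which is the property the regression test below is meant to lock in.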

New regression test
``test_auto_answerer_preserves_repo_fact_for_regulated_runtime_question``
locks the REPO_FACT preservation contract: with a runtime_context repo
fact supplied, the answer must be REPO_FACT, must contain the supplied
runtime text, and must carry a runtime_context ledger entry with the
supplied evidence.

342 unit tests passing in tests/unit/auto. Ruff clean.

Ref: ouroboros-agent[bot] BLOCKING on Q00#738 — ``answerer.py:126``.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
The previous commit (046ce3d) accidentally included an empty local debug
artifact via ``git add -A``. Removing it; not part of the PR scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Contributor

@ouroboros-agent ouroboros-agent Bot left a comment

Review — ouroboros-agent[bot]

Verdict: REQUEST_CHANGES

Reviewing commit a13fd6c for PR #738

Review record: d5f1de56-27c8-458e-9db9-a539c5ac7d4b

Blocking Findings

| # | File:Line | Severity | Finding |
|---|-----------|----------|---------|
| 1 | src/ouroboros/auto/answerer.py:793 | BLOCKING | `_is_safe_product_regulated_question()` now treats any regulated noun plus verbs like `support`, `enable`, or `allow` as a safe product-behavior question. That lets broad compliance-scope prompts such as `Should the platform support HIPAA?` or `Should the app enable GDPR?` bypass the blocker and produce a generated MVP answer, even though those are still regulated-policy decisions rather than bounded feature semantics. The allowlist needs an additional constraint so it only passes concrete artifact/feature questions, not bare “support HIPAA/GDPR” requests. |
| 2 | src/ouroboros/auto/answerer.py:718 | BLOCKING | The destructive-operation exemption regex matches `from |

Recovery Notes

First recoverable review artifact generated from codex analysis log.

Non-blocking Suggestions

None.

Design Notes

The routing change is directionally right: it separates grounded answers from risky fallbacks and preserves regulated-product semantics better than the previous default-path behavior. The remaining issue is that both new regex allowlists are still too lexical, so a few high-risk meanings now slip through despite the added tests.


Reviewed by ouroboros-agent[bot] via Codex deep analysis

Two BLOCKING items raised by ouroboros-agent[bot] on commit a13fd6c:

(1) ``answerer.py:793`` — ``_is_safe_product_regulated_question()`` allowed
    "compliance-scope-as-feature-flag" prompts (``Should the platform support
    HIPAA?``, ``Should the app enable GDPR?``, ``Should the system allow
    PII?``) to bypass the blocker, even though those frame the entire
    regulatory regime as a binary toggle and are compliance-policy decisions.

    Fix: a new ``_BARE_COMPLIANCE_SCOPE_RE`` rejects ``support|enable|allow``
    + bare regulated noun followed by no qualifying feature noun (negative
    lookahead ``(?!\s+[a-z])``). Concrete-feature variants ("HIPAA audit
    logs", "GDPR consent banners", "PII redaction in exports", "GDPR data")
    have a qualifying noun and still pass through.

(2) ``answerer.py:718`` — the destructive-bulk artefact qualifier listed
    standalone ``doc`` and ``plan`` tokens. ``from the doc`` is rare phrasing
    (use ``docs`` / ``documentation``) and bare ``plan`` collides with
    database-side meanings (query plan, execution plan, db plan), so a
    question like "Which tables should we drop from the plan?" was being
    exempted as a process-artefact edit.

    Fix: drop ``doc`` and ``plan`` (singular) from the artefact list. The
    remaining unambiguous artefacts are ``release plan``, ``docs``,
    ``documentation``, ``roadmap``, ``backlog``, ``changelog``, ``spec``.
    All existing positive tests already use these unambiguous variants.
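
The bare-scope rejection in item (1) can be illustrated with a short sketch. The pattern below is an assumption reconstructed from the description above (the ``support|enable|allow`` alternation and the ``(?!\s+[a-z])`` lookahead are quoted from this message; the regulated-noun alternation and the helper name ``is_bare_compliance_scope`` are hypothetical), so the real ``_BARE_COMPLIANCE_SCOPE_RE`` may differ.

```python
import re

# Assumed shape: verb + regulated noun, with nothing word-like after the noun.
BARE_COMPLIANCE_SCOPE_RE = re.compile(
    r"\b(?:support|enable|allow)\s+"
    r"(?:pii|gdpr|hipaa|sox|pci[- ]dss)\b"
    r"(?!\s+[a-z])",  # a following word means a qualifying feature noun is present
    re.IGNORECASE,
)


def is_bare_compliance_scope(question: str) -> bool:
    """Return True when the question toggles a whole regulatory regime."""
    return BARE_COMPLIANCE_SCOPE_RE.search(question) is not None
```

Under these assumptions, ``Should the platform support HIPAA?`` is rejected because the noun is followed directly by ``?``, while ``Should the platform support HIPAA audit logs?`` passes because ``audit`` trips the negative lookahead.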

New regression coverage:
- ``test_auto_answerer_blocks_bare_compliance_scope_questions`` — locks
  rejection of bare ``support|enable|allow + regulated noun`` for all five
  regulated-noun variants.
- ``test_auto_answerer_allows_qualified_compliance_scope_questions`` —
  locks pass-through of ``support HIPAA audit logs`` /
  ``enable GDPR consent banners`` / ``allow PII redaction`` / etc.
- ``test_auto_answerer_blocks_destructive_bulk_with_ambiguous_singular_tokens``
  — locks blocker for ``from the plan`` / ``in the plan`` / ``from the doc``
  destructive prompts.
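
The singular-token cases locked in by the last test above can be illustrated with a simplified sketch. Everything here is hypothetical except the artefact list and the ``from the`` / ``in the`` phrasings, which come from this message; in particular, the sketch blocks on any destructive verb and does not model the target nouns or reverse phrasing that the real gate handles.

```python
import re

# Simplified: the real gate also looks at the target (table, schema, ...).
DESTRUCTIVE_VERB_RE = re.compile(
    r"\b(?:truncate|purge|wipe|drop|erase)\b", re.IGNORECASE
)

# Unambiguous process artefacts only; singular "doc" and "plan" are excluded.
ARTEFACT_QUALIFIER_RE = re.compile(
    r"\b(?:from|in)\s+the\s+"
    r"(?:release\s+plan|docs|documentation|roadmap|backlog|changelog|spec)\b",
    re.IGNORECASE,
)


def is_blocked_destructive(question: str) -> bool:
    """Block destructive prompts unless they clearly edit a process artefact."""
    if DESTRUCTIVE_VERB_RE.search(question) is None:
        return False
    return ARTEFACT_QUALIFIER_RE.search(question) is None
```

With the singular tokens removed, ``Which tables should we drop from the plan?`` no longer matches the exemption and stays blocked, while ``from the release plan`` and ``from the docs`` are still treated as artefact edits.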

345 unit tests passing in tests/unit/auto. Ruff clean.

Ref: ouroboros-agent[bot] BLOCKING on Q00#738 — ``answerer.py:793`` and
``answerer.py:718``.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Contributor

@ouroboros-agent ouroboros-agent Bot left a comment

Review — ouroboros-agent[bot]

Verdict: APPROVE

Reviewing commit bd8cd7c for PR #738

Review record: 65d72a77-6bfa-4d58-adee-5a1d07b88eab

Blocking Findings

No in-scope blocking findings remained after policy filtering.

Non-blocking Suggestions

None.

Design Notes

The router change is coherent: it preserves grounded REPO_FACT answers, re-routes only risky fallback sources, and the added tests cover the previously-missed regulated product-semantics paths plus the destructive-bulk qualifier edge cases. I did not find a remaining diff-scoped correctness issue that warrants blocking.

Recovery Notes

First recoverable review artifact generated from codex analysis log.


Reviewed by ouroboros-agent[bot] via Codex deep analysis

@shaun0927
Collaborator Author

@Q00 — re-review ping. ouroboros-agent[bot] APPROVED the latest head (bd8cd7c1) with no remaining blocking or non-blocking findings, all CI green, branch MERGEABLE / CLEAN. Summary of what landed and why this is safe to merge:

What this PR does

A redo of #695 (which you previously approved as "narrows automation for regulated/destructive topics, reduces scope risk"). Adds a second-stage risky-fallback gate in src/ouroboros/auto/answerer.py that blocks generative auto-answers for two classes of high-risk question:

  1. Regulated personal data / compliance regimes — PII, GDPR, HIPAA, SOX, PCI-DSS.
  2. Destructive bulk schema/data operations: truncate / purge / wipe / drop / erase against table / schema / database / record / row / audit log / index / migration.

The gate fires only on generative answer routes (CONSERVATIVE_DEFAULT / ASSUMPTION / EXISTING_CONVENTION); grounded REPO_FACT runtime answers are untouched. Meta routes (non-goal, verification, feature-acceptance) are decided earlier and never reach the gate. Closing criteria 2/2 for #640 (paired with the provenance meta-only PR for the first half).
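
A minimal sketch of the gate's shape, for readers who have not opened the diff. The source names, the regulated nouns, and the destructive verbs and targets are the ones listed above; the `Answer` dataclass, the regexes, and `apply_risky_fallback_gate` are hypothetical, and the sketch deliberately leaves out the safe allowlist, the artefact qualifier, and reverse phrasing.

```python
import re
from dataclasses import dataclass

RISKY_FALLBACK_SOURCES = frozenset(
    {"CONSERVATIVE_DEFAULT", "ASSUMPTION", "EXISTING_CONVENTION"}
)

REGULATED_RE = re.compile(r"\b(?:pii|gdpr|hipaa|sox|pci[- ]dss)\b", re.IGNORECASE)

# Forward phrasing only: destructive verb followed later by a data/schema target.
DESTRUCTIVE_BULK_RE = re.compile(
    r"\b(?:truncate|purge|wipe|drop|erase)\b.*\b"
    r"(?:tables?|schemas?|databases?|records?|rows?|audit\s+logs?"
    r"|index(?:es)?|migrations?)\b",
    re.IGNORECASE,
)


@dataclass(frozen=True)
class Answer:
    source: str
    text: str
    blocked: bool = False


def apply_risky_fallback_gate(question: str, answer: Answer) -> Answer:
    """Upgrade a generative fallback to a blocker for high-risk topics."""
    if answer.source not in RISKY_FALLBACK_SOURCES:
        return answer  # grounded answers such as REPO_FACT pass through
    if REGULATED_RE.search(question) or DESTRUCTIVE_BULK_RE.search(question):
        return Answer(source="BLOCKER", text="needs human input", blocked=True)
    return answer
```

The point of checking the source first is that a grounded answer is never second-guessed by topic keywords; only answers the system made up on its own are eligible for blocking.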

Bot iteration timeline (all resolved)

The bot raised eight successive blocking concerns; each was addressed by an in-scope follow-up commit and re-verified by the next bot pass:

| # | Commit | Bot blocker (location) | Fix |
|---|--------|------------------------|-----|
| 1 | 326563c | regulated-topic gate over-blocked product-semantics questions (answerer.py:716) | a98eb10: added `_is_safe_product_regulated_question()` allowlist |
| 2 | a98eb10 | safe regulated-product questions fell through to `_default_answer()` (answerer.py:741) | 99c4580: extended `_is_product_behavior_question()` to cover the same verbs |
| 3 | 99c4580 | bare documentation token bypassed destructive-bulk gate (answerer.py:688) | fc11788: phrase-scoped qualifier to from the … |
| 4 | fc11788 | in the … artefact phrasings still mis-blocked + adjectival compliance verbs over-blocked (answerer.py:698, :782) | e846a47: accept in the …, drop the always-on compliance-verb negative |
| 5 | e846a47 | mixed-intent store + display HIPAA slipped past allowlist (answerer.py:750) | 7ed761c: active-form compliance-verb precedence |
| 6 | 7ed761c | router/allowlist verb vocabularies could drift (answerer.py:548) | 52ef5ab: explicit alignment + route-shape assertions for every allowlist verb |
| 7 | 52ef5ab | regulated-product IO/runtime questions lost feature semantics (answerer.py:837) | 44e405f, 046ce3d6, a13fd6c: re-route only when route source is in `_RISKY_FALLBACK_SOURCES`, preserving grounded REPO_FACT |
| 8 | a13fd6c | bare-scope `support enable |

The bot's final approval explicitly acknowledges: "The router change is coherent: it preserves grounded REPO_FACT answers, re-routes only risky fallback sources, and the added tests cover the previously-missed regulated product-semantics paths plus the destructive-bulk qualifier edge cases."

Why it's safe to merge

  • Tests: 345 unit tests passing in tests/unit/auto, including ten new regression tests that lock in every bot finding above. Coverage spans the safe allowlist verb set (export/show/display/render/expose/support/enable/download/view/access), adjectival vs active compliance verbs, mixed-intent rejection, bare-scope rejection, REPO_FACT preservation, and phrase-scoped artefact qualifiers.
  • CI: Ruff, MyPy, Bridge TypeScript, and Python 3.12/3.13/3.14 test workflows are all green on bd8cd7c1.
  • Safety boundary is tighter than the approval state of #695 (feat(auto): block risky-fallback answers for regulated topics): every false positive and false negative the bot found now has explicit phrase-scoped logic and a regression test, so future regex churn fails loudly instead of silently.
  • No grounded-answer regression: a dedicated test (test_auto_answerer_preserves_repo_fact_for_regulated_runtime_question) asserts that supplying runtime_context repo facts still produces a REPO_FACT answer with the original evidence, so the existing runtime/IO contract is preserved.
  • Branch state: MERGEABLE / CLEAN, reviewDecision: APPROVED.

Ready for your final pass.

@shaun0927 shaun0927 merged commit 5f39067 into Q00:main May 7, 2026
6 checks passed