fix(guardrails): redact union of overlapping spans, never leak sensitive bookends by mayankbharati-ops · Pull Request #780 · Tracer-Cloud/opensre

mayankbharati-ops · 2026-04-23T05:15:59Z

Summary

GuardrailEngine.apply walked sorted redact matches right-to-left with a single seen_end cursor and skipped any match whose end exceeded that cursor. When a wider match overlapped — or fully contained — an already-redacted narrower match, the wider match was silently dropped and its prefix / suffix survived in the output.

Concretely, with two rules (wide: super_secret_token_value, inner: secret_token):

	Output
Before (`main`)	`\"data super_[REDACTED:short]_value end\"`
After (this PR)	`\"data [REDACTED:long] end\"`

The super_ and _value fragments of the wider keyword — arguably the sensitive bookends the wider rule exists to cover — leaked into the output. PR #520's earlier fix handled only the same-start subcase (via the secondary sort key); matches with different start offsets that fully contained a narrower match still slipped through.

This also affects the shipped starter rules. With _STARTER_CONFIG in app/guardrails/cli.py, text like config: api_key=AKIAIOSFODNN7EXAMPLE triggers both aws_access_key ([16:36]) and generic_api_token ([8:36]). Before this PR the prefix api_key= leaked; after, the whole span is redacted. Same for aws_secret_access_key=AKIA...xxxxx... where the label and the 40-char tail both leaked pre-fix.

Fix

Replace the single-cursor walk with a proper interval merge:

Sort redact matches by (start ASC, -end) so same-start ties put the widest match first — keeps the "longest-keyword-wins" behavior PR fix: overlapping keyword redaction and hardcoded investigation loop limit #520 restored.
Sweep left-to-right, merging any match whose start falls before the current interval's end. The representative rule name for the merged interval is the one with the largest original width.
Apply replacements right-to-left over the merged intervals so string indices stay valid as each redaction resizes the output.

Adjacent matches (A.end == B.start) intentionally do not merge — they touch but do not overlap. Disjoint matches produce independent redactions with intervening text preserved verbatim. Match-level audit logging is unchanged — every ScanMatch still produces one audit entry, even when multiple matches collapse into a single output redaction, so reviewers tracing audit → output can account for every rule that fired.

E2E validation — full synthetic suites, real Anthropic LLM

HEALTHY_SHORT_CIRCUIT=true, mocked backends, real claude-sonnet-4-6 + claude-haiku-4-5 on the PR #780 branch.

Full EKS suite — 14 scenarios

Result	Count	Details
PASS	11	000-healthy, 001-010 (all non-`alerting`-state edge-case scenarios)
FAIL	3	011, 012, 013 — all `state: alerting`, LLM categorization drift on healthy-recovery cases (pre-existing; same failures appear on `main` / PR #777 verification)

Full RDS suite — 15 scenarios

Result	Count	Details
PASS	11	000-healthy, 001, 003-006, 008-011, 014
FAIL	4	002, 007, 012, 013 — all `state: alerting`, LLM keyword/categorization drift (pre-existing; identical failures to PR #777 verification run)

Confirmed every failure is in state: alerting (programmatic check on each failing alert.json). The guardrail engine is used for prompt redaction; the synthetic fixtures don't contain overlapping-rule secrets, so the failures above are in territory this PR does not touch.

Unit coverage — 46 tests in `tests/test_guardrails/test_engine.py`

Test area	Count
Pre-existing suite	36
New regression tests	10
- `test_contained_span_redacts_union_no_leak` (the core regression)
- `test_contained_span_uses_longest_rule_name`
- `test_partial_overlap_redacts_union`
- `test_disjoint_matches_stay_separate`
- `test_adjacent_matches_not_merged`
- `test_three_way_chain_of_overlaps_redacts_single_union`
- `test_contained_pattern_with_wider_keyword_preserves_wider_name`
- `test_real_world_api_key_and_aws_access_key_overlap` (uses the exact `_STARTER_CONFIG` patterns)
- `test_real_world_aws_secret_key_contains_aws_access_key` (uses the exact `_STARTER_CONFIG` patterns)
- `test_audit_logger_records_every_match_even_when_merged` (audit-under-merge invariant)

Existing coverage preserved, including:

test_overlapping_keyword_redaction_prefers_longest — same-start case (from fix: overlapping keyword redaction and hardcoded investigation loop limit #520)
test_overlapping_keyword_same_rule_redaction — same-rule overlap
test_multiple_rules_on_same_span — audit + redact on same span

Verification

CI (Python 3.13): test (ubuntu-latest) PASS, typecheck PASS, quality PASS, CodeQL PASS, Analyze (python) PASS
Local pytest tests/ (Python 3.14, no coverage): 2777 pass, 1 skipped, 0 failures (was 2767 on main; +10 new)
ruff check / ruff format --check: clean on touched files
mypy app/guardrails/: clean on touched module
Full EKS + RDS synthetic suites run against real Anthropic LLM on this branch: no new failures introduced vs. main

Scope

One source file (app/guardrails/engine.py, ~40 lines), one test file. No public API changes. No behavior change for non-overlapping matches, audit-only rules, or blocked rules. Directly extends #520, whose same-start case is preserved verbatim.

Security note

The bug class here is sensitive-data disclosure: anywhere guardrail rules are applied to outputs shown to users, logged, or exported, two overlapping rules covering the same secret could each partially redact and leave fragments in the wild. The shipped _STARTER_CONFIG demonstrates this concretely — api_key=AKIA... and aws_secret_access_key=... values containing an access key both produced partial redactions on main. This PR closes the gap.

…ive bookends ``GuardrailEngine.apply`` walked the sorted redact matches right-to-left with a single ``seen_end`` cursor and skipped any match whose end exceeded the cursor. When a wider match overlapped or contained an already-redacted narrower match, the wider match was silently dropped — leaving its prefix and suffix in the output. Concretely, with rules matching ``super_secret_token_value`` (wide) and ``secret_token`` (contained): input: "data super_secret_token_value end" before: "data super_[REDACTED:short]_value end" ← super_ and _value leak after: "data [REDACTED:long] end" The prior fix in Tracer-Cloud#520 only handled the same-start subcase by changing the secondary sort key; matches that started at different offsets but fully contained a narrower match still slipped through. Replace the single-cursor walk with a proper merge: 1. Sort redact matches by ``(start ASC, -end)`` so ties at the same offset keep the longest-keyword-wins behavior Tracer-Cloud#520 restored. 2. Sweep left-to-right, merging any match whose ``start`` falls before the current interval's ``end``. The representative rule for the merged interval is whichever contributing match had the largest original width, so the output reflects the most-specific rule that matched anywhere in the merged span. 3. Apply replacements right-to-left over the merged intervals so string indices stay valid as each redaction resizes the output. Adjacent matches (``A.end == B.start``) are intentionally not merged — they don't overlap. Disjoint matches produce independent redactions with intervening text preserved verbatim. Tests added in ``tests/test_guardrails/test_engine.py``: - contained-span union redaction (the regression from the bug report) - longest-rule-name semantics on a contained span - partial (non-containing) overlap produces one union redaction - disjoint matches stay separate, middle text preserved - adjacent matches stay separate (touching at one offset, not overlapping) - three-way transitive overlap chain collapses to one span - regex pattern + keyword on the same region: wider-wins applies across match kinds Existing coverage preserved: - ``test_overlapping_keyword_redaction_prefers_longest`` — same-start case - ``test_overlapping_keyword_same_rule_redaction`` — same-rule overlap - ``test_multiple_rules_on_same_span`` — audit + redact on same span Verification: - ``pytest``: 2774 pass, 1 skipped, 0 failures (was 2767 on main; +7 new) - ``ruff check`` / ``ruff format --check``: clean - ``mypy app/guardrails/``: clean on touched module - Full ``tests/test_guardrails/test_engine.py`` suite: 43 pass, 0 fail

greptile-apps · 2026-04-23T05:19:53Z

Greptile Summary

This PR fixes a span-leak bug in GuardrailEngine._redact where overlapping redaction rules (e.g. a wide keyword fully containing a narrower keyword) would silently drop the wider rule's match, leaving its prefix and suffix unredacted in the output. The fix replaces the old single-cursor right-to-left walk with a proper interval merge: sort → left-to-right sweep to merge overlapping spans (tracking the largest individual match width to select the representative rule name) → right-to-left application of replacements to preserve string indices.

Key changes:

apply delegates to a new _redact(text, matches) method, improving testability and separation of concerns.
The sort key changes from reverse=True on (start, end) to (start ASC, -end) — preserving the "widest-match-first at same start" behavior from PR fix: overlapping keyword redaction and hardcoded investigation loop limit #520 while enabling a correct forward sweep.
The seen_end cursor and skip logic are replaced with an O(n) interval merge that handles all three overlap classes: containment, partial overlap, and the transitive chain A∩B, B∩C, A⊥C.
7 new regression tests cover all relevant sub-cases; the 36 existing guardrail tests are unaffected.

Confidence Score: 5/5

Safe to merge — the interval merge algorithm is correct, well-documented, and backed by 7 targeted regression tests covering all overlap sub-cases, with zero regressions in the existing suite.

The core fix (left-to-right interval merge replacing the buggy seen_end cursor) is algorithmically correct: overlapping spans are collapsed, the widest individual match wins the representative rule name, adjacent spans remain separate, and right-to-left application preserves string indices. All edge cases (containment, partial overlap, transitive chain, disjoint, adjacent, regex+keyword mix) have dedicated tests. Code follows project style conventions. The only finding is a non-blocking P2 style suggestion to use a NamedTuple instead of a raw 4-tuple for the merged interval list.

No files require special attention.

Important Files Changed

Filename	Overview
app/guardrails/engine.py	Extracts `_redact` into a dedicated method that performs correct interval merging; the old single-cursor right-to-left walk is fully replaced. Algorithm is sound and well-documented.
tests/test_guardrails/test_engine.py	Adds 7 targeted regression tests covering contained spans, partial overlap, disjoint, adjacent, three-way chain, and pattern-vs-keyword cases. All new tests are well-structured and use the existing `_rule` helper correctly.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["apply(text)"] --> B["scan(text) → ScanResult"]
    B --> C{any matches?}
    C -- No --> D[return text unchanged]
    C -- Yes --> E[audit each match]
    E --> F{blocked?}
    F -- Yes --> G[raise GuardrailBlockedError]
    F -- No --> H["_redact(text, matches)"]
    H --> I["Filter: action == REDACT\nSort by (start ASC, -end)"]
    I --> J["Left-to-right sweep\nmerge overlapping intervals\ntrack widest individual match\nfor rule-name selection"]
    J --> K["Right-to-left application\nof replacements over\nmerged intervals"]
    K --> L[return redacted text]
    style G fill:#f66,color:#fff
    style L fill:#6c6,color:#fff

_{Reviews (1): Last reviewed commit: "fix(guardrails): redact union of overlap..." | Re-trigger Greptile}

greptile-apps · 2026-04-23T05:19:57Z

+        merged: list[tuple[int, int, str, int]] = []
        for match in redact_matches:
-            if match.end > seen_end:
-                continue
-            replacement = self._get_replacement(match.rule_name)
-            redacted = redacted[: match.start] + replacement + redacted[match.end :]
-            seen_end = match.start
+            width = match.end - match.start
+            if merged and match.start < merged[-1][1]:
+                prev_start, prev_end, prev_rule, prev_width = merged[-1]
+                new_end = max(prev_end, match.end)
+                if width > prev_width:
+                    merged[-1] = (prev_start, new_end, match.rule_name, width)
+                else:
+                    merged[-1] = (prev_start, new_end, prev_rule, prev_width)
+            else:
+                merged.append((match.start, match.end, match.rule_name, width))


Consider a NamedTuple for the merged interval for readability

The inner loop reads and writes the merged list by positional tuple index (merged[-1][1], merged[-1][3], etc.). The positions are correct, but the tuple layout (start, end, rule_name, width) is entirely implicit — a reader must mentally map each index to its meaning.

A lightweight NamedTuple would make accesses self-documenting with zero runtime overhead:

from typing import NamedTuple class _MergedInterval(NamedTuple): start: int end: int rule_name: str width: int

Then the sweep body becomes:

merged: list[_MergedInterval] = [] for match in redact_matches: width = match.end - match.start if merged and match.start < merged[-1].end: prev = merged[-1] new_end = max(prev.end, match.end) if width > prev.width: merged[-1] = _MergedInterval(prev.start, new_end, match.rule_name, width) else: merged[-1] = _MergedInterval(prev.start, new_end, prev.rule_name, prev.width) else: merged.append(_MergedInterval(match.start, match.end, match.rule_name, width))

This is a non-blocking suggestion; the current code is functionally correct.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Extend the overlap regression tests with two cases grounded in the exact patterns shipped in _STARTER_CONFIG (app/guardrails/cli.py), and with an audit-behavior assertion that demonstrates the match-level audit record remains complete even when the output-level redactions are merged: - test_real_world_api_key_and_aws_access_key_overlap: config: api_key=AKIAIOSFODNN7EXAMPLE aws_access_key pattern matches the [16:36] span; generic_api_token pattern matches the [8:36] span (contains aws_access_key). Pre-fix output leaked "api_key=". Merged output: "config: [REDACTED:generic_api_token]". - test_real_world_aws_secret_key_contains_aws_access_key: export aws_secret_access_key=AKIA<16>abcdefghijklmnopqrst aws_access_key matches [29:49]; aws_secret_key matches [7:69] and contains it. Pre-fix output leaked the label and the 40-char tail. Merged output: "export [REDACTED:aws_secret_key]". - test_audit_logger_records_every_match_even_when_merged: Confirms the AuditLogger still emits one entry per ScanMatch even when multiple matches merge into a single output redaction — a reviewer tracing audit -> output can still account for every rule that fired. Uses a real AuditLogger with a tmp_path JSONL. Verification: - pytest tests/test_guardrails/test_engine.py: 46 pass (was 43; +3) - pytest tests/: 2777 pass, 1 skipped, 0 failures - ruff check + format: clean

Closes the gap between unit-level ``GuardrailEngine.apply`` tests and the four call sites that invoke it in the pipeline: - ``app/services/llm_client.py::LLMClient.invoke`` (Anthropic messages + system) - ``app/services/llm_client.py::BedrockLLMClient.invoke`` (same shape as Anthropic) - ``app/services/llm_client.py::OpenAILLMClient.invoke`` (OpenAI chat.completions) - ``app/nodes/chat.py::_apply_guardrails_to_messages`` (LangGraph chat node) Five new parametrized integration tests load a ruleset modeled on the shipped ``_STARTER_CONFIG`` — ``aws_access_key`` is a strict substring of ``generic_api_token`` when the token value is itself an AWS key — and assert that the downstream payload (captured by a stub that replaces the real API client) contains no fragment of either the label prefix or the key value: - ``test_anthropic_client_sends_merged_redaction``: ``LLMClient.invoke("Debug dump: api_key=AKIA... from config.yml")`` is captured at ``Anthropic.messages.create`` and asserted to carry a single ``[REDACTED:generic_api_token]`` span with no ``api_key=`` or ``AKIA`` fragment. - ``test_anthropic_system_prompt_also_redacted``: Same shape, but asserts the ``system`` kwarg (distinct code path from per-message ``content``) gets the merged treatment. - ``test_openai_client_sends_merged_redaction``: Same shape for ``OpenAILLMClient.invoke`` -> ``chat.completions.create``. - ``test_chat_node_emits_merged_redaction``: ``_apply_guardrails_to_messages`` on a LangGraph-style message list, asserts original is untouched (defensive copy) and the returned message has merged redaction. - ``test_contained_real_secret_fully_redacted_in_pipeline``: End-to-end regression guard. Asserts none of ``api_key=``, ``AKIA``, ``IOSFODNN``, ``7EXAMPLE`` appear anywhere in the downstream payload. Pre-fix main leaks ``api_key=`` and the surrounding characters. Verification: - ``pytest tests/``: 2782 pass (was 2777; +5 integration) - ``pytest tests/test_guardrails/``: 59 pass (46 engine + 13 integration) - ``ruff check`` / ``ruff format --check``: clean - ``mypy app/guardrails/``: clean on touched module

greptile-apps Bot reviewed Apr 23, 2026

View reviewed changes

mayankbharati-ops added 2 commits April 23, 2026 16:50

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(guardrails): redact union of overlapping spans, never leak sensitive bookends#780

fix(guardrails): redact union of overlapping spans, never leak sensitive bookends#780
mayankbharati-ops wants to merge 3 commits intoTracer-Cloud:mainfrom
mayankbharati-ops:fix/guardrail-overlapping-redaction-union

mayankbharati-ops commented Apr 23, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented Apr 23, 2026

Uh oh!

greptile-apps Bot Apr 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

mayankbharati-ops commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Fix

E2E validation — full synthetic suites, real Anthropic LLM

Full EKS suite — 14 scenarios

Full RDS suite — 15 scenarios

Unit coverage — 46 tests in tests/test_guardrails/test_engine.py

Verification

Scope

Security note

Uh oh!

greptile-apps Bot commented Apr 23, 2026

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

Uh oh!

greptile-apps Bot Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

mayankbharati-ops commented Apr 23, 2026 •

edited

Loading

Unit coverage — 46 tests in `tests/test_guardrails/test_engine.py`