Fix/restore readable investigation output after masking 479 by Ade20boss · Pull Request #807 · Tracer-Cloud/opensre

Ade20boss · 2026-04-24T01:11:52Z

Fixes #479

Describe the changes you have made in this PR -

This PR closes the remaining gaps in the masking pipeline by ensuring that planning and diagnosis prompts never expose raw infrastructure identifiers to the LLM, completing the three requirements from the issue.

1. Planning prompts now use placeholders (app/nodes/plan_actions/node.py)

Previously, build_plan_actions() received unmasked input_data directly from state. A MaskingContext is now constructed from state and applied to mask input_data fields before they are passed into the planning LLM call. This is a no-op when masking is disabled.
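As a rough illustration of the step described above, the sketch below masks every field of an input object before it would reach a planning prompt, and leaves it untouched when masking is disabled. `ToyMaskingContext`, `ToyInput`, `mask_input`, the `ns-...` identifier shape, and the `<NS_n>` numbering are all assumptions made for this example; the project's real `MaskingContext` and `InvestigateInput` may look quite different.

```python
import re
from dataclasses import dataclass, field, fields, replace


@dataclass
class ToyMaskingContext:
    """Hypothetical stand-in for MaskingContext; not the project's real class."""

    enabled: bool = True
    mapping: dict[str, str] = field(default_factory=dict)  # placeholder -> original

    def mask_value(self, value):
        # No-op when masking is disabled or the value is not a string.
        if not self.enabled or not isinstance(value, str):
            return value
        # Assumed identifier shape for this sketch: names like "ns-payments".
        return re.sub(r"\bns-[a-z0-9-]+\b", self._placeholder, value)

    def _placeholder(self, match: re.Match) -> str:
        original = match.group(0)
        for placeholder, known in self.mapping.items():
            if known == original:
                return placeholder  # reuse the placeholder assigned earlier
        placeholder = f"<NS_{len(self.mapping) + 1}>"
        self.mapping[placeholder] = original
        return placeholder


@dataclass(frozen=True)
class ToyInput:
    """Hypothetical planning input; the field names are illustrative."""

    alert_name: str
    problem: str


def mask_input(data: ToyInput, ctx: ToyMaskingContext) -> ToyInput:
    # Mask every field before the data is handed to the planning prompt.
    masked = {f.name: ctx.mask_value(getattr(data, f.name)) for f in fields(data)}
    return replace(data, **masked)


raw = ToyInput(alert_name="HighLatency", problem="pods in ns-payments are slow")
masked_on = mask_input(raw, ToyMaskingContext(enabled=True))
masked_off = mask_input(raw, ToyMaskingContext(enabled=False))
```

With masking enabled, only the field containing an identifier changes; with masking disabled, the returned object compares equal to the input.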

2. Diagnosis prompts now use placeholders (app/nodes/root_cause_diagnosis/prompt_builder.py)

The evidence dict was already masked upstream in investigate/node.py, but problem_md, hypotheses, and raw_alert were pulled directly from state and injected into the prompt unmasked. A MaskingContext is now constructed at the top of build_diagnosis_prompt and applied to these fields before they reach the prompt string.
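A minimal sketch of the same idea for the diagnosis prompt follows. It assumes a toy context that is given the identifiers up front and a prompt builder that reads `problem_md`, `hypotheses`, and `raw_alert` from a plain dict; `ToyMaskingContext`, `build_prompt`, and the `<ID_n>` placeholders are hypothetical and only meant to show where masking sits relative to string interpolation.

```python
class ToyMaskingContext:
    """Hypothetical stand-in; the real MaskingContext API may differ."""

    def __init__(self, identifiers: list[str]):
        # placeholder -> original, numbered in the order given
        self.mapping = {f"<ID_{i}>": ident for i, ident in enumerate(identifiers, 1)}

    def mask(self, text: str) -> str:
        for placeholder, original in self.mapping.items():
            text = text.replace(original, placeholder)
        return text


def build_prompt(state: dict, ctx: ToyMaskingContext) -> str:
    # Mask each state-derived field before it is interpolated into the prompt.
    problem_md = ctx.mask(state.get("problem_md", ""))
    hypotheses = [ctx.mask(h) for h in state.get("hypotheses", [])]
    raw_alert = state.get("raw_alert")
    # Only the string branch is sketched; other alert shapes are out of scope.
    raw_alert_text = ctx.mask(raw_alert) if isinstance(raw_alert, str) else ""
    lines = [f"Problem: {problem_md}"]
    lines += [f"Hypothesis: {h}" for h in hypotheses]
    lines.append(f"Alert: {raw_alert_text}")
    return "\n".join(lines)


state = {
    "problem_md": "checkout-db is refusing connections",
    "hypotheses": ["checkout-db hit max_connections", "prod-east-1 network partition"],
    "raw_alert": "ALERT checkout-db down in prod-east-1",
}
ctx = ToyMaskingContext(["checkout-db", "prod-east-1"])
prompt = build_prompt(state, ctx)
```

The point of the sketch is ordering: every field is masked before the prompt string is assembled, so no raw identifier can slip in through a field that bypassed the masking layer.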

3. Final output already restores identifiers — no change needed

Both publish_findings/node.py and root_cause_diagnosis/node.py already call masking_ctx.unmask() before any user-facing output. The prefix collision bug (<NS_1> vs <NS_10>) referenced in #639 is also already fixed in context.py via longest-first sort in unmask().
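The longest-first ordering mentioned above can be illustrated with a small sketch. This is not the code in context.py; the function names and the mapping are hypothetical, and bare tokens (`NS_1`, `NS_10`) are used on the assumption that the collision arises when one placeholder is a literal prefix of another.

```python
def unmask_naive(text: str, mapping: dict[str, str]) -> str:
    # Replaces in insertion order, so a shorter token can clobber a longer one.
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


def unmask_longest_first(text: str, mapping: dict[str, str]) -> str:
    # Replace longer placeholders first so "NS_10" is consumed before "NS_1".
    for placeholder in sorted(mapping, key=len, reverse=True):
        text = text.replace(placeholder, mapping[placeholder])
    return text


mapping = {"NS_1": "payments", "NS_10": "billing"}
text = "NS_10 depends on NS_1"

naive = unmask_naive(text, mapping)
fixed = unmask_longest_first(text, mapping)
```

Here `naive` comes out as `"payments0 depends on payments"`, while `fixed` is `"billing depends on payments"`.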

Screenshots of the UI changes (If any) -

N/A — backend masking pipeline only, no UI changes.

Code Understanding and AI Usage

Did you use AI assistance (ChatGPT, Claude, Copilot, etc.) to write any part of this code?

No, I wrote all the code myself
[] Yes, I used AI assistance (continue below)

If you used AI assistance:

[] I have reviewed every single line of the AI-generated code
[] I can explain the purpose and logic of each function/component I added
[] I have tested edge cases and understand how the code handles them
[] I have modified the AI output to follow this project's coding standards and conventions

Explain your implementation approach:

The masking system already had a solid foundation: MaskingContext handles placeholder assignment, stability across nodes, and safe unmasking. The gap was that two upstream nodes were feeding raw state fields into LLM prompts without going through the masking layer first. The fix follows the same pattern already used in investigate/node.py: construct a MaskingContext from state, apply mask() or mask_value() to the relevant fields, and pass the masked version downstream. No new abstractions were introduced; the existing API was sufficient.
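To make "stability across nodes" concrete, here is a hedged sketch of two nodes that each rebuild a context from state and write the map back. `ToyMaskingContext`, its two-argument `mask`, `node_a`, `node_b`, and the dict-shaped state are assumptions for illustration; only the names `from_state` and `masking_map` are taken from this PR's discussion.

```python
class ToyMaskingContext:
    """Hypothetical stand-in; not the project's MaskingContext."""

    def __init__(self, mapping=None):
        self.mapping = dict(mapping or {})  # placeholder -> original

    @classmethod
    def from_state(cls, state: dict) -> "ToyMaskingContext":
        # Rebuild from whatever an earlier node persisted, if anything.
        return cls(state.get("masking_map"))

    def mask(self, text: str, identifier: str) -> str:
        # Simplification: the caller names the identifier to hide.
        placeholder = next(
            (p for p, original in self.mapping.items() if original == identifier),
            None,
        )
        if placeholder is None:
            placeholder = f"<ID_{len(self.mapping) + 1}>"
            self.mapping[placeholder] = identifier
        return text.replace(identifier, placeholder)


def node_a(state: dict) -> dict:
    ctx = ToyMaskingContext.from_state(state)
    prompt = ctx.mask("restart api-gateway", "api-gateway")
    # Persist the map so later nodes reuse the same placeholders.
    return {**state, "prompt_a": prompt, "masking_map": ctx.mapping}


def node_b(state: dict) -> dict:
    ctx = ToyMaskingContext.from_state(state)
    prompt = ctx.mask("api-gateway talks to redis-main", "redis-main")
    prompt = ctx.mask(prompt, "api-gateway")
    return {**state, "prompt_b": prompt, "masking_map": ctx.mapping}


state = node_b(node_a({}))
```

Because `node_a` writes `masking_map` back, `node_b` sees `api-gateway` as `<ID_1>` again rather than assigning it a fresh placeholder. If the map were not persisted, the second node would start numbering from scratch.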

Checklist before requesting a review

I have added proper PR title and linked to the issue
I have performed a self-review of my code
I can explain the purpose of every function, class, and logic block I added
I understand why my changes work and have tested them thoroughly
I have considered potential edge cases and how my code handles them
If it is a core feature, I have added thorough tests
My code follows the project's style guidelines and conventions

Note: Please check Allow edits from maintainers if you would like us to assist in the PR.

greptile-apps · 2026-04-24T01:15:09Z

Greptile Summary

This PR completes the masking pipeline by ensuring plan_actions and root_cause_diagnosis prompts never expose raw infrastructure identifiers to the LLM, following the same MaskingContext pattern already used in investigate/node.py. Both previously flagged blockers (the NameError from _masking_ctx being out of scope and the missing masking_map persistence in node_plan_actions) are now resolved.

Confidence Score: 5/5

Safe to merge — all previously flagged blockers are resolved; only cosmetic P2 findings remain.

Both P1 issues from the prior review round (NameError scope bug and missing masking_map persistence) are now correctly fixed. The only remaining findings are a stray extra-whitespace lint nit and a module-level import style suggestion, neither of which affects runtime behaviour.

No files require special attention.

Important Files Changed

Filename	Overview
app/nodes/plan_actions/node.py	Constructs a MaskingContext from state and masks all InvestigateInput fields before the planning LLM call; persists masking_map back to each return branch. Previously flagged issues (NameError, missing masking_map persistence) are resolved. Minor: in-function import style.
app/nodes/root_cause_diagnosis/prompt_builder.py	Adds MaskingContext to mask problem_md, hypotheses, and raw_alert before they reach the diagnosis prompt; passes masking_ctx into _build_evidence_sections. Previously flagged NameError is fixed. One stray multi-space on line 368 was introduced by the diff.
Makefile	Adds indentation inside ifeq/ifneq blocks for readability, adds a Windows fallback when the venv is absent, and simplifies the python3-check redirect to 2>/dev/null (safe since Windows is handled earlier). No functional regressions on Linux/macOS.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[InvestigationState] --> B[node_plan_actions]
    B --> B1[MaskingContext.from_state]
    B1 --> B2[mask_value each InvestigateInput field]
    B2 --> B3[build_plan_actions LLM call masked input_data]
    B3 --> B4[persist masking_map to state]

    A --> C[build_diagnosis_prompt]
    C --> C1[MaskingContext.from_state]
    C1 --> C2[mask problem_md]
    C1 --> C3[mask hypotheses list]
    C1 --> C4[_build_evidence_sections masking_ctx passed in]
    C4 --> C5[mask raw_alert str branch]
    C2 & C3 & C5 --> C6[LLM receives placeholder-only prompt]

    B4 & C6 --> D[publish_findings / rca node masking_ctx.unmask to user output]

Reviews (2): Last reviewed commit: "fix: address review comments — scope mas..."

greptile-apps · 2026-04-24T01:15:13Z

    raw_alert_text: str = ""
    if isinstance(raw_alert, str):
-        raw_alert_text = raw_alert
+        raw_alert_text = _masking_ctx.mask(raw_alert)


NameError: _masking_ctx is not in scope of _build_evidence_sections

_masking_ctx is a local variable defined in build_diagnosis_prompt (line 52), but _build_evidence_sections is a module-level function, not a nested closure. Python will raise NameError: name '_masking_ctx' is not defined whenever raw_alert is a str, crashing every diagnosis prompt build for string-type alerts.

The fix is to pass the MaskingContext as an argument into _build_evidence_sections:

    # In _build_evidence_sections signature:
    def _build_evidence_sections(
        state: InvestigationState,
        evidence: dict[str, Any],
        masking_ctx: "MaskingContext | None" = None,
    ) -> str:
        ...
        if isinstance(raw_alert, str):
            raw_alert_text = masking_ctx.mask(raw_alert) if masking_ctx else raw_alert

And in build_diagnosis_prompt:

    evidence_text = _build_evidence_sections(state, evidence, _masking_ctx)
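The scoping problem can be reproduced in isolation. The sketch below uses hypothetical names (`ToyMasker`, `sections_broken`, `sections_fixed`, `build_prompt`) rather than the project's functions; it only shows that a module-level function cannot see a caller's local variable, and that passing the context as an argument avoids the error.

```python
class ToyMasker:
    """Hypothetical masker used only for this demonstration."""

    def mask(self, text: str) -> str:
        return text.replace("db-prod", "<DB_1>")


def sections_broken(raw_alert: str) -> str:
    # _ctx is a local of the caller, not of this function or the module,
    # so the name lookup fails at call time.
    return _ctx.mask(raw_alert)


def sections_fixed(raw_alert: str, ctx=None) -> str:
    # The context arrives as an explicit argument; None means "leave unmasked".
    return ctx.mask(raw_alert) if ctx else raw_alert


def build_prompt(raw_alert: str, use_fixed: bool) -> str:
    _ctx = ToyMasker()  # local to build_prompt only
    if use_fixed:
        return sections_fixed(raw_alert, _ctx)
    return sections_broken(raw_alert)


try:
    build_prompt("db-prod is down", use_fixed=False)
    error_name = None
except NameError as exc:
    error_name = type(exc).__name__

fixed_output = build_prompt("db-prod is down", use_fixed=True)
unmasked_output = sections_fixed("db-prod is down")
```

The broken variant raises `NameError` only when the function is actually called, which is why such a bug can pass import-time checks and still fail on the first string-type alert.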

jlalbdalghnyalhlaly-afk

I don't know

Ade20boss · 2026-04-24T01:40:42Z

@greptile-apps re-trigger

Ade20boss added 2 commits April 24, 2026 01:37

fix: add Windows fallback when .venv not yet created (closes Tracer-C…

96ff3e9

…loud#803)

Restore readable investigation output after masking

a5832e2

greptile-apps Bot reviewed Apr 24, 2026

jlalbdalghnyalhlaly-afk reviewed Apr 24, 2026

fix: address review comments — scope masking_ctx and persist masking_map

c481208

style: apply ruff formatting

d1c8964

Ade20boss wants to merge 4 commits into Tracer-Cloud:main from Ade20boss:fix/restore-readable-investigation-output-after-masking-479

2 participants
