
Make mask_sensitive_data honor configured score_threshold#1967

Open
Kymi808 wants to merge 3 commits into NVIDIA-NeMo:develop from Kymi808:fix/mask-sensitive-data-honors-score-threshold

Conversation

Kymi808 commented Jun 1, 2026

Summary

mask_sensitive_data (nemoguardrails/library/sensitive_data_detection/actions.py) called _get_analyzer() with no arguments, so the analyzer was built (and @lru_cache-d) at the function's default 0.4 regardless of what the user configured in sensitive_data_detection.<source>.score_threshold (which itself defaults to 0.2 in SensitiveDataDetectionOptions).

_get_analyzer is the only place default_score_threshold is set on the underlying Presidio AnalyzerEngine, so the masking path always filtered at 0.4. With a configured threshold above 0.4, entities scoring between 0.4 and that threshold were still masked even though detect_sensitive_data (which already passes the configured threshold through, see line 121) reported them as non-sensitive. With a configured threshold below 0.4 (including the 0.2 default), entities scoring between the two values were detected but left unmasked. Worse, because _get_analyzer is @lru_cache-d, the masking analyzer stays at the wrong threshold for the lifetime of the process, even after the user later adjusts the configuration.

-    analyzer = _get_analyzer()
+    default_score_threshold = getattr(options, "score_threshold")
+    ...
+    analyzer = _get_analyzer(score_threshold=default_score_threshold)

Fix

Mirror detect_sensitive_data: read score_threshold from the options into default_score_threshold and pass it through to _get_analyzer. The behavior of detect_sensitive_data is unchanged.
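To make the mismatch concrete, here is a minimal sketch that uses hypothetical stand-ins rather than the real Presidio or NeMo Guardrails code. `FakeAnalyzer`, `get_analyzer`, `detect`, `mask_before_fix`, and `mask_after_fix` are illustrative names, and the `>=` filter is an assumption about how the threshold is applied.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class FakeAnalyzer:
    """Hypothetical stand-in for an analyzer with a score threshold."""

    default_score_threshold: float

    def analyze(self, scored_entities):
        # Keep only entity names whose score reaches the threshold (assumed >=).
        return [
            name
            for name, score in scored_entities
            if score >= self.default_score_threshold
        ]


@lru_cache
def get_analyzer(score_threshold: float = 0.4) -> FakeAnalyzer:
    """Stand-in for _get_analyzer: cached, with a 0.4 default."""
    return FakeAnalyzer(default_score_threshold=score_threshold)


def detect(entities, configured):
    # Detection path: forwards the configured threshold.
    return bool(get_analyzer(score_threshold=configured).analyze(entities))


def mask_before_fix(entities):
    # Masking path before the fix: no argument, so the 0.4 default applies.
    return get_analyzer().analyze(entities)


def mask_after_fix(entities, configured):
    # Masking path after the fix: same threshold as detection.
    return get_analyzer(score_threshold=configured).analyze(entities)


entities = [("PERSON", 0.6)]
configured = 0.85

print(detect(entities, configured))          # detection: nothing sensitive
print(mask_before_fix(entities))             # old masking: still masks PERSON
print(mask_after_fix(entities, configured))  # fixed masking agrees with detection
```

With a score of 0.6 and a configured threshold of 0.85, the stand-in detection path reports nothing while the pre-fix masking path still selects the entity. That is the inconsistency the fix removes.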

Test plan

Added a mock-based unit test in tests/test_sensitive_data_detection_unit.py that monkey-patches _get_analyzer (plus the optional Presidio anonymizer surface), invokes mask_sensitive_data with score_threshold: 0.85, and asserts _get_analyzer was called with score_threshold=0.85. Test lives in a separate file so it isn't caught by the existing module-level skip when the optional Presidio + spaCy stack isn't installed.

  • WITH this change: pytest tests/test_sensitive_data_detection_unit.py → 1 passed.
  • On main: the test FAILS — _get_analyzer is called with the default 0.4 instead of the configured 0.85.
  • ruff check and ruff format --check clean on both files (using the repo's pinned ruff==0.14.6).

Notes

  • All existing score_threshold tests in test_sensitive_data_detection.py use the detect sensitive data flow, so none pin the masking path's current (buggy) threshold behavior.
  • Commit is DCO sign-off-signed per CONTRIBUTING.md.

Summary by CodeRabbit

  • Bug Fixes

    • Sensitive data masking now properly respects user-configured score threshold settings instead of defaulting to internal values, ensuring behavior aligns with user expectations and configuration.
  • Tests

    • Added unit tests to verify that configured score thresholds are correctly propagated and applied during sensitive data masking operations.

`mask_sensitive_data` called `_get_analyzer()` with no arguments, so the
analyzer was built (and `lru_cache`-d) at the function's default 0.4
regardless of what the user configured in
`sensitive_data_detection.<source>.score_threshold` (which itself defaults to
0.2 in `SensitiveDataDetectionOptions`).

`_get_analyzer` is the only place `default_score_threshold` is set on the
Presidio `AnalyzerEngine`, so values between the configured threshold and 0.4
were still masked even though `detect_sensitive_data` (which already passes
the configured threshold through) reported them as non-sensitive. Worse,
because `_get_analyzer` is `@lru_cache`-d, the masking analyzer stays at the
wrong threshold for the lifetime of the process even if the user later
adjusts the configuration.

Mirror `detect_sensitive_data`: fetch `default_score_threshold` from the
options and pass it through. Adds a mock-based unit test that asserts the
configured value reaches `_get_analyzer` (lives in a new sibling test file so
it doesn't get caught by the existing module-level skip when the optional
Presidio + spaCy stack isn't installed).

Signed-off-by: Kymi808 <zeng.kyle13@gmail.com>

greptile-apps Bot commented Jun 1, 2026

Contributor

Greptile Summary

This PR fixes a threshold-consistency bug in mask_sensitive_data: it now reads score_threshold from the configured SensitiveDataDetectionOptions and passes it to _get_analyzer, exactly as detect_sensitive_data already did. Without this fix, the masking path always used the hard-coded default of 0.4, so detections between the user-configured threshold and 0.4 could be masked even though the detection path reported them clean.

  • actions.py: Two-line change: adds default_score_threshold = getattr(options, "score_threshold") and passes it to _get_analyzer(score_threshold=default_score_threshold), making the masking path symmetric with the detection path.
  • tests/test_sensitive_data_detection.py: Adds a monkeypatched regression test asserting _get_analyzer receives the configured threshold (0.85), plus a guarded import spacy needed by setup_module; however the @pytest.mark.skipif(not SDD_SETUP_PRESENT, ...) decorator is redundant with the module-level skip and prevents the test from running without the full SDD stack despite using stubs throughout.

Confidence Score: 5/5

Safe to merge — the fix is a minimal, targeted correction that makes mask_sensitive_data consistent with detect_sensitive_data.

The production change is two lines that mirror a pattern already proven correct in the adjacent detect_sensitive_data function. The lru_cache on _get_analyzer keys on score_threshold, so different configured values get independent cached instances — no regression there. The test change adds coverage and a harmless guarded import with no risk to existing tests.
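The caching remark can be illustrated with plain `functools.lru_cache`. In this sketch `get_analyzer` is a hypothetical stand-in and a dict plays the role of the analyzer object; only the standard-library behavior is being shown.

```python
from functools import lru_cache


@lru_cache
def get_analyzer(score_threshold: float = 0.4) -> dict:
    """Hypothetical stand-in: a dict plays the role of the analyzer."""
    return {"default_score_threshold": score_threshold}


a = get_analyzer(score_threshold=0.85)  # miss: builds a new instance
b = get_analyzer(score_threshold=0.85)  # hit: same cached instance
c = get_analyzer(score_threshold=0.2)   # miss: separate instance for 0.2

info = get_analyzer.cache_info()
print(a is b, a is c, info.hits, info.misses, info.currsize)
```

Each distinct threshold gets its own cache entry, so forwarding the configured value does not cause one configuration to reuse an analyzer built for another.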

No files require special attention; the test placement issue is a minor observation with no runtime impact.

Important Files Changed

| Filename | Overview |
| --- | --- |
| nemoguardrails/library/sensitive_data_detection/actions.py | Correct two-line fix: reads score_threshold from the configured options and forwards it to _get_analyzer, mirroring the already-correct detect_sensitive_data path. |
| tests/test_sensitive_data_detection.py | Adds a regression test with monkeypatching plus a guarded import spacy; the @pytest.mark.skipif on the new test is redundant with the module-level setup_module skip, and the test cannot run without SDD despite using full mocking, which is inconsistent with the PR description's intent. |

Sequence Diagram

sequenceDiagram
    participant Caller
    participant mask_sensitive_data
    participant _get_analyzer (lru_cache)
    participant AnalyzerEngine
    participant AnonymizerEngine

    Caller->>mask_sensitive_data: source, text, config
    mask_sensitive_data->>mask_sensitive_data: read options.score_threshold (e.g. 0.85)
    mask_sensitive_data->>_get_analyzer (lru_cache): score_threshold=0.85
    note right of _get_analyzer (lru_cache): Cache key = 0.85 — returns existing or new AnalyzerEngine
    _get_analyzer (lru_cache)-->>mask_sensitive_data: AnalyzerEngine(default_score_threshold=0.85)
    mask_sensitive_data->>AnalyzerEngine: analyze(text, entities, ad_hoc_recognizers)
    AnalyzerEngine-->>mask_sensitive_data: results (filtered at ≥0.85)
    mask_sensitive_data->>AnonymizerEngine: anonymize(text, results, operators)
    AnonymizerEngine-->>mask_sensitive_data: masked text
    mask_sensitive_data-->>Caller: masked text
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
tests/test_sensitive_data_detection.py:426-430
**Redundant `skipif` defeats the stated unit-test intent**

The PR description says the test was placed in a separate file so it could run without the SDD stack (monkeypatching stubs out all Presidio/spaCy calls). In practice the test landed in this file, whose `setup_module` already calls `pytest.skip("Required dependencies not found")` for the whole module when `SDD_SETUP_PRESENT` is `False`. The additional `@pytest.mark.skipif(not SDD_SETUP_PRESENT, ...)` is therefore redundant — and means the test will never execute in an environment where Presidio/spaCy are absent, even though the monkeypatching would make it fully self-contained. If the intent is a lightweight, dependency-free regression test, the test should live in a separate file (as originally described) and the `skipif` should be dropped.

Reviews (3): Last reviewed commit: "review: consolidate sensitive-data test,..."


coderabbitai Bot commented Jun 1, 2026

Contributor


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d6a3814d-fcb4-4aee-b349-523c57a9f53c

📥 Commits

Reviewing files that changed from the base of the PR and between 8082e74 and 5de943c.

📒 Files selected for processing (2)
  • nemoguardrails/library/sensitive_data_detection/actions.py
  • tests/test_sensitive_data_detection_unit.py

Walkthrough

This PR fixes a threshold mismatch in sensitive data detection masking. The mask_sensitive_data function now reads and uses the configured score_threshold from the sensitive data detection options, aligning it with the detection path behavior. A new unit test validates this configuration is correctly propagated.

Changes

Threshold Propagation Fix

| Layer / File(s) | Summary |
| --- | --- |
| Threshold configuration in mask_sensitive_data (nemoguardrails/library/sensitive_data_detection/actions.py) | mask_sensitive_data reads score_threshold from per-source SensitiveDataDetectionOptions and passes it to _get_analyzer, ensuring masking uses the same threshold as detection instead of the analyzer's default. |
| Unit test for threshold propagation (tests/test_sensitive_data_detection_unit.py) | New test monkeypatches _get_analyzer to capture the threshold argument, stubs anonymization components, and asserts that the configured score_threshold: 0.85 is correctly propagated to the analyzer. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 5 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely describes the main change: ensuring mask_sensitive_data respects the configured score_threshold setting. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Test Results For Major Changes | ✅ Passed | Bug fix with test coverage documented in PR description. Test passes with fix, fails without it; no regression risk for a focused configuration-alignment fix. |


Address the docstring-coverage warning on PR NVIDIA-NeMo#1967 — the stub classes,
their methods, the fake `_get_analyzer`, and the inner result container in
the regression test had no docstrings. Concise one-line docstrings now
describe each stand-in. No behavior change.

Signed-off-by: Kymi808 <zeng.kyle13@gmail.com>
@@ -0,0 +1,92 @@
# SPDX-FileCopyrightText: Copyright (c) 2023-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.


can you consolidate all your tests in tests/test_sensitive_data_detection.py ?

i think having a test file solely for mask_sensitive_data adds bloat to the codebase and can lead to potential confusion. i suggest after updating tests/test_sensitive_data_detection.py, you test your unit tests locally with SDD_SETUP_PRESENT dependencies installed

return text

analyzer = _get_analyzer()
# Honor the configured score_threshold, mirroring detect_sensitive_data;


i don't think that these comments are necessary esp. since they point out an issue that won't exist anymore once your code is merged. it's clear from the code that you are passing in a score_threshold to the function

Address @christinaexyou's review on PR NVIDIA-NeMo#1967:

- Move `test_mask_sensitive_data_honors_configured_score_threshold` from the
  standalone `tests/test_sensitive_data_detection_unit.py` into the existing
  `tests/test_sensitive_data_detection.py` (with the same `skipif` +
  `unit` + `asyncio` decorator stack used by the other tests). Delete the
  now-redundant unit file.

- Drop the four-line explanatory comment above the fixed `_get_analyzer`
  call; the code is self-explanatory once the bug is gone and the reasoning
  lives in the commit message.

Also add an `if SDD_SETUP_PRESENT: import spacy` shim so the module's
`setup_module` (which references `spacy.util.is_package` without ever
importing `spacy`) actually runs when the SDD extras are installed.
Previously every test in the file was silently skipped via the bare
`except Exception` fallthrough with "Unexpected error during setup: name
'spacy' is not defined" — making local verification impossible.

Verified locally with `pip install presidio-analyzer presidio-anonymizer
spacy && python -m spacy download en_core_web_lg`: the regression test
PASSES with the fix and FAILS (AssertionError) on a manually reverted
`actions.py`.

Signed-off-by: Kymi808 <zeng.kyle13@gmail.com>

Kymi808 commented Jun 1, 2026

Author

Thanks for the review @christinaexyou — both points addressed in c2b68bb:

  • test_mask_sensitive_data_honors_configured_score_threshold is now in tests/test_sensitive_data_detection.py with the same skipif(not SDD_SETUP_PRESENT) + unit + asyncio decorator stack the other tests use, and tests/test_sensitive_data_detection_unit.py is deleted.
  • Dropped the four-line explanatory comment above the fixed _get_analyzer call.

One small extra: while verifying locally as you suggested, every test in test_sensitive_data_detection.py was getting silently skipped with Unexpected error during setup: name 'spacy' is not defined. setup_module calls spacy.util.is_package(...), but spacy is never imported, and the bare except Exception was swallowing the NameError. Added a one-line if SDD_SETUP_PRESENT: import spacy shim above setup_module so the file actually runs when the SDD extras are installed. Happy to split that out into a separate PR if you'd prefer.

Verified with pip install presidio-analyzer presidio-anonymizer spacy && python -m spacy download en_core_web_lg: the regression test PASSES with the fix and FAILS (AssertionError: assert 0.4 == 0.85) on a manually reverted actions.py.
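The silent-skip mechanism described above can be reproduced in isolation. The sketch below is a hypothetical reduction, not the repo's `setup_module`: `setup_like_check` is an illustrative name, and it returns the message instead of calling `pytest.skip` so that it runs without pytest.

```python
def setup_like_check(model_name: str) -> str:
    """Hypothetical reduction of a setup hook with a broad except clause."""
    try:
        # 'spacy' is never imported in this module, so this line raises
        # NameError before any spaCy code is reached.
        spacy.util.is_package(model_name)
        return "model check ran"
    except Exception as exc:
        # NameError is a subclass of Exception, so the missing import is
        # reported as a generic setup problem instead of surfacing as a bug.
        return f"Unexpected error during setup: {exc}"


message = setup_like_check("en_core_web_lg")
print(message)
```

Catching a narrower exception type, or importing inside the `try` and handling `ImportError` explicitly, would likely have surfaced the missing import sooner.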


codecov Bot commented Jun 1, 2026


Codecov Report

❌ Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...drails/library/sensitive_data_detection/actions.py | 0.00% | 2 Missing ⚠️ |



Kymi808 commented Jun 2, 2026

Author

Quick note on the codecov 0% patch report: it reflects the existing CI tier pattern, not a regression introduced by this PR.

All 8 tests in test_sensitive_data_detection.py (the 7 pre-existing ones plus the consolidated regression test) are skipped in pr-tests-matrix because presidio_analyzer / presidio_anonymizer / spacy aren't installed in that tier, so SDD_SETUP_PRESENT is False and setup_module calls pytest.skip. From the run-26788987717 log:

test_masking_input_output                                     SKIPPED
test_detection_input_output                                   SKIPPED
test_masking_retrieval                                        SKIPPED
test_score_threshold                                          SKIPPED
test_invalid_score_threshold                                  SKIPPED
test_invalid_score_threshold_chat_message                     SKIPPED
test_high_score_threshold_disables_rails                      SKIPPED
test_mask_sensitive_data_honors_configured_score_threshold    SKIPPED  ← this PR

The codecov/patch status check is passing — the bot comment is informational. Per your suggestion I verified locally with the SDD extras installed (presidio-analyzer, presidio-anonymizer, spacy, en_core_web_lg): the regression test passes with the fix and fails (AssertionError: assert 0.4 == 0.85) on a manually reverted actions.py.

Happy to split out the setup_module cleanup or refactor it into per-test skipif markers (so the tests actually run in the default CI tier and codecov can measure them) as a separate PR if that would be useful — kept it out of this one to stay in scope.

pytest.skip("Required dependencies not found")

try:
# check if the model is already downloaded


can you add "import spacy" here and remove lines 36-38 ? otherwise, we check SDD_SETUP_PRESENT twice

chat << "Hi! My name is John as well."


@pytest.mark.skipif(not SDD_SETUP_PRESENT, reason="Sensitive Data Detection setup is not present.")


can you refactor your test to match the formatting of the existing ones ? i would use test_high_score_threshold_disables_rails() as a reference and create an example config where the score_threshold is 1.0. we know that a score_threshold of 1.0 would NOT mask anything so if the user message with a PERSON entity is unaltered then we know that mask_sensitive_data honors a configured score threshold
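The behavioral check suggested here could look roughly like the following sketch. It does not use the repo's test helpers or Presidio: `mask` is a hypothetical stand-in, the detection tuple and its 0.85 score are invented for illustration, and the `>=` comparison is an assumption (under it, a detection scored exactly 1.0 would still be masked).

```python
def mask(text, detections, score_threshold):
    """Hypothetical masker: replaces each detection that reaches the threshold.

    detections is a list of (substring, entity_type, score) tuples.
    """
    for substring, entity_type, score in detections:
        if score >= score_threshold:
            text = text.replace(substring, f"<{entity_type}>")
    return text


message = "Hi! My name is John as well."
detections = [("John", "PERSON", 0.85)]  # illustrative score, not a real result

# Threshold of 1.0: the 0.85 detection is filtered out, text is unchanged.
unaltered = mask(message, detections, score_threshold=1.0)

# Threshold of 0.4: the same detection passes and the name is masked.
masked = mask(message, detections, score_threshold=0.4)

print(unaltered)
print(masked)
```

An unchanged message at a threshold of 1.0 is evidence that masking uses the configured value, since a masking path stuck at 0.4 would have replaced the name.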

@pytest.mark.unit
@pytest.mark.asyncio
async def test_mask_sensitive_data_honors_configured_score_threshold(monkeypatch):
"""Regression: ``mask_sensitive_data`` must honor ``options.score_threshold``.


also, no need for this comment per my previous feedback
