Make mask_sensitive_data honor configured score_threshold by Kymi808 · Pull Request #1967 · NVIDIA-NeMo/Guardrails

Kymi808 · 2026-06-01T19:05:56Z

Summary

mask_sensitive_data (nemoguardrails/library/sensitive_data_detection/actions.py) called _get_analyzer() with no arguments, so the analyzer was built (and @lru_cache-d) at the function's default 0.4 regardless of what the user configured in sensitive_data_detection.<source>.score_threshold (which itself defaults to 0.2 in SensitiveDataDetectionOptions).

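_get_analyzer is the only place default_score_threshold is set on the underlying Presidio AnalyzerEngine, so masking always filtered at 0.4 while detect_sensitive_data (which already passes the configured threshold through, see line 121) filtered at the configured value. With a configured threshold above 0.4, detections scoring between 0.4 and that threshold were still masked even though detect_sensitive_data reported them as non-sensitive; with a configured threshold below 0.4 (including the 0.2 default), detections scoring between that threshold and 0.4 were flagged by detection but left unmasked. Worse, because _get_analyzer is @lru_cache-d, the masking analyzer stays at the wrong threshold for the lifetime of the process, even after the user later adjusts the configuration.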

-    analyzer = _get_analyzer()
+    default_score_threshold = getattr(options, "score_threshold")
+    ...
+    analyzer = _get_analyzer(score_threshold=default_score_threshold)

Fix

Mirror detect_sensitive_data: read score_threshold from the options into default_score_threshold and pass it through to _get_analyzer. The behavior of detect_sensitive_data is unchanged.
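
A minimal sketch of the failure mode and the fix, assuming a stand-in `FakeAnalyzer` in place of Presidio's `AnalyzerEngine`. The names `FakeAnalyzer`, `Options`, `mask_before`, and `mask_after` are hypothetical; only the call pattern around the cached factory is meant to mirror the change.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class FakeAnalyzer:
    """Stand-in for an analyzer; it only records its threshold."""

    default_score_threshold: float


@dataclass
class Options:
    """Hypothetical per-source options using the 0.2 default named above."""

    score_threshold: float = 0.2


@lru_cache
def _get_analyzer(score_threshold: float = 0.4) -> FakeAnalyzer:
    # The threshold is fixed at construction time.
    return FakeAnalyzer(default_score_threshold=score_threshold)


def mask_before(options: Options) -> float:
    # Buggy pattern: the options are ignored, so the factory default wins.
    return _get_analyzer().default_score_threshold


def mask_after(options: Options) -> float:
    # Fixed pattern: forward the configured threshold to the factory.
    analyzer = _get_analyzer(score_threshold=options.score_threshold)
    return analyzer.default_score_threshold


configured = Options(score_threshold=0.85)
print(mask_before(configured))  # 0.4
print(mask_after(configured))   # 0.85
```

In this sketch the buggy path returns 0.4 for any `Options` value, which is the mismatch the fix removes.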

Test plan

Added a mock-based unit test in tests/test_sensitive_data_detection_unit.py that monkey-patches _get_analyzer (plus the optional Presidio anonymizer surface), invokes mask_sensitive_data with score_threshold: 0.85, and asserts _get_analyzer was called with score_threshold=0.85. Test lives in a separate file so it isn't caught by the existing module-level skip when the optional Presidio + spaCy stack isn't installed.

With this change: pytest tests/test_sensitive_data_detection_unit.py → 1 passed.
On main: the test fails because _get_analyzer is called without the configured 0.85 and falls back to the default 0.4.
ruff check and ruff format --check clean on both files (using the repo's pinned ruff==0.14.6).

Notes

All existing score_threshold tests in test_sensitive_data_detection.py use the detect sensitive data flow, so none pin the masking path's current (buggy) threshold behavior.
Commit carries a DCO sign-off per CONTRIBUTING.md.

Summary by CodeRabbit

Bug Fixes
- Sensitive data masking now properly respects user-configured score threshold settings instead of defaulting to internal values, ensuring behavior aligns with user expectations and configuration.
Tests
- Added unit tests to verify that configured score thresholds are correctly propagated and applied during sensitive data masking operations.

`mask_sensitive_data` called `_get_analyzer()` with no arguments, so the analyzer was built (and `@lru_cache`-d) at the function's default 0.4 regardless of what the user configured in `sensitive_data_detection.<source>.score_threshold` (which itself defaults to 0.2 in `SensitiveDataDetectionOptions`).

`_get_analyzer` is the only place `default_score_threshold` is set on the Presidio `AnalyzerEngine`, so values between the configured threshold and 0.4 were still masked even though `detect_sensitive_data` (which already passes the configured threshold through) reported them as non-sensitive. Worse, because `_get_analyzer` is `@lru_cache`-d, the masking analyzer stays at the wrong threshold for the lifetime of the process even if the user later adjusts the configuration.

Mirror `detect_sensitive_data`: fetch `default_score_threshold` from the options and pass it through.

Adds a mock-based unit test that asserts the configured value reaches `_get_analyzer` (lives in a new sibling test file so it doesn't get caught by the existing module-level skip when the optional Presidio + spaCy stack isn't installed).

Signed-off-by: Kymi808 <zeng.kyle13@gmail.com>

greptile-apps · 2026-06-01T19:08:02Z

Greptile Summary

This PR fixes a threshold-consistency bug in mask_sensitive_data: it now reads score_threshold from the configured SensitiveDataDetectionOptions and passes it to _get_analyzer, exactly as detect_sensitive_data already did. Without this fix, the masking path always used the hard-coded default of 0.4, so detections between the user-configured threshold and 0.4 could be masked even though the detection path reported them clean.

actions.py: Two-line change that adds default_score_threshold = getattr(options, "score_threshold") and passes it to _get_analyzer(score_threshold=default_score_threshold), making the masking path symmetric with the detection path.
tests/test_sensitive_data_detection.py: Adds a monkeypatched regression test asserting _get_analyzer receives the configured threshold (0.85), plus a guarded import spacy needed by setup_module; however the @pytest.mark.skipif(not SDD_SETUP_PRESENT, ...) decorator is redundant with the module-level skip and prevents the test from running without the full SDD stack despite using stubs throughout.

Confidence Score: 5/5

Safe to merge — the fix is a minimal, targeted correction that makes mask_sensitive_data consistent with detect_sensitive_data.

The production change is two lines that mirror a pattern already proven correct in the adjacent detect_sensitive_data function. The lru_cache on _get_analyzer keys on score_threshold, so different configured values get independent cached instances — no regression there. The test change adds coverage and a harmless guarded import with no risk to existing tests.
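
The cache-keying behavior mentioned above can be illustrated with a small sketch. The factory `get_engine` is hypothetical and returns a plain dict in place of an analyzer; only `functools.lru_cache` itself is real.

```python
from functools import lru_cache


@lru_cache
def get_engine(score_threshold: float = 0.4) -> dict:
    # A dict stands in for an analyzer; one object is built per distinct key.
    return {"default_score_threshold": score_threshold}


a = get_engine(score_threshold=0.85)
b = get_engine(score_threshold=0.85)  # same argument: served from the cache
c = get_engine(score_threshold=0.2)   # different argument: separate instance

print(a is b)  # True
print(a is c)  # False

info = get_engine.cache_info()
print(info.hits, info.misses)  # 1 2
```

Each distinct threshold gets its own cached object, so forwarding the configured value does not disturb instances built for other thresholds.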

No files require special attention; the test placement issue is a minor observation with no runtime impact.

Important Files Changed

Filename	Overview
nemoguardrails/library/sensitive_data_detection/actions.py	Correct two-line fix: reads score_threshold from the configured options and forwards it to _get_analyzer, mirroring the already-correct detect_sensitive_data path.
tests/test_sensitive_data_detection.py	Adds a regression test with monkeypatching plus a guarded import spacy; the @pytest.mark.skipif on the new test is redundant with the module-level setup_module skip, and the test cannot run without SDD despite using full mocking — inconsistent with the PR description's intent.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant mask_sensitive_data
    participant _get_analyzer (lru_cache)
    participant AnalyzerEngine
    participant AnonymizerEngine

    Caller->>mask_sensitive_data: source, text, config
    mask_sensitive_data->>mask_sensitive_data: read options.score_threshold (e.g. 0.85)
    mask_sensitive_data->>_get_analyzer (lru_cache): score_threshold=0.85
    note right of _get_analyzer (lru_cache): Cache key = 0.85 — returns existing or new AnalyzerEngine
    _get_analyzer (lru_cache)-->>mask_sensitive_data: AnalyzerEngine(default_score_threshold=0.85)
    mask_sensitive_data->>AnalyzerEngine: analyze(text, entities, ad_hoc_recognizers)
    AnalyzerEngine-->>mask_sensitive_data: results (filtered at ≥0.85)
    mask_sensitive_data->>AnonymizerEngine: anonymize(text, results, operators)
    AnonymizerEngine-->>mask_sensitive_data: masked text
    mask_sensitive_data-->>Caller: masked text

Prompt To Fix All With AI

Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
tests/test_sensitive_data_detection.py:426-430
**Redundant `skipif` defeats the stated unit-test intent**

The PR description says the test was placed in a separate file so it could run without the SDD stack (monkeypatching stubs out all Presidio/spaCy calls). In practice the test landed in this file, whose `setup_module` already calls `pytest.skip("Required dependencies not found")` for the whole module when `SDD_SETUP_PRESENT` is `False`. The additional `@pytest.mark.skipif(not SDD_SETUP_PRESENT, ...)` is therefore redundant — and means the test will never execute in an environment where Presidio/spaCy are absent, even though the monkeypatching would make it fully self-contained. If the intent is a lightweight, dependency-free regression test, the test should live in a separate file (as originally described) and the `skipif` should be dropped.

Reviews (3): Last reviewed commit: "review: consolidate sensitive-data test,..."

coderabbitai · 2026-06-01T19:09:05Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d6a3814d-fcb4-4aee-b349-523c57a9f53c

📥 Commits

Reviewing files that changed from the base of the PR and between 8082e74 and 5de943c.

📒 Files selected for processing (2)

nemoguardrails/library/sensitive_data_detection/actions.py
tests/test_sensitive_data_detection_unit.py

Walkthrough

This PR fixes a threshold mismatch in sensitive data detection masking. The mask_sensitive_data function now reads and uses the configured score_threshold from the sensitive data detection options, aligning it with the detection path behavior. A new unit test validates this configuration is correctly propagated.

Changes

Threshold Propagation Fix

Layer / File(s)	Summary
Threshold configuration in mask_sensitive_data `nemoguardrails/library/sensitive_data_detection/actions.py`	`mask_sensitive_data` reads `score_threshold` from per-source `SensitiveDataDetectionOptions` and passes it to `_get_analyzer`, ensuring masking uses the same threshold as detection instead of the analyzer's default.
Unit test for threshold propagation `tests/test_sensitive_data_detection_unit.py`	New test monkeypatches `_get_analyzer` to capture the threshold argument, stubs anonymization components, and asserts that the configured `score_threshold: 0.85` is correctly propagated to the analyzer.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 5 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately and concisely describes the main change: ensuring mask_sensitive_data respects the configured score_threshold setting.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Test Results For Major Changes	✅ Passed	Bug fix with test coverage documented in PR description. Test passes with fix, fails without it; no regression risk for a focused configuration-alignment fix.

Address the docstring-coverage warning on PR NVIDIA-NeMo#1967: the stub classes, their methods, the fake `_get_analyzer`, and the inner result container in the regression test had no docstrings. Concise one-line docstrings now describe each stand-in. No behavior change.

Signed-off-by: Kymi808 <zeng.kyle13@gmail.com>

christinaexyou · 2026-06-01T23:12:42Z

@@ -0,0 +1,92 @@
+# SPDX-FileCopyrightText: Copyright (c) 2023-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.


can you consolidate all your tests in tests/test_sensitive_data_detection.py ?

i think having a test file solely for mask_sensitive_data adds bloat to the codebase and can lead to potential confusion. i suggest that after updating tests/test_sensitive_data_detection.py, you run your unit tests locally with the SDD_SETUP_PRESENT dependencies installed

christinaexyou · 2026-06-01T23:32:58Z

        return text

-    analyzer = _get_analyzer()
+    # Honor the configured score_threshold, mirroring detect_sensitive_data;


i don't think that these comments are necessary esp. since they point out an issue that won't exist anymore once your code is merged. it's clear from the code that you are passing in a score_threshold to the function

Address @christinaexyou's review on PR NVIDIA-NeMo#1967:

- Move `test_mask_sensitive_data_honors_configured_score_threshold` from the standalone `tests/test_sensitive_data_detection_unit.py` into the existing `tests/test_sensitive_data_detection.py` (with the same `skipif` + `unit` + `asyncio` decorator stack used by the other tests). Delete the now-redundant unit file.
- Drop the four-line explanatory comment above the fixed `_get_analyzer` call; the code is self-explanatory once the bug is gone and the reasoning lives in the commit message.

Also add an `if SDD_SETUP_PRESENT: import spacy` shim so the module's `setup_module` (which references `spacy.util.is_package` without ever importing `spacy`) actually runs when the SDD extras are installed. Previously every test in the file was silently skipped via the bare `except Exception` fallthrough with "Unexpected error during setup: name 'spacy' is not defined", making local verification impossible.

Verified locally with `pip install presidio-analyzer presidio-anonymizer spacy && python -m spacy download en_core_web_lg`: the regression test PASSES with the fix and FAILS (AssertionError) on a manually reverted `actions.py`.

Signed-off-by: Kymi808 <zeng.kyle13@gmail.com>

Kymi808 · 2026-06-01T23:43:59Z

Thanks for the review @christinaexyou — both points addressed in c2b68bb:

test_mask_sensitive_data_honors_configured_score_threshold is now in tests/test_sensitive_data_detection.py with the same skipif(not SDD_SETUP_PRESENT) + unit + asyncio decorator stack the other tests use, and tests/test_sensitive_data_detection_unit.py is deleted.
Dropped the four-line explanatory comment above the fixed _get_analyzer call.

One small extra: while verifying locally as you suggested, every test in test_sensitive_data_detection.py was getting silently skipped with Unexpected error during setup: name 'spacy' is not defined — setup_module calls spacy.util.is_package(...) but spacy is never imported and the bare except Exception was swallowing the NameError. Added a one-line if SDD_SETUP_PRESENT: import spacy shim above setup_module so the file actually runs when the SDD extras are installed. Happy to split that out into a separate PR if you'd prefer.

Verified with pip install presidio-analyzer presidio-anonymizer spacy && python -m spacy download en_core_web_lg: the regression test PASSES with the fix and FAILS (AssertionError: assert 0.4 == 0.85) on a manually reverted actions.py.
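
The silent skip described above comes down to a broad `except Exception` catching a `NameError`. A minimal sketch of that mechanism, assuming a hypothetical `setup_module_sketch` that returns the skip reason instead of calling `pytest.skip`:

```python
def setup_module_sketch() -> str:
    """Mimic a setup hook that turns any exception into a skip reason."""
    try:
        # `spacy` is deliberately never imported in this sketch, so the
        # name lookup fails before any spaCy code could run.
        spacy.util.is_package("en_core_web_lg")  # noqa: F821
    except Exception as exc:
        # NameError is a subclass of Exception, so it is swallowed here
        # along with any genuine setup failure.
        return f"Unexpected error during setup: {exc}"
    return "setup ok"


print(setup_module_sketch())
# Unexpected error during setup: name 'spacy' is not defined
```

Importing the module before the `try` block, or catching only the exceptions the check is expected to raise, would surface this kind of mistake instead of reporting it as a skip.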

codecov · 2026-06-01T23:50:45Z

Codecov Report

❌ Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
...drails/library/sensitive_data_detection/actions.py	0.00%	2 Missing ⚠️

Kymi808 · 2026-06-02T00:09:05Z

Quick note on the codecov 0% patch report: it reflects the existing CI tier pattern, not a regression introduced by this PR.

All 8 tests in test_sensitive_data_detection.py (the 7 pre-existing ones plus the consolidated regression test) are skipped in pr-tests-matrix because presidio_analyzer / presidio_anonymizer / spacy aren't installed in that tier, so SDD_SETUP_PRESENT is False and setup_module calls pytest.skip. From the run-26788987717 log:

test_masking_input_output                                     SKIPPED
test_detection_input_output                                   SKIPPED
test_masking_retrieval                                        SKIPPED
test_score_threshold                                          SKIPPED
test_invalid_score_threshold                                  SKIPPED
test_invalid_score_threshold_chat_message                     SKIPPED
test_high_score_threshold_disables_rails                      SKIPPED
test_mask_sensitive_data_honors_configured_score_threshold    SKIPPED  ← this PR

The codecov/patch status check is passing — the bot comment is informational. Per your suggestion I verified locally with the SDD extras installed (presidio-analyzer, presidio-anonymizer, spacy, en_core_web_lg): the regression test passes with the fix and fails (AssertionError: assert 0.4 == 0.85) on a manually reverted actions.py.

Happy to split out the setup_module cleanup or refactor it into per-test skipif markers (so the tests actually run in the default CI tier and codecov can measure them) as a separate PR if that would be useful — kept it out of this one to stay in scope.

christinaexyou · 2026-06-02T15:48:07Z

        pytest.skip("Required dependencies not found")

    try:
        # check if the model is already downloaded


can you add "import spacy" here and remove lines 36-38 ? otherwise, we check SDD_SETUP_PRESENT twice

christinaexyou · 2026-06-02T16:02:56Z

    chat << "Hi! My name is John as well."
+
+
+@pytest.mark.skipif(not SDD_SETUP_PRESENT, reason="Sensitive Data Detection setup is not present.")


can you refactor your test to match the formatting of the existing ones ? i would use test_high_score_threshold_disables_rails() as a reference and create an example config where the score_threshold is 1.0. we know that a score_threshold of 1.0 would NOT mask anything so if the user message with a PERSON entity is unaltered then we know that mask_sensitive_data honors a configured score threshold
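
The reasoning behind the suggested test can be sketched without Presidio. The `Detection` class, the `mask` helper, and the 0.85 score for the PERSON span are illustrative assumptions, and the "keep scores at or above the threshold" rule is assumed rather than taken from the library.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical detection: entity type, character span, confidence."""

    entity_type: str
    start: int
    end: int
    score: float


def mask(text: str, detections: list[Detection], score_threshold: float) -> str:
    # Keep only detections at or above the threshold, then replace spans
    # from right to left so earlier offsets stay valid.
    kept = [d for d in detections if d.score >= score_threshold]
    for d in sorted(kept, key=lambda d: d.start, reverse=True):
        text = text[: d.start] + f"<{d.entity_type}>" + text[d.end :]
    return text


message = "Hi! I am John."
found = [Detection("PERSON", 9, 13, 0.85)]  # assumed score for illustration

print(mask(message, found, score_threshold=0.4))  # Hi! I am <PERSON>.
print(mask(message, found, score_threshold=1.0))  # Hi! I am John.
```

If the masking path ignored the configured threshold and filtered at a lower default, the second call would also return the masked text, which is what an end-to-end test with `score_threshold: 1.0` would catch.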

christinaexyou · 2026-06-02T16:05:40Z

+@pytest.mark.unit
+@pytest.mark.asyncio
+async def test_mask_sensitive_data_honors_configured_score_threshold(monkeypatch):
+    """Regression: ``mask_sensitive_data`` must honor ``options.score_threshold``.


also, no need for this comment per my previous feedback
