* Adding Korean RRN
* Make regex more specific
* Update PII to handle encoded content
* remove checked_text field
* Use length instead of locale for more consistent checking
* Handle structure content
* remove legacy label
docs/ref/checks/pii.md
Lines changed: 47 additions & 8 deletions
@@ -1,26 +1,37 @@
1
1
# Contains PII
2
2
3
-
Detects personally identifiable information (PII) such as SSNs, phone numbers, credit card numbers, and email addresses using Microsoft's [Presidio library](https://microsoft.github.io/presidio/). Will automatically mask detected PII or block content based on configuration.
3
+
Detects personally identifiable information (PII) such as SSNs, phone numbers, credit card numbers, and email addresses using Guardrails' built-in TypeScript regex engine. The check can automatically mask detected spans or block the request based on configuration.
4
+
5
+
**Advanced Security Features:**
6
+
7
+
- **Unicode normalization**: Prevents bypasses using fullwidth characters (＠) or zero-width spaces
8
+
- **Encoded PII detection**: Optionally detects PII hidden in Base64, URL-encoded, or hex strings
- **`entities`** (required): List of PII entity types to detect. See the full list of [supported entities](https://microsoft.github.io/presidio/supported_entities/).
27
+
- **`entities`** (required): List of PII entity types to detect. See the `PIIEntity` enum in `src/checks/pii.ts` for the full list, including custom entities such as `CVV` (credit card security codes) and `BIC_SWIFT` (bank identification codes).
20
28
- **`block`** (optional): Whether to block content or just mask PII (default: `false`)
29
+
- **`detect_encoded_pii`** (optional): If `true`, detects PII in Base64/URL-encoded/hex strings (default: `false`)
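
As a rough illustration of how the three parameters above fit together, here is a minimal sketch. The interface name `PIIConfigSketch` and the helper `withDefaults` are hypothetical and are not part of the library. Only the field names and default values come from the parameter list.

```typescript
// Hypothetical shape: field names follow the documented parameters;
// the interface and helper names are assumptions for illustration only.
interface PIIConfigSketch {
  entities: string[];           // required
  block?: boolean;              // optional, documented default: false
  detect_encoded_pii?: boolean; // optional, documented default: false
}

// Fill in the documented defaults for any omitted optional field.
function withDefaults(config: PIIConfigSketch): Required<PIIConfigSketch> {
  return {
    entities: config.entities,
    block: config.block ?? false,
    detect_encoded_pii: config.detect_encoded_pii ?? false,
  };
}

// Masking configuration with encoded-PII detection left off.
const preflightConfig = withDefaults({
  entities: ["EMAIL_ADDRESS", "US_SSN", "CVV", "BIC_SWIFT"],
});
```

Leaving `block` and `detect_encoded_pii` unset gives the masking behavior without the extra decoding pass.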
21
30
22
31
## Implementation Notes
23
32
33
+
Under the hood, the TypeScript guardrail normalizes text (Unicode NFKC), strips zero-width characters, and runs curated regex patterns for each configured entity. When `detect_encoded_pii` is enabled, the check also decodes Base64, URL-encoded, and hexadecimal substrings, rescans the decoded text for matches, and remaps any findings back to the original encoded content.
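
The pipeline described above might look roughly like the sketch below. It is not the code in `src/checks/pii.ts`: the type `Finding`, the functions `normalizeText`, `scan`, `tryDecode`, and `scanUrlEncoded`, and both regex patterns are simplified assumptions. Only URL-encoding is shown. Base64 and hex substrings would presumably follow the same decode, rescan, and remap steps.

```typescript
// Hypothetical sketch; names and patterns are illustrative, not the library's own.
type Finding = { entity: string; value: string; start: number; end: number };

// Simplified stand-ins for the curated per-entity patterns.
const PATTERNS: Record<string, RegExp> = {
  EMAIL_ADDRESS: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  US_SSN: /\b\d{3}-\d{2}-\d{4}\b/g,
};

// NFKC folds compatibility forms (e.g. fullwidth "＠") into their ASCII
// equivalents; the replace() then removes zero-width characters.
function normalizeText(text: string): string {
  return text.normalize("NFKC").replace(/[\u200B-\u200D\u2060\uFEFF]/g, "");
}

// Run the pattern for each requested entity and collect match offsets.
function scan(text: string, entities: string[]): Finding[] {
  const findings: Finding[] = [];
  for (const entity of entities) {
    const pattern = PATTERNS[entity];
    if (!pattern) continue; // unknown entity: nothing to scan for
    const re = new RegExp(pattern.source, "g"); // fresh lastIndex per call
    let m: RegExpExecArray | null;
    while ((m = re.exec(text)) !== null) {
      findings.push({ entity, value: m[0], start: m.index, end: m.index + m[0].length });
    }
  }
  return findings;
}

// decodeURIComponent throws on malformed escapes, so wrap it.
function tryDecode(s: string): string | null {
  try {
    return decodeURIComponent(s);
  } catch {
    return null;
  }
}

// Decode %XX-encoded runs, rescan them, and report each finding against
// the offsets of the encoded run in the original text.
function scanUrlEncoded(text: string, entities: string[]): Finding[] {
  const findings: Finding[] = [];
  const candidate = /(?:[A-Za-z0-9._-]*%[0-9A-Fa-f]{2})+[A-Za-z0-9._-]*/g;
  let m: RegExpExecArray | null;
  while ((m = candidate.exec(text)) !== null) {
    const decoded = tryDecode(m[0]);
    if (decoded === null) continue;
    for (const hit of scan(normalizeText(decoded), entities)) {
      findings.push({ ...hit, start: m.index, end: m.index + m[0].length });
    }
  }
  return findings;
}
```

In this sketch a finding inside an encoded run is remapped to the whole run rather than to the exact encoded bytes. That is the simplest way to keep offsets valid for later masking, and the actual implementation may be more precise.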
34
+
24
35
**Stage-specific behavior is critical:**
25
36
26
37
- **Pre-flight stage**: Use `block=false` (default) for automatic PII masking of user input
@@ -30,7 +41,7 @@ Detects personally identifiable information (PII) such as SSNs, phone numbers, c
30
41
**PII masking mode** (default, `block=false`):
31
42
32
43
- Automatically replaces detected PII with placeholder tokens like `<EMAIL_ADDRESS>`, `<US_SSN>`
33
-
- Does not trigger tripwire - allows content through with PII removed
44
+
- Does not trigger tripwire - allows content through with PII masked
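
The placeholder replacement in masking mode can be sketched as follows. The `Span` type and the `maskSpans` function are hypothetical names. The sketch also assumes the detected spans do not overlap, which the real check may or may not guarantee.

```typescript
// Hypothetical sketch of masking mode; not the library's implementation.
type Span = { entity: string; start: number; end: number };

// Replace each span with a "<ENTITY>" placeholder. Working from the last
// span to the first keeps the earlier offsets valid while the text changes.
// Assumes spans do not overlap.
function maskSpans(text: string, spans: Span[]): string {
  const ordered = [...spans].sort((a, b) => b.start - a.start);
  let masked = text;
  for (const span of ordered) {
    masked = masked.slice(0, span.start) + `<${span.entity}>` + masked.slice(span.end);
  }
  return masked;
}
```

Because the function returns the text with placeholders instead of raising an error, the caller can pass the masked content on without triggering the tripwire, which matches the behavior described for `block=false`.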
34
45
35
46
**Blocking mode** (`block=true`):
36
47
@@ -41,6 +52,8 @@ Detects personally identifiable information (PII) such as SSNs, phone numbers, c
41
52
42
53
Returns a `GuardrailResult` with the following `info` dictionary:
43
54
55
+
### Basic Example (Plain PII)
56
+
44
57
```json
45
58
{
46
59
"guardrail_name": "Contains PII",
@@ -55,8 +68,34 @@ Returns a `GuardrailResult` with the following `info` dictionary:
55
68
}
56
69
```
57
70
58
-
- **`detected_entities`**: Detected entities and their values
71
+
### With Encoded PII Detection Enabled
72
+
73
+
When `detect_encoded_pii: true`, the guardrail also detects and masks encoded PII: