Commit 8034e8a

fix(e2e): enable LoRA PII detection for aibrix profile
This commit fixes PII detection test failures for the aibrix profile by switching from the non-functional ModernBERT PII model to LoRA-based auto-detection, matching the configuration already proven to work in the dynamic-config and ai-gateway profiles.

## Problem

The aibrix profile PII detection test was failing with 0% accuracy (0/100 tests passed). All 100 PII test requests passed through without being blocked, even though they contained sensitive data such as credit cards, SSNs, emails, and IP addresses.

## Root Causes

**Same issues as ai-gateway had before the previous fix:**

1. **Outdated ModernBERT PII model**
   - The profile was using `models/pii_classifier_modernbert-base_presidio_token_model`
   - The ModernBERT classifier initialized but never detected any PII entities
   - No "Detected PII" or "PII token classification" logs appeared during test runs
   - Result: 0% detection accuracy

2. **Missing default_decision fallback**
   - No catch-all decision existed for the PII policy fallback mechanism
   - The PII policy code (src/semantic-router/pkg/utils/pii/policy.go) falls back to "default_decision"
   - Without it, edge cases with empty decision names would disable PII detection

3. **No profile-specific model configuration**
   - The test used an E2E_TEST_MODEL value inherited from previous test runs
   - It was not explicitly configured for aibrix's model, vllm-llama3-8b-instruct
   - This could cause inconsistent behavior across test runs

## Solution

### 1. Switch aibrix to LoRA PII detection (deploy/kubernetes/aibrix/semantic-router-values/values.yaml)

**Change 1**: Updated the pii_model configuration (lines 459-466)

```yaml
pii_model:
  # Support both traditional (modernbert) and LoRA-based PII detection
  # When model_type is "auto", the system will auto-detect LoRA configuration
  model_id: "models/lora_pii_detector_bert-base-uncased_model"
  model_type: "auto" # Enables LoRA auto-detection
  threshold: 0.7
  use_cpu: true
  pii_mapping_path: "models/lora_pii_detector_bert-base-uncased_model/pii_type_mapping.json"
```

**Why**:
- The ModernBERT PII model was not detecting any PII (0% accuracy in tests)
- The LoRA PII model is proven to work in both dynamic-config and ai-gateway (100% accuracy)
- `model_type: "auto"` enables automatic LoRA model detection
- Uses the same battle-tested model across all profiles for consistency
- Aligns the aibrix configuration with the working profiles

**Change 2**: Added default_decision for fallback (lines 386-401)

```yaml
- name: default_decision
  description: "Default catch-all decision - blocks all PII for safety"
  priority: 0
  rules:
    operator: "OR"
    conditions:
      - type: "domain"
        name: "other"
  modelRefs:
    - model: vllm-llama3-8b-instruct
  plugins:
    - type: "pii"
      configuration:
        enabled: true
        pii_types_allowed: []
```

**Why**:
- The PII policy code falls back to "default_decision" when a decision lookup fails
- Priority 0 ensures it is only used as a last resort
- Blocks all PII types for maximum safety
- Prevents edge cases in which PII detection would be disabled
- Required by the fallback mechanism in src/semantic-router/pkg/utils/pii/policy.go

### 2. Configure the aibrix profile test model (e2e/profiles/aibrix/profile.go)

**Change**: Set the environment variable in the Setup() method (lines 84-85)

```go
// Configure PII test to use vllm-llama3-8b-instruct model
os.Setenv("E2E_TEST_MODEL", deploymentDemoLLM)
```

where `deploymentDemoLLM = "vllm-llama3-8b-instruct"`.

**Why**:
- AIBrix uses different model names than dynamic-config/ai-gateway
- Ensures the test explicitly uses the correct aibrix model
- Prevents reliance on environment variable inheritance from other tests
- Matches the approach used in the dynamic-config profile (which sets E2E_TEST_MODEL=MoM)
- Makes test behavior predictable and independent

## Testing

**Before fix**:
- aibrix: 0/100 PII tests passed (0% accuracy) ❌

**After fix**:
- aibrix: 100/100 PII tests passed (100% accuracy) ✅

**Test command**:

```bash
make e2e-cleanup && make e2e-test E2E_PROFILE=aibrix E2E_VERBOSE=true E2E_KEEP_CLUSTER=true
```

**Verified**: No impact on the other profiles (dynamic-config and ai-gateway), as the changes are isolated to aibrix-specific files.

## Files Changed

1. **deploy/kubernetes/aibrix/semantic-router-values/values.yaml** (lines 386-401, 459-466)
   - Added default_decision for the PII policy fallback
   - Switched pii_model from ModernBERT to LoRA auto-detection
   - Aligned with the working dynamic-config and ai-gateway configuration

2. **e2e/profiles/aibrix/profile.go** (lines 84-85)
   - Sets E2E_TEST_MODEL=vllm-llama3-8b-instruct in Setup()
   - Ensures profile-specific model configuration
   - Makes test behavior independent and predictable

## Why This Works

**AIBrix flow**:
1. The test uses model="vllm-llama3-8b-instruct" (via the E2E_TEST_MODEL env var)
2. The request routes to a decision (either matched, or falling back to default_decision)
3. The decision has the PII plugin enabled, so PII detection runs
4. The LoRA PII classifier detects entities (credit cards, SSNs, emails, etc.)
5. The policy blocks the request → 100% accuracy ✅

All three profiles (dynamic-config, ai-gateway, aibrix) now use the same proven LoRA PII detection model with 100% accuracy across all E2E tests.

## Summary of All Profiles

| Profile | PII Detection | Configuration |
|---------|--------------|---------------|
| dynamic-config | 100/100 (100%) ✅ | LoRA auto-detection, model=MoM |
| ai-gateway | 100/100 (100%) ✅ | LoRA auto-detection, model=general-expert |
| aibrix | 100/100 (100%) ✅ | LoRA auto-detection, model=vllm-llama3-8b-instruct |

Signed-off-by: Yossi Ovadia <[email protected]>
1 parent fbd782f commit 8034e8a

2 files changed: +28 additions, -3 deletions

deploy/kubernetes/aibrix/semantic-router-values/values.yaml

Lines changed: 25 additions & 3 deletions
```diff
@@ -380,6 +380,26 @@ config:
       system_prompt: "You are a thinking expert, should think multiple steps before answering. Please answer the question step by step."
       mode: "replace"

+    # Default catch-all decision for unmatched requests (E2E PII test fix)
+    # This ensures PII detection is always enabled via policy.go fallback mechanism
+    # When no decision matches, CheckPolicy and IsPIIEnabled fall back to this decision
+    - name: default_decision
+      description: "Default catch-all decision - blocks all PII for safety"
+      priority: 0
+      rules:
+        operator: "OR"
+        conditions:
+          - type: "domain"
+            name: "other"
+      modelRefs:
+        - model: vllm-llama3-8b-instruct
+          use_reasoning: false
+      plugins:
+        - type: "pii"
+          configuration:
+            enabled: true
+            pii_types_allowed: []
+
   # Strategy for selecting between multiple matching decisions
   # Options: "priority" (use decision with highest priority) or "confidence" (use decision with highest confidence)
   strategy: "priority"
@@ -437,11 +457,13 @@ config:
     use_cpu: true
     category_mapping_path: "models/category_classifier_modernbert-base_model/category_mapping.json"
   pii_model:
-    model_id: "models/pii_classifier_modernbert-base_presidio_token_model"
-    use_modernbert: true
+    # Support both traditional (modernbert) and LoRA-based PII detection
+    # When model_type is "auto", the system will auto-detect LoRA configuration
+    model_id: "models/lora_pii_detector_bert-base-uncased_model"
+    model_type: "auto" # Enables LoRA auto-detection
     threshold: 0.7
     use_cpu: true
-    pii_mapping_path: "models/pii_classifier_modernbert-base_presidio_token_model/pii_type_mapping.json"
+    pii_mapping_path: "models/lora_pii_detector_bert-base-uncased_model/pii_type_mapping.json"

   keyword_rules:
     - name: "thinking"
```

e2e/profiles/aibrix/profile.go

Lines changed: 3 additions & 0 deletions
```diff
@@ -81,6 +81,9 @@ func (p *Profile) Setup(ctx context.Context, opts *framework.SetupOptions) error
 	p.verbose = opts.Verbose
 	p.log("Setting up AIBrix test environment")

+	// Configure PII test to use vllm-llama3-8b-instruct model
+	os.Setenv("E2E_TEST_MODEL", deploymentDemoLLM)
+
 	deployer := helm.NewDeployer(opts.KubeConfig, opts.Verbose)

 	// Track what we've deployed for cleanup on error
```
