4 changes: 3 additions & 1 deletion backend/app/api/API_USAGE.md
@@ -100,7 +100,7 @@ Endpoint:
Optional filters:
- `ids=<uuid>&ids=<uuid>`
- `stage=input|output`
- `type=uli_slur_match|pii_remover|gender_assumption_bias|ban_list|llm_critic|topic_relevance`
- `type=uli_slur_match|pii_remover|gender_assumption_bias|ban_list|llm_critic|topic_relevance|llamaguard_7b|profanity_free`
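
The repeated `ids=<uuid>&ids=<uuid>` form above can be produced with the standard library. This is a minimal sketch: the UUIDs are placeholders and the endpoint path is deliberately left out because it is not shown in this hunk.

```python
from urllib.parse import urlencode

# Hypothetical filter values; the UUIDs are placeholders, not real validator ids.
filters = {
    "ids": [
        "11111111-1111-1111-1111-111111111111",
        "22222222-2222-2222-2222-222222222222",
    ],
    "stage": "input",
    "type": "llamaguard_7b",
}

# doseq=True expands the list into one ids=<uuid> pair per element.
query = urlencode(filters, doseq=True)
print(query)
```

Append the resulting string after `?` on the list endpoint documented in `API_USAGE.md`.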

Example:

@@ -442,6 +442,8 @@ From `validators.json`:
- `ban_list`
- `llm_critic`
- `topic_relevance`
- `llamaguard_7b`
- `profanity_free`

Source of truth:
- `backend/app/core/validators/validators.json`
10 changes: 10 additions & 0 deletions backend/app/api/docs/guardrails/run_guardrails.md
@@ -8,6 +8,16 @@ Behavior notes:
- For `ban_list`, `ban_list_id` can be resolved to `banned_words` from tenant ban list configs.
- For `topic_relevance`, `topic_relevance_config_id` is required and is resolved to `configuration` + `prompt_schema_version` from tenant topic relevance configs in `guardrails.py`. Requires `OPENAI_API_KEY` to be configured; returns a validation failure with an explicit error if missing.
- For `llm_critic`, `OPENAI_API_KEY` must be configured; returns `success=false` with an explicit error if missing.
- For `llamaguard_7b`, `policies` accepts human-readable policy names (see table below). If omitted, all policies are enforced by default.

| `policies` value | Policy enforced |
|-----------------------------|----------------------------------|
| `no_violence_hate` | No violence or hate speech |
| `no_sexual_content` | No sexual content |
| `no_criminal_planning` | No criminal planning |
| `no_guns_and_illegal_weapons` | No guns or illegal weapons |
| `no_illegal_drugs` | No illegal drugs |
| `no_encourage_self_harm` | No encouragement of self-harm |
- `rephrase_needed=true` means the system could not safely auto-fix the input/output and wants the user to retry with a rephrased query.
- When `rephrase_needed=true`, `safe_text` contains the rephrase prompt shown to the user.
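
The `llamaguard_7b` default described above ("if omitted, all policies are enforced") can be sketched as follows. `resolve_policies` is a hypothetical helper, not a function from this PR, and rejecting unknown names is an assumption about how bad input might be handled.

```python
# Policy names copied from the table in run_guardrails.md.
ALL_POLICIES = (
    "no_violence_hate",
    "no_sexual_content",
    "no_criminal_planning",
    "no_guns_and_illegal_weapons",
    "no_illegal_drugs",
    "no_encourage_self_harm",
)


def resolve_policies(policies=None):
    """Return the policies to enforce (hypothetical helper).

    None stands for an omitted `policies` field and selects every policy.
    """
    if policies is None:
        return list(ALL_POLICIES)
    unknown = [name for name in policies if name not in ALL_POLICIES]
    if unknown:
        # Assumption: unknown names are rejected rather than silently ignored.
        raise ValueError(f"unknown policies: {unknown}")
    return list(policies)


print(resolve_policies())
print(resolve_policies(["no_illegal_drugs"]))
```

How the service treats an empty list is not stated in the notes above, so this sketch only treats `None` as "omitted".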

3 changes: 3 additions & 0 deletions backend/app/api/routes/guardrails.py
@@ -258,6 +258,9 @@ def add_validator_logs(
for log in iteration.outputs.validator_logs:
    result = log.validation_result

    if result is None:
        continue

    if suppress_pass_logs and isinstance(result, PassResult):
        continue
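
The effect of the new `None` guard can be seen in isolation with stand-in types. This is a sketch only: `PassResult`, `FailResult`, `Log`, and `keep_logs` below are simplified, hypothetical stand-ins, not the classes used in `guardrails.py`.

```python
from dataclasses import dataclass
from typing import Optional


class PassResult:
    """Stand-in for a passing validation result."""


class FailResult:
    """Stand-in for a failing validation result."""


@dataclass
class Log:
    validator_name: str
    validation_result: Optional[object]


def keep_logs(logs, suppress_pass_logs):
    """Return the names of the logs that survive both filters."""
    kept = []
    for log in logs:
        result = log.validation_result
        if result is None:
            # The validator produced no result, so there is nothing to record.
            continue
        if suppress_pass_logs and isinstance(result, PassResult):
            continue
        kept.append(log.validator_name)
    return kept


logs = [
    Log("ban_list", PassResult()),
    Log("llamaguard_7b", None),
    Log("profanity_free", FailResult()),
]
print(keep_logs(logs, suppress_pass_logs=True))
print(keep_logs(logs, suppress_pass_logs=False))
```

Placing the `None` check first means the later `isinstance` test and any attribute access on `result` only ever see a real result object.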

3 changes: 3 additions & 0 deletions backend/app/core/enum.py
@@ -32,3 +32,6 @@ class ValidatorType(Enum):
    GenderAssumptionBias = "gender_assumption_bias"
    BanList = "ban_list"
    TopicRelevance = "topic_relevance"
    LLMCritic = "llm_critic"
    LlamaGuard7B = "llamaguard_7b"
    ProfanityFree = "profanity_free"
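
Because each member's value matches the string accepted by the `type` filter, a raw string can be mapped back to a member by value. The sketch below mirrors only the six members visible in this hunk, and `parse_validator_type` is a hypothetical helper, not part of the PR.

```python
from enum import Enum
from typing import Optional


class ValidatorType(Enum):
    # Only the members visible in this hunk; the real enum has more.
    GenderAssumptionBias = "gender_assumption_bias"
    BanList = "ban_list"
    TopicRelevance = "topic_relevance"
    LLMCritic = "llm_critic"
    LlamaGuard7B = "llamaguard_7b"
    ProfanityFree = "profanity_free"


def parse_validator_type(value: str) -> Optional[ValidatorType]:
    """Look up a member by its string value; return None if there is no match."""
    try:
        return ValidatorType(value)
    except ValueError:
        return None


print(parse_validator_type("llamaguard_7b"))
print(parse_validator_type("not_a_validator"))
```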