Toolguard Violation Analysis

A Python toolkit for analyzing AI agent trajectories against policy constraints. The toolkit generates policy guards from policy documents, creates history-aware wrappers, and analyzes agent interactions to detect policy violations.

Citation

This repository demonstrates the methodology described in:

Near-Miss: Latent Policy Failure Detection in Agentic Workflows. Ella Rabinovich, David Boaz, Naama Zwerdling, Ateret Anaby-Tavor. arXiv preprint arXiv:2603.29665, 2026. https://arxiv.org/abs/2603.29665

@article{rabinovich2026nearmiss,
  title={Near-Miss: Latent Policy Failure Detection in Agentic Workflows},
  author={Rabinovich, Ella and Boaz, David and Zwerdling, Naama and Anaby-Tavor, Ateret},
  journal={arXiv preprint arXiv:2603.29665},
  year={2026}
}

Overview

The toolkit provides three main tools that run in sequence, each consuming the output of the previous one:

  1. Guards Generator - Generates policy guard code from policy documents and OpenAPI specifications
  2. Wrapper Generator - Creates history-aware wrappers that integrate with the generated guards
  3. Trajectory Analyzer - Analyzes AI agent trajectories against the guards and generates detailed reports

Installation

With uv (recommended):

git clone https://github.com/AgentToolkit/toolguard-violation-analysis.git
cd toolguard-violation-analysis
uv sync --extra dev

With pip:

git clone https://github.com/AgentToolkit/toolguard-violation-analysis.git
cd toolguard-violation-analysis
pip install -e ".[dev]"

Quick Start

The typical workflow is a one-time setup step (Step 0) followed by three main steps:

Step 0: Configure environment variables

Copy .env.example to .env and update the values for your LLM provider.

Step 1: Generate Policy Guards

Generate guard code from your policy document and API specification:

uv run python -m policy_violation.guards \
    --policy-path demos/tau2_airline/policy.md \
    --oas-path demos/tau2_airline/open_api.json \
    --output-dir generated \
    --app-name api

This creates guard implementations that enforce your policy constraints.

For detailed options and examples, see Guards CLI Documentation
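
To make the idea of a guard concrete, here is a minimal hand-written sketch of the kind of check a guard might perform before a tool call is allowed. Everything in it is hypothetical: the rule itself, the `guard_cancel_reservation` function, the `PolicyViolation` exception, and the `Reservation` fields were invented for illustration, and the code the generator actually emits may look quite different.

```python
from dataclasses import dataclass


class PolicyViolation(Exception):
    """Raised when a tool call would break a policy rule (hypothetical)."""


@dataclass
class Reservation:
    # Hypothetical fields, chosen only for this illustration.
    cabin: str
    hours_since_booking: float
    has_insurance: bool


def guard_cancel_reservation(reservation: Reservation) -> None:
    """Illustrative rule: a cancellation is allowed within 24 hours of
    booking, for business-class tickets, or when insurance was purchased."""
    if reservation.hours_since_booking <= 24:
        return
    if reservation.cabin == "business":
        return
    if reservation.has_insurance:
        return
    raise PolicyViolation("cancellation not permitted for this reservation")
```

In this sketch a guard returns silently when the call is compliant and raises otherwise, so a caller can wrap each tool invocation in a single try/except.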

Step 2: Generate Wrapper

Create a history-aware wrapper that uses the generated guards:

uv run python -m policy_violation.wrappers \
    --guards-dir generated \
    --wrapper-file generated/api_wrapper.py \
    --class-name "ApiWrapper"

The wrapper checks conversation history before making API calls, reducing unnecessary calls and improving efficiency.

For detailed options and examples, see Wrappers CLI Documentation
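
The history-first lookup described above can be sketched as follows. This is an assumption about the general technique, not the generated `ApiWrapper` itself: the `HistoryAwareWrapper` class, the history entry layout (`tool`, `args`, `result`), and the `api` callback are all hypothetical names introduced for this example.

```python
from typing import Any, Callable


class HistoryAwareWrapper:
    """Answers lookups from earlier tool results when possible (sketch)."""

    def __init__(
        self,
        history: list[dict[str, Any]],
        api: Callable[[str, dict[str, Any]], Any],
    ) -> None:
        self.history = history  # earlier calls: {"tool", "args", "result"}
        self.api = api          # fallback that performs the real API call
        self.api_calls = 0      # counts how often the fallback was needed

    def call(self, tool: str, args: dict[str, Any]) -> Any:
        # Search the most recent entries first so the freshest result wins.
        for entry in reversed(self.history):
            if entry["tool"] == tool and entry["args"] == args:
                return entry["result"]
        # Nothing in the history matched, so make the call and remember it.
        self.api_calls += 1
        result = self.api(tool, args)
        self.history.append({"tool": tool, "args": args, "result": result})
        return result


# Usage: the first lookup is served from history, the second hits the API.
history = [{"tool": "get_user", "args": {"id": "u1"}, "result": {"tier": "gold"}}]
wrapper = HistoryAwareWrapper(history, api=lambda tool, args: {"status": "fetched"})
wrapper.call("get_user", {"id": "u1"})
wrapper.call("get_user", {"id": "u2"})
```

Matching on exact tool name and arguments is the simplest possible policy. A real wrapper would also need to decide when a cached result has become stale, for example after a write to the same resource.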

Step 3: Analyze Trajectories

Analyze AI agent trajectories against your policy guards:

uv run python -m policy_violation.analysis \
    demos/tau2_airline/traj_claude-4-sonnet.json \
    --guards-dir generated \
    --results-dir results \
    --wrapper-file generated/api_wrapper.py

This generates a CSV report with detailed analysis of policy violations and guard evaluations.

For detailed options and examples, see Analyze CLI Documentation
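
For a quick look at the output, the following sketch lists each CSV file under the results directory with its column names and row count. It assumes only that the report is a regular CSV file with a header row located somewhere under `results`; the report's file name and columns are not documented here, so the code discovers them instead of hard-coding them. The `summarize_reports` helper is a hypothetical name.

```python
import csv
from pathlib import Path


def summarize_reports(results_dir: str) -> dict[str, tuple[list[str], int]]:
    """Map each CSV path (relative to results_dir) to (columns, data rows)."""
    root = Path(results_dir)
    if not root.is_dir():
        return {}
    summary: dict[str, tuple[list[str], int]] = {}
    for path in sorted(root.rglob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            header = next(reader, [])      # first row holds column names
            rows = sum(1 for _ in reader)  # remaining rows are data
        summary[str(path.relative_to(root))] = (header, rows)
    return summary


if __name__ == "__main__":
    for name, (columns, rows) in summarize_reports("results").items():
        print(f"{name}: {rows} rows, columns = {columns}")
```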

Complete Workflow Example

Here's a complete end-to-end example:

# 1. Generate guards from policy
uv run python -m policy_violation.guards \
    --policy-path demos/tau2_airline/policy.md \
    --oas-path demos/tau2_airline/open_api.json \
    --output-dir generated \
    --app-name api \
    --verbose

# 2. Generate wrapper
uv run python -m policy_violation.wrappers \
    --guards-dir generated \
    --wrapper-file generated/api_wrapper.py \
    --class-name "ApiWrapper" \
    --verbose

# 3. Analyze trajectories
uv run python -m policy_violation.analysis \
    demos/tau2_airline/traj_claude-4-sonnet.json \
    --guards-dir generated \
    --results-dir results \
    --wrapper-file generated/api_wrapper.py \
    --verbose
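
If you prefer to drive the same three steps from Python, a small script along these lines may help. It is a sketch that reuses only the modules, flags, and demo paths shown above; the `build_commands` and `run_pipeline` helpers are hypothetical names, and the script assumes `uv` is on your PATH and that it is run from the repository root.

```python
import subprocess

GENERATED = "generated"
WRAPPER = f"{GENERATED}/api_wrapper.py"


def build_commands(verbose: bool = False) -> list[list[str]]:
    """Return the three CLI invocations in the order they must run."""
    base = ["uv", "run", "python", "-m"]
    commands = [
        base + ["policy_violation.guards",
                "--policy-path", "demos/tau2_airline/policy.md",
                "--oas-path", "demos/tau2_airline/open_api.json",
                "--output-dir", GENERATED,
                "--app-name", "api"],
        base + ["policy_violation.wrappers",
                "--guards-dir", GENERATED,
                "--wrapper-file", WRAPPER,
                "--class-name", "ApiWrapper"],
        base + ["policy_violation.analysis",
                "demos/tau2_airline/traj_claude-4-sonnet.json",
                "--guards-dir", GENERATED,
                "--results-dir", "results",
                "--wrapper-file", WRAPPER],
    ]
    if verbose:
        commands = [cmd + ["--verbose"] for cmd in commands]
    return commands


def run_pipeline(verbose: bool = False) -> None:
    """Run each step and stop at the first failure."""
    for cmd in build_commands(verbose):
        print("$", " ".join(cmd))
        # check=True raises CalledProcessError on a non-zero exit code,
        # so a later step never runs on stale or missing output.
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline(verbose=True)` mirrors the shell commands above. Stopping on the first failure matters here because each step reads the files written by the previous one.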

Configuration

Environment Variables

Create a .env file in the project root with your LLM configuration:

# Example
MODEL_NAME="claude-sonnet-4-6"
LLM_PROVIDER="openai"
LLM_API_KEY="..."
LLM_API_BASE="https://litellm.something.com"

See .env.example for a template.
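
Before running the tools, it can be useful to confirm that the .env file defines every setting. The sketch below parses simple `KEY="value"` lines using only the standard library and reports what is missing. Treating all four variables from the example as required is an assumption, and `parse_env_file` and `missing_settings` are hypothetical helpers, not part of the toolkit.

```python
# Assumption: all four variables from the example above are required.
REQUIRED = ("MODEL_NAME", "LLM_PROVIDER", "LLM_API_KEY", "LLM_API_BASE")


def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY="value" lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_settings(values: dict[str, str]) -> list[str]:
    """Return the required names that are absent or empty."""
    return [name for name in REQUIRED if not values.get(name)]


sample = 'MODEL_NAME="claude-sonnet-4-6"\nLLM_PROVIDER="openai"\n'
print(missing_settings(parse_env_file(sample)))
# prints ['LLM_API_KEY', 'LLM_API_BASE']
```

To check a real file, pass the result of `Path(".env").read_text()` (from `pathlib`) to `parse_env_file` instead of the sample string.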

Documentation

For detailed documentation on each tool:

  • Guards CLI - Policy guard generation options and examples
  • Wrappers CLI - Wrapper generation options and examples
  • Analyze CLI - Trajectory analysis options and examples

VS Code Debug Configurations

The project includes VS Code launch configurations for debugging. See .vscode/launch.json for:

  • Gen Guards - Debug guard generation
  • Gen Wrapper - Debug wrapper generation
  • Analyze - Debug trajectory analysis

Requirements

  • Python 3.10+
  • LLM API access (OpenAI, Anthropic, or compatible provider)
  • Policy document in Markdown format
  • OpenAPI specification in JSON format
  • Trajectory data in canonical JSON format

License

Licensed under the terms in LICENSE.
