The definitive curated list of tools, frameworks, standards, and resources for governing AI agents in production.
AI agents that call tools, write code, query databases, and execute actions need the same controls as any other system touching production infrastructure: authentication, authorisation, audit trails, cost controls, policy enforcement, and compliance evidence.
Scope: Runtime governance of AI agents — policy enforcement, audit trails, access control, cost management, compliance tooling, and agent security. Not AI safety research, alignment theory, or general responsible AI ethics.
Why now: The EU AI Act enforcement timeline is live. NIST AI RMF is production-ready. The OWASP Agentic AI Top 10 documents real attack patterns. Claude Code, Copilot, Cursor, and autonomous agent frameworks are now standard tools in enterprise software teams. Governance is no longer optional.
Contributions welcome.
- Why Governance Matters
- International Standards
- Regulatory Frameworks
- Industry Standards and Guidance
- Open-Source Governance Toolkits
- Enterprise Governance Platforms
- Claude Code and MCP Governance
- Policy Engines and Authorisation
- Audit, Observability, and Cost Control
- Security, Red-Teaming, and Threat Models
- Model and Data Governance
- Agentic Architecture Patterns
- Government and Institutional Guidance
- Learning Resources
AI agents with tool access operate with the same blast radius as a poorly-scoped IAM role. They can read files they shouldn't, call APIs they weren't meant to, run up unbounded costs, and take irreversible actions — all without a governance layer.
Prompt injection causes agents to execute attacker-controlled instructions via untrusted tool output. Excessive agency allows agents to take actions beyond their intended scope. Unbounded costs emerge when agents loop or call expensive APIs without budget controls. Audit gaps mean that when something goes wrong, there is no record of what the agent did or why. Compliance exposure under the EU AI Act, ISO 42001, and NIST AI RMF requires documented governance evidence.
A governed agent runs with least-privilege tool access, an immutable audit trail, budget enforcement, and policy checks that fire before any irreversible action.
- ISO/IEC 42001:2023 - The international standard for AI management systems. Specifies requirements for establishing, implementing, maintaining, and continually improving an AI management system within an organisation. Certifiable.
- ISO/IEC 23053 - Framework for AI systems using machine learning. Defines key concepts, components, and lifecycle stages.
- ISO/IEC 23894 - Guidance on AI risk management. Companion to ISO 42001 for operationalising risk processes.
- ISO/IEC TR 24028 - Overview of trustworthiness in AI. Covers accuracy, robustness, reliability, safety, security, and privacy.
- EU Artificial Intelligence Act - EU regulation classifying AI systems by risk tier with mandatory requirements for high-risk systems. General-purpose AI model obligations effective August 2025. Full enforcement 2026.
- NIST AI Risk Management Framework - NIST's voluntary framework for managing AI risk. Four functions: Govern, Map, Measure, Manage. Widely adopted as the US enterprise governance baseline.
- NIST AI RMF Playbook - Practical implementation guidance mapping each AI RMF subcategory to suggested actions, outcomes, and measurement approaches.
- Executive Order 14110 on Safe, Secure, and Trustworthy AI - US federal requirements for AI safety testing, red-teaming, and disclosure for frontier models.
- Blueprint for an AI Bill of Rights - White House principles for AI systems that affect Americans. Non-binding but influential on procurement requirements.
- UK AI Safety Institute - UK government body responsible for evaluating safety of advanced AI models. Publishes evaluation methodologies and results.
- OWASP Top 10 for LLM Applications - The ten most critical security risks for LLM-powered applications: prompt injection, insecure output handling, training data poisoning, model denial of service, and supply chain vulnerabilities.
- MITRE ATLAS - Adversarial Threat Landscape for AI Systems. Tactics, techniques, and real-world case studies for attacks against ML and AI systems, modelled on ATT&CK.
- MITRE ATT&CK for AI - Machine learning attack techniques mapped to the ATT&CK framework for integration with existing threat intelligence programmes.
- Cloud Security Alliance AI Safety Initiative - Enterprise guidance on AI security, governance, and trust. Includes the AI Controls Matrix and assessment tools.
- ENISA AI Threat Landscape - EU Agency for Cybersecurity reports on AI-specific threats, risk assessments, and guidelines for EU organisations.
- CISA Guidelines for Secure AI Development - US Cybersecurity and Infrastructure Security Agency guidance on secure AI system development and deployment.
- Anthropic Model Specification - Anthropic's published specification for Claude's behaviour, including operator/user trust hierarchy, corrigibility, and deference to governance layers.
- systemprompt-template - Self-hosted governance layer for Claude Code and MCP agents. Authentication, authorisation, audit trail, cost controls, and policy enforcement in a single compiled Rust binary. Source-available BSL-1.1.
- Microsoft Agent Governance Toolkit - Runtime security for AI agents across LangChain, CrewAI, AutoGen, OpenAI Agents, Semantic Kernel, and 15+ frameworks. Covers all 10 OWASP Agentic Top 10 risks with policy evaluation under 0.1ms.
- Open Policy Agent - CNCF-graduated general-purpose policy engine using the Rego language. Decouples policy from application logic; increasingly used for agent tool authorisation.
- Cedar - AWS policy language and engine for fine-grained authorisation. Formally verified semantics, expressive syntax, and high throughput for per-request agent permission decisions.
- Casbin - Access control library supporting ACL, RBAC, ABAC, and multi-tenant models. Language-agnostic with production implementations in Go, Rust, Python, Java, and Node.js.
- LiteLLM - Proxy layer for LLM API calls with per-key budgets, rate limiting, spend tracking, and model routing across all major providers.
- Guardrails AI - Input and output validation framework for LLM responses. Define schemas, validators, and automated correction actions that enforce structure and safety constraints at inference time.
- NeMo Guardrails - NVIDIA's toolkit for adding programmable guardrails to LLM-based systems via Colang configuration language.
- LlamaGuard - Meta's open-source content safety model for classifying LLM inputs and outputs against safety policies.
- Presidio - Microsoft's PII detection and anonymisation SDK. Identifies and redacts sensitive data in text before it reaches an LLM or audit log.
- Credo AI - Comprehensive AI governance platform covering risk assessment, compliance mapping (EU AI Act, NIST AI RMF, ISO 42001), model cards, and ongoing monitoring across the AI lifecycle.
- OneTrust AI Governance - Inventory, risk assessment, and compliance controls for AI systems embedded in broader data governance and privacy programmes.
- Lumenova AI - AI lifecycle governance: risk assessment, explainability monitoring, and compliance reporting focused on model transparency and regulatory evidence.
- Protect AI - MLSecOps platform covering model scanning, supply chain security, and runtime protection for AI and ML systems.
- Robust Intelligence - AI security platform for validating model robustness and detecting adversarial inputs in production.
- HiddenLayer - AI detection and response platform. Monitors AI models for adversarial attacks, data extraction attempts, and policy violations.
- Datadog LLM Observability - Production monitoring for LLM applications: latency, cost, quality scoring, and trace capture integrated with existing Datadog infrastructure.
- Patronus AI - Automated evaluation and monitoring for LLMs in production. Detects hallucinations, toxicity, PII leakage, and custom policy violations.
- systemprompt-core - The MCP governance runtime. 30-crate Rust workspace handling authentication, authorisation, rate limiting, and logging for MCP server interactions. Published on crates.io under
systemprompt-*. - awesome-claude-code-security - Curated list focused on Claude Code hardening: MCP server security, secrets scanning, prompt injection detection, and red-teaming frameworks.
- awesome-claude-code - The canonical Claude Code community list covering tooling, hooks, slash-commands, agent skills, and workflows.
- Claude Code Documentation - Anthropic's documentation on the permissions model, CLAUDE.md configuration, MCP server setup, and hook system.
- MCP Specification - The Model Context Protocol specification. Understanding the protocol is prerequisite to governing it.
- Anthropic Cookbook - Reference implementations and patterns from Anthropic including agent architectures, tool use, and safety patterns.
- OPA Rego Playground - Browser-based environment for writing and testing OPA/Rego policies without local setup.
- Cedar Policy Language - AWS-designed authorisation language with formally verified semantics. Human-readable syntax built for per-request authorisation decisions at high throughput.
- Casbin - Multi-model access control library supporting ACL, RBAC with hierarchy and domain, ABAC, and RESTful models in 10+ languages.
- HashiCorp Sentinel - Policy-as-code framework for Terraform, Vault, Consul, and Nomad. Useful for governing infrastructure provisioned by AI agents.
- AWS Verified Permissions - Managed Cedar policy service on AWS. Centralised policy storage with sub-millisecond evaluation latency for agent action authorisation.
- Ory Keto - Open-source permission server implementing Google Zanzibar's relation-based access control model for fine-grained agent tool permissions.
- LangFuse - Open-source LLM observability. Full trace capture with spans, generations, scores, and costs. Self-hostable with integrations for LangChain, LlamaIndex, OpenAI, and Anthropic SDKs.
- OpenTelemetry - CNCF standard for distributed tracing, metrics, and logs. The vendor-neutral substrate for building agent observability pipelines.
- OpenLLMetry - OpenTelemetry-based instrumentation SDK for LLM applications. Traces LLM calls with standard OTel spans and integrates with existing observability stacks.
- Helicone - Open-source LLM observability proxy. Request logging, cost tracking, caching, and rate limiting via a single proxy endpoint. Self-hostable.
- Weights and Biases Weave - Tracing and evaluation for LLM applications with strong integrations for LangChain, LlamaIndex, OpenAI, and Anthropic.
- Portkey - AI gateway with unified API for 250+ LLMs, request tracing, semantic caching, load balancing, and budget controls.
- Evidently AI - Open-source ML and LLM monitoring. Detects data and model drift, generates monitoring reports, and evaluates LLM output quality.
- WhyLabs AI Observatory - AI observability platform monitoring LLM applications for drift, data quality issues, and policy violations in production.
- AI Incident Database - Searchable database of 700+ documented AI system failures and harms in deployment. Essential for building realistic threat models and risk assessments.
- Garak - NVIDIA's LLM vulnerability scanner. Probes deployed models for prompt injection, jailbreaks, data leakage, hallucination, and toxicity.
- PyRIT - Microsoft's Python Risk Identification Toolkit for automated red-teaming of generative AI systems including multi-turn and orchestrated agent attacks.
- PromptBench - Microsoft's unified evaluation framework for adversarial robustness of LLMs. Tests models against adversarial prompts at character, word, sentence, and semantic levels.
- promptmap - Automated prompt injection testing tool. Systematically tests LLM-integrated applications for injection vulnerabilities.
- Rebuff - Prompt injection detector using multi-layer defence: heuristics, LLM-based detection, VectorDB canary tokens, and model hardening signals.
- LLM Guard - Security toolkit for LLM interactions with input and output scanners for prompt injection, PII, toxicity, and sensitive data.
- Vigil - LLM prompt injection and security scanner. Detects injection attempts, jailbreaks, and sensitive keyword patterns in real time.
- Model Cards - Google's framework for documenting AI model characteristics, performance, and limitations. De-facto standard for transparent model disclosure.
- Hugging Face Model Cards - Implementation guide and templates for model cards on the Hugging Face Hub.
- Datasheets for Datasets - Microsoft Research framework for documenting dataset provenance, composition, collection process, and recommended uses.
- DVC - Git-like versioning for ML datasets and models. Reproducible pipelines, experiment tracking, and audit trail for training data and model artifacts.
- MLflow Model Registry - Centralised model store with versioning, stage transitions, and approval workflows.
- SLSA - Supply-chain Levels for Software Artifacts applied to ML models and training pipelines. Defines four assurance levels from basic to hermetic builds.
- Sigstore - Cryptographic signing infrastructure for software artifacts. Enables verification that a model came from a trusted build process.
- Great Expectations - Data quality validation framework. Define expectations for training and inference data and alert when data drifts outside governance bounds.
- Anthropic: Building Effective Agents - Anthropic's published guidance on safe agentic systems: minimal footprint, human-in-the-loop for high-stakes actions, and preference for reversible over irreversible actions.
- awesome-agentic-patterns - Curated collection of production agent patterns including sandboxing, credential management, human-in-the-loop workflows, and multi-agent coordination.
- 12-Factor Agents - Adaptation of the 12-factor app methodology for LLM agents. Covers configuration, state management, logging, and disposability in agentic contexts.
- HumanLayer - SDK for building human-in-the-loop workflows for AI agents. Wraps tool calls with approval gates, audit trails, and escalation paths.
- Lilian Weng: LLM-Powered Autonomous Agents - Comprehensive survey of agent architectures including planning, memory, tool use, and oversight mechanisms.
- NIST AI Resource Center - Central hub for NIST AI governance resources including AI RMF, TEVV guidance, and sector-specific playbooks.
- UK NCSC: Guidelines for Secure AI System Development - Co-authored by NCSC (UK), CISA (US), ACSC (Australia), and 15 other national cybersecurity agencies. Practical security guidance across the AI development lifecycle.
- Google Secure AI Framework - Google's framework for securing AI systems with six core elements covering foundations, detection, response, and standardisation.
- OpenSSF AI/ML Security Working Group - Open Source Security Foundation working group on security for AI and ML supply chains. Produces guidance on securing training pipelines and model artifacts.
- Partnership on AI - Multi-stakeholder organisation producing research and guidance on responsible AI development and deployment practices.
- EU AI Act Compliance Checker - Interactive tool for assessing whether a specific AI system falls under EU AI Act obligations and which requirements apply.
- Coursera: AI Governance Professional Certificate - Practical AI governance programme covering risk assessment, policy development, and compliance implementation.
- State of AI Governance Report - Annual enterprise survey of AI governance programme maturity, common gaps, and implementation patterns from Credo AI.
- SANS Institute AI Security Resources - SANS training and research on AI/ML security covering adversarial attacks, model security, and secure deployment practices.
- OWASP LLM AI Security and Governance Checklist - Practical checklist for teams deploying LLM-powered systems in production.
- awesome-mcp-servers - Comprehensive directory of MCP server implementations.
- AwesomeResponsibleAI - Academic and policy resources for responsible AI covering ethics, standards, and regulatory frameworks.