@DougTrajano DougTrajano commented Dec 19, 2025

This pull request introduces a new "Skills" system to Pydantic AI, enabling modular, progressive skill discovery and execution for agents. The main changes include the addition of the SkillsToolset and related types, updates to documentation and examples to demonstrate skill usage, and new skill definitions and scripts for practical use.

Warning: the PyYAML dependency was added to pydantic-ai-slim because Agent Skills use YAML frontmatter to structure skill metadata.
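For context, the kind of frontmatter parsing this dependency enables can be sketched as follows (the function name and field layout are illustrative, not the PR's actual implementation):

```python
import yaml  # PyYAML, the dependency added in this PR

def parse_frontmatter(text: str) -> tuple[dict, str]:
    """Split a SKILL.md file into YAML frontmatter metadata and a Markdown body."""
    if not text.startswith('---'):
        return {}, text
    # The frontmatter sits between the first two '---' delimiter lines.
    _, frontmatter, body = text.split('---', 2)
    return yaml.safe_load(frontmatter) or {}, body.strip()

metadata, body = parse_frontmatter(
    '---\nname: arxiv-search\ndescription: Search arXiv papers\n---\n# Usage\nRun the script.'
)
```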

References:

Skills Toolset Integration

  • Added SkillsToolset and supporting types (Skill, SkillMetadata, SkillResource, SkillScript, etc.) to the main package exports, enabling agents to discover and use skills dynamically. [1] [2] [3] [4] [5]
  • Created pydantic_ai/toolsets/skills/__init__.py with documentation and examples for building and managing agent skills.

Documentation and API Reference Updates

  • Updated API docs (docs/api/toolsets.md) and navigation (mkdocs.yml) to include the new Skills toolset and its members, ensuring clear guidance for users. [1] [2]
  • Added a comprehensive skill documentation example for Pydantic AI (pydanticai-docs/SKILL.md), detailing framework features and usage patterns.

Skill Example Implementation

  • Added an example skill for searching arXiv (arxiv-search/SKILL.md and arxiv_search.py), including usage instructions, argument descriptions, and output formatting. [1] [2]

Agent Example with Skills

  • Provided a new example (skills_agent.py) demonstrating how to create an agent with skills, list available skills, load instructions, and execute skill scripts.

These changes collectively enable agents to leverage domain-specific skills, improve extensibility, and provide clear documentation and examples to help users get started.

Copilot AI review requested due to automatic review settings December 19, 2025 21:49
Contributor

Copilot AI left a comment

Pull request overview

This pull request introduces a comprehensive Skills System to Pydantic AI, enabling agents to dynamically discover and utilize modular skill packages. The implementation follows Anthropic's Agent Skills patterns and provides progressive disclosure of capabilities through a standardized toolset interface.

Key Changes:

  • Added SkillsToolset with four core tools: list_skills(), load_skill(), read_skill_resource(), and run_skill_script()
  • Introduced skill discovery from filesystem with YAML frontmatter parsing for metadata
  • Implemented security measures including path traversal prevention and script execution timeouts
  • Added PyYAML as a required dependency to pydantic-ai-slim for parsing skill metadata
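The path-traversal prevention mentioned above can be sketched with pathlib (a minimal illustration; the function name is hypothetical, not the PR's actual code):

```python
from pathlib import Path

def resolve_skill_resource(skill_dir: Path, relative: str) -> Path:
    """Resolve a resource path, rejecting attempts to escape the skill directory."""
    base = skill_dir.resolve()
    candidate = (base / relative).resolve()
    # Path.is_relative_to (Python 3.9+) catches '../' traversal after resolution.
    if not candidate.is_relative_to(base):
        raise ValueError(f'resource {relative!r} escapes the skill directory')
    return candidate
```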

Reviewed changes

Copilot reviewed 16 out of 17 changed files in this pull request and generated 12 comments.

Summary per file:

  • uv.lock — Added PyYAML 6.0+ dependency lock for YAML frontmatter parsing
  • pydantic_ai_slim/pyproject.toml — Added PyYAML dependency to project requirements
  • tests/test_skills.py — Comprehensive test suite covering skill discovery, parsing, validation, and toolset integration (900 lines)
  • pydantic_ai_slim/pydantic_ai/toolsets/skills/_types.py — Type definitions for the Skill, SkillMetadata, SkillResource, and SkillScript dataclasses
  • pydantic_ai_slim/pydantic_ai/toolsets/skills/_exceptions.py — Custom exception classes for skill operations
  • pydantic_ai_slim/pydantic_ai/toolsets/skills/_discovery.py — Skill discovery, YAML parsing, and validation logic
  • pydantic_ai_slim/pydantic_ai/toolsets/skills/_toolset.py — Main SkillsToolset implementation with tool registration and execution
  • pydantic_ai_slim/pydantic_ai/toolsets/skills/__init__.py — Module exports and documentation
  • pydantic_ai_slim/pydantic_ai/toolsets/__init__.py — Added Skills toolset exports to the main toolsets module
  • pydantic_ai_slim/pydantic_ai/__init__.py — Added SkillsToolset to the main package exports
  • mkdocs.yml — Added the skills.md documentation page to navigation
  • examples/pydantic_ai_examples/skills_agent.py — Example demonstrating Skills integration with an agent
  • examples/pydantic_ai_examples/skills/pydanticai-docs/SKILL.md — Example skill providing Pydantic AI framework documentation
  • examples/pydantic_ai_examples/skills/arxiv-search/scripts/arxiv_search.py — Example Python script for arXiv paper search
  • examples/pydantic_ai_examples/skills/arxiv-search/SKILL.md — Example skill for searching the arXiv repository
  • docs/skills.md — Comprehensive documentation covering skill creation, usage patterns, and API reference (535 lines)
  • docs/api/toolsets.md — API documentation updates for Skills toolset types and functions


Author

DougTrajano commented Dec 19, 2025

@DouweM here is the PR to introduce agent skills natively in Pydantic AI. I did some more refactoring to align with the Pydantic AI codebase.

Question:

To effectively use Agent Skills, their definitions must be added to the system prompt. Currently, the dev experience is:

from pydantic_ai import Agent, SkillsToolset

# Initialize Skills Toolset with skill directories
skills_toolset = SkillsToolset(directories=["./skills"])

# Create agent with skills
agent = Agent(
    model='openai:gpt-4o',
    instructions="You are a helpful research assistant.",
    toolsets=[skills_toolset]
)

# Developer must explicitly add the skills system prompt using our helper function
@agent.system_prompt
async def add_skills_to_system_prompt() -> str:
    return skills_toolset.get_skills_system_prompt()

# Use agent - skills tools are automatically available
result = await agent.run(
    "What are the last 3 papers on arXiv about machine learning?"
)
print(result.output)

I suggest adding an optional get_system_prompt() method to the AbstractToolset interface. Modify the agent's system prompt collection flow in _agent_graph.py to automatically collect prompts from toolsets, then I can override this method in the SkillsToolset to return its skills prompt.

Collaborator

DouweM commented Dec 19, 2025

I suggest adding an optional get_system_prompt() method to the AbstractToolset interface. Modify the agent's system prompt collection flow in _agent_graph.py to automatically collect prompts from toolsets, then I can override this method in the SkillsToolset to return its skills prompt.

@DougTrajano Good idea, call it get_instructions and have it take the run_context please, like get_tools!

Note that we also have an instructions field on MCPServer already: #3431. We could start returning that from get_instructions(), but I think it'd have to be opt-in with a flag so it's not a potentially-surprising/undesirable change in behavior.

I'll give the PR a proper review on Monday, thanks for working on this!

Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 16 out of 17 changed files in this pull request and generated 10 comments.



Collaborator

@DouweM DouweM left a comment


Couldn't help myself and left a few quick comments ahead of a full review next week :) Main point is that I'd like this to be less hard-coded to use the local environment


def __init__(
self,
directories: list[str | Path],
Collaborator

I don't want this to be hard-coded to having skills on the local file system. Can we support programmatically passing in skills?

The Anthropic API may be useful for seeing how they represent skills, since they obviously can't actually do local file reads

Author

@DougTrajano DougTrajano Dec 20, 2025

Hmm, good suggestion. Maybe we should create a SkillSource concept that interacts with SkillsToolset via standard methods such as discover(), read_resource(), etc.

This allows us to have LocalSkillSource, RemoteSkillSource, etc. It also allows developers to implement their own logic.

It will take some time, but I believe I can give it to you as a Christmas gift. 🎅🏼


Just a thought from the peanut gallery, appreciate all the work getting done on this PR.

I think there will really need to be support for multiple sources of skills, because workflows that mix and match those sources can be pretty common for prompts referencing multiple skills. If you look at the skills Anthropic released out of the box, they could be pretty useful pieces of boilerplate that are happily referenced from a hosted source (currently the Claude API).

If my prompt is "can you update a report using the custom-business-report skill and then create a pdf using the pdf skill", the PDF skill is useful boilerplate that I don't want to pull and package alongside my custom skill. What I don't quite understand yet, and what worries me, is how effective an agent is going to be if it needs to go back and forth between a tool-calling implementation of skills (Pydantic AI) and a filesystem implementation (Anthropic/Claude native) in a single loop.

Do you anticipate references in SKILL.md to just be paths to the reference files like in Anthropic's implementation? Or would it be "use read_skill_resource tool to read /path/to/resource"? It would be great if you could just list relative paths like in Anthropic's and the agent just somehow knew it was working with a filesystem implementation or a tool-calling implementation and act appropriately. It would mean skills are fully portable between the two implementations without modification.

Author

Do you anticipate references in SKILL.md to just be paths to the reference files like in Anthropic's implementation? Or would it be "use read_skill_resource tool to read /path/to/resource"? It would be great if you could just list relative paths like in Anthropic's and the agent just somehow knew it was working with a filesystem implementation or a tool-calling implementation and act appropriately. It would mean skills are fully portable between the two implementations without modification.

Check my proposal in this comment #3780 (comment)

Essentially, we will implement an interface for SkillsSource that handles the logic for reading resources when the SkillsToolset requires them. With this approach, we should be able to support file paths or even URLs. Developers can also define custom logic by modifying read_skill_resource() as needed.
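A minimal version of such an interface might look like this (names like SkillSource and InMemorySkillSource are illustrative, not the final API):

```python
from typing import Protocol

class SkillSource(Protocol):
    """The interface SkillsToolset would program against."""

    def list_skills(self) -> list[str]: ...
    def read_resource(self, skill: str, path: str) -> str: ...

class InMemorySkillSource:
    """A trivial source backed by a dict, e.g. skills already fetched from a URL."""

    def __init__(self, skills: dict[str, dict[str, str]]):
        self._skills = skills

    def list_skills(self) -> list[str]:
        return sorted(self._skills)

    def read_resource(self, skill: str, path: str) -> str:
        # A LocalSkillSource would read from disk here; a RemoteSkillSource via HTTP.
        return self._skills[skill][path]

source: SkillSource = InMemorySkillSource(
    {'arxiv-search': {'SKILL.md': '---\nname: arxiv-search\n---\n# Usage'}}
)
```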


This sounds like a great approach. Having fully portable skills that work across different skill-supporting clients is important. The filesystem implementation is effectively the standard because that's how this started, but a tool-based implementation offers the benefit of remotely hosted (centralized) skill repositories. Things get messy if we have to tweak the SKILL.md and supporting resources when moving between a filesystem client and a tools client... it's got to be the same SKILL.md for both. Appreciate the work here!

Collaborator

It would be great if you could just list relative paths like in Anthropic's and the agent just somehow knew it was working with a filesystem implementation or a tool-calling implementation and act appropriately.

Note that the filesystem implementation looks like a tool calling implementation to the LLM as well: reading a file is calling a tool like read_file(...) or run_sh('cat ...'). So if we give our skill tools names and descriptions that the model will understand are equivalent (like read_skill_file), it should do just as well.

It would mean skills are fully portable between the two implementations without modification.

I agree that should be the case!

Collaborator

it needs to go back and forth between a tool-calling implementation of skills (Pydantic AI) and a filesystem implementation (Anthropic/Claude native) in a single loop.

@jbnitorum Can you elaborate on what you mean here, and what the specific thing is that you're concerned about Pydantic AI handling well?

Do you mean that during a single agent run, some skills should be handled by Pydantic AI, and some should be handled by Anthropic? I imagine the model would get very confused having 2 competing sources of and tools around skills. I imagine that it would be more like #3212, where Pydantic AI would decide based on model capabilities whether to use the framework or API implementation.


Yes, that was what I was implying, but taking a step back, I think it's better solved with a different approach.

I was highlighting the usefulness of some of the built-in skills and thinking about it conceptually through more of a Claude Web/Desktop lens, where all enabled skills get exposed to the system prompt natively. When accessing via the API, though, any skills available in the agent run must be explicitly defined, even the built-in ones. So the simple built-in file-type skills aren't just "magically" always there like in the chat clients.

Given that an explicit definition needs to be present anyway, it seems much simpler to just grab the necessary built-in skills off GitHub at runtime and treat them like any other local skill, as opposed to trying to orchestrate something where some skills come from Pydantic AI and others live within calls to the Anthropic API. I think it's a pretty safe bet that any built-in skills will continue to be open-sourced on GitHub.

A helper function to dig skills out of repos would work, used like below. I'll leave it to @DougTrajano if he has a more elegant solution, maybe build something into SkillSource itself for common remote storage types, Github, S3, etc.

import sys

from pydantic_ai.toolsets.skills import SkillsSource, SkillsToolset, GitHubSkill

skills_source = SkillsSource(
    sources=[GitHubSkill(repo='anthropics/skills', path='skills/pdf', dest='local_skills/pdf', commit='abc123')],
    executor=sys.executable
)

skills_toolset = SkillsToolset(
    sources=[skills_source]
)

validate: bool = True,
id: str | None = None,
script_timeout: int = 30,
python_executable: str | Path | None = None,
Collaborator

I think if this is None, we should NOT run scripts, as running them on the local system is potentially dangerous and requires the user to know what they're doing.

It could also be useful to let the user provide a function to execute the tool (instead of a Python path?) so that they can plug in some other (possibly remote, possibly sandboxed) execution environment.

Author

@DougTrajano DougTrajano Dec 20, 2025

Ok, good point. I think we should design it as a pluggable system with three main components:

  • SkillsToolset: Interface between agent and skills source.
  • SkillsSource: Previously directories, it will be an interface that we can extend for LocalSkillsSource, RemoteSkillsSource, etc.
    • It provides standard methods to interact with SkillsToolset such as list_skills(), load_skill(), etc.
  • SkillScriptExecutor: An execution environment representation (can be a local Python path as today, remote, or sandboxed environment) associated with a given SkillsSource.
    • The SkillScriptExecutor receives the script invocation from SkillsToolset.run_skill_script() and executes it.
    • Limitation: We will not support different SkillScriptExecutor instances in the same SkillsSource. I don't think it's required, and it increases complexity because we would need to define the executor inside each skill. I don't want to overcomplicate what amounts to a few Markdown files and Python scripts.
      • If you require multiple SkillScriptExecutor instances within the same SkillsSource, consider reorganizing your skills. It might be beneficial to split them into two separate SkillsSource objects, enabling you to define two different execution environments.

What do you think?

```mermaid
flowchart TB
    subgraph Agent["Agent Runtime"]
        A[Agent] --> ST[SkillsToolset]
    end

    subgraph Toolset["SkillsToolset"]
        ST --> |"delegates to appropriate source"| Router{Route by skill_name}
    end

    subgraph Source1["LocalSkillsSource"]
        Router --> LSS[LocalSkillsSource]
        LSS --> |"list_skills()"| FileSystem1[(File System)]
        LSS --> |"load_skill()"| FileSystem1
        LSS --> |"read_skill_resource()"| FileSystem1
        LSS --> |"run_skill_script()"| LSE[LocalSkillScriptExecutor]
        LSE --> Sub[anyio.run_process]
    end

    subgraph Source2["RemoteSkillsSource"]
        Router --> RSS[RemoteSkillsSource]
        RSS --> |"list_skills()"| API[(Remote API)]
        RSS --> |"load_skill()"| API
        RSS --> |"read_skill_resource()"| API
        RSS --> |"run_skill_script()"| RSE[RemoteSkillScriptExecutor]
        RSE --> RAPI[Remote Execution API]
    end

    subgraph Source3["CustomSkillSource"]
        Router --> CSS[CustomSkillsSource]
        CSS --> |"run_skill_script()"| SBE[SandboxedExecutor]
        SBE --> Container[Docker/WASM]
    end
```
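The routing described in the diagram above can be sketched as follows (all class names are hypothetical stand-ins for the proposal, not the final implementation):

```python
class SkillsToolset:
    """Sketch of routing: each tool call is delegated to the source owning the skill."""

    def __init__(self, sources):
        self._sources = sources

    def _source_for(self, skill_name):
        for source in self._sources:
            if skill_name in source.list_skills():
                return source
        raise KeyError(f'unknown skill: {skill_name}')

    def run_skill_script(self, skill_name, script, args):
        source = self._source_for(skill_name)
        # The executor travels with the source, as proposed above.
        return source.executor(skill_name, script, args)

class _ListSource:
    """Stand-in for LocalSkillsSource/RemoteSkillsSource in this sketch."""

    def __init__(self, names, executor):
        self._names, self.executor = names, executor

    def list_skills(self):
        return self._names

toolset = SkillsToolset([
    _ListSource(['pdf'], lambda skill, script, args: f'local: {script}'),
    _ListSource(['xlsx'], lambda skill, script, args: f'remote: {script}'),
])
```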

Collaborator

That mostly makes sense, but I have a few questions that might be cleared up by just seeing the code that makes this work the way you envision :)

  • Besides skill sources (which I agree are useful for local and remote URL-based discovery) I'd expect to also be able to build my own Skill objects and pass those in somehow
  • Why is the executor tied to the skill source and does each skill source possibly have a separate executor? I see the source and the executor as orthogonal, 2 separate things pluggable into the toolset, but otherwise not tied together, but I may be missing something.
  • Do we need a full executor class, or could it be a simple callable taking the script to be run?
  • If the scripts need an execution environment that need to be stable across runs, we should pass the RunContext into the callable so it can use deps to read things like container_id
  • I wonder if we could design the executors to also work with Programmatic Tool Calling (i.e. CodeAct, Code Mode), which similarly needs a pluggable execution environment, but also needs a way to inject code context (i.e. Python functions that will be available) + let the code return tool calls back to the agent.

Author

Hey buddy, here are my comments:

  1. Let me implement this in some way.

  2. By tying the execution environment to the SkillsSource, we can support multiple execution environments without requiring a Pydantic-specific definition of agent skills. Suppose I have a local folder named ./skills with these two skills: pdf and xlsx.

    • The easiest way to use them is to just read the entire ./skills folder.
    import sys
    from pydantic_ai.toolsets.skills import SkillsSource, SkillsToolset
    
    skills_source = SkillsSource(
        sources=["./skills"],
        executor=sys.executable  # default to using the current Python interpreter
    )
    
    skills_toolset = SkillsToolset(
        sources=[skills_source]
    )
    # both pdf and xlsx skills are available within the skills_toolset 
    # in the same execution environment
    • However, suppose I want to execute the PDF skills in a different environment (e.g., a Docker container with specific libraries installed). In that case, I just need to separate the skills into two sources. It does not require any change to the skill files to accomplish this.
    import sys
    from pydantic_ai.toolsets.skills import SkillsSource, SkillsToolset
    
    
    def pdf_executor(skill, script, args):
        # Custom logic to execute PDF skills
        pass
    
    pdf_source = SkillsSource(
        sources=["./skills/pdf"],
    executor=pdf_executor
    )
    
    xlsx_source = SkillsSource(
        sources=["./skills/xlsx"],
        executor=sys.executable  # default to using the current Python interpreter
    )
    
    skills_toolset = SkillsToolset(
        sources=[pdf_source, xlsx_source]
    )
    # PDF skills are executed using pdf_executor, 
    # while XLSX skills use the current Python interpreter
  3. We will provide an executor protocol and two classes: LocalSkillExecutor and CallableSkillScriptExecutor. The SkillsToolset will accept either the SkillScriptExecutor protocol, such as LocalSkillExecutor, or a simple callable/function with specific arguments: skill, script, and args. We will then wrap this function into the CallableSkillScriptExecutor.

  4. I think that advanced use cases may require some dependencies (secrets, reusable HTTP clients, etc.). I will add the context to the executor then.

  5. We should be able to accomplish this by creating a new executor class. I am currently developing the proposal above. I will commit it soon. I believe it's better to handle it in a separate PR by extending SkillScriptExecutor.
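The callable-wrapping idea in item 3 might be sketched like this (protocol and class names follow the proposal but are not final):

```python
from typing import Callable, Protocol

class SkillScriptExecutor(Protocol):
    def execute(self, skill: str, script: str, args: list[str]) -> str: ...

class CallableSkillScriptExecutor:
    """Adapts a plain function to the executor protocol."""

    def __init__(self, fn: Callable[[str, str, list[str]], str]):
        self._fn = fn

    def execute(self, skill: str, script: str, args: list[str]) -> str:
        return self._fn(skill, script, args)

def coerce_executor(value):
    # Accept either an executor object or a bare callable, per item 3.
    if callable(value) and not hasattr(value, 'execute'):
        return CallableSkillScriptExecutor(value)
    return value

executor = coerce_executor(lambda skill, script, args: f'{skill}/{script} ok')
```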

return ''

lines = [
'# Skills',
Collaborator

Default prompts should live in constants at the top of the file and should be overridable, either through the constructor, or at least the in-development #3656

Author

@DougTrajano DougTrajano Dec 20, 2025

Perfect, it's ready for your review.

Author

I suggest adding an optional get_system_prompt() method to the AbstractToolset interface. Modify the agent's system prompt collection flow in _agent_graph.py to automatically collect prompts from toolsets, then I can override this method in the SkillsToolset to return its skills prompt.

@DougTrajano Good idea, call it get_instructions and have it take the run_context please, like get_tools!

Note that we also have an instructions field on MCPServer already: #3431. We could start returning that from get_instructions(), but I think it'd have to be opt-in with a flag so it's not a potentially-surprising/undesirable change in behavior.

I'll give the PR a proper review on Monday, thanks for working on this!

Perfect, it's done. I also updated the MCPServer instructions property with a deprecation warning pointing users to the get_instructions() method instead.

Author

Just for reference: the PR that introduced Agent Skills in VS Code: microsoft/vscode#278445


Successfully merging this pull request may close these issues.

Support Skills with any model