-
Notifications
You must be signed in to change notification settings - Fork 12
Add Skills capability for progressive tool loading #183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft
DouweM
wants to merge
4
commits into
main
Choose a base branch
from
capability/skills
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Draft
Changes from all commits
Commits
Show all changes
4 commits
Select commit
Hold shift + click to select a range
eeb2b07
Add Skills capability for progressive tool loading
DouweM f6d50e5
Address audit findings: unload_skill, fuzzy search, agentskills.io to…
DouweM f0927a2
Fix coverage: add tests for frontmatter parsing and FunctionToolset b…
DouweM e172c2a
Restructure skills module into package following project conventions …
mtessar File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,75 @@ | ||
| # Skills Capability | ||
|
|
||
| Closes #22. Partially addresses #40 (deferred tool loading via search + load pattern). | ||
|
|
||
| ## Summary | ||
|
|
||
| Implements a `Skills` capability that enables progressive tool loading: agents discover | ||
| skills via `search_skills(query)` and activate them via `load_skill(name)`, keeping | ||
| unloaded tools hidden from the model's context window. | ||
|
|
||
| ## Design | ||
|
|
||
| ### `Skill` dataclass | ||
|
|
||
| A skill bundles a name, description, optional instructions, and a set of tools: | ||
|
|
||
| ```python | ||
| Skill( | ||
| name='math', # lowercase, hyphens allowed | ||
| description='Arithmetic', # shown in search results | ||
| tools=[add, subtract], # callables or a FunctionToolset | ||
| instructions='...', # included when skill is loaded | ||
| ) | ||
| ``` | ||
|
|
||
| Tools can be plain callables (registered as `tool_plain`) or a pre-built | ||
| `FunctionToolset` for more control. | ||
|
|
||
| ### `Skills` capability | ||
|
|
||
| An `AbstractCapability` subclass providing: | ||
|
|
||
| | Method | Purpose | | ||
| |--------|---------| | ||
| | `get_instructions()` | Tells the agent about the skill catalog | | ||
| | `get_toolset()` | Registers `search_skills`, `load_skill`, and all skill tools | | ||
| | `prepare_tools()` | Hides tools from unloaded skills each model request | | ||
| | `for_run()` | Isolates per-run loaded-skills state | | ||
| | `from_spec(dirs=[...])` | Loads markdown skills from directories (Tier S serializable) | | ||
|
|
||
| ### Markdown skill files | ||
|
|
||
| Skills can be defined as `.md` files with YAML frontmatter: | ||
|
|
||
| ```markdown | ||
| --- | ||
| name: my-skill | ||
| description: Does something useful | ||
| --- | ||
| Detailed instructions for the agent... | ||
| ``` | ||
|
|
||
| Loaded via `load_skills_from_directory(path)` or `Skills.from_spec(dirs=[path])`. | ||
| Markdown skills are pure knowledge (no tools) -- pair with Python skills as needed. | ||
|
|
||
| ### Progressive disclosure flow | ||
|
|
||
| 1. Agent sees `search_skills` and `load_skill` tools (always visible) | ||
| 2. Agent calls `search_skills("math")` -- gets name + description matches | ||
| 3. Agent calls `load_skill("math")` -- gets instructions + tool names, tools become visible | ||
| 4. Agent can now call `add(1, 2)` etc. | ||
|
|
||
| ## Files changed | ||
|
|
||
| - `src/pydantic_harness/skills.py` -- `Skill`, `Skills`, `load_skills_from_directory` | ||
| - `src/pydantic_harness/__init__.py` -- exports | ||
| - `tests/test_skills.py` -- 47 tests covering all code paths | ||
| - `pyproject.toml` -- pyright test override for private usage | ||
|
|
||
| ## Prior art considered | ||
|
|
||
| - vstorm-co/pydantic-ai-skills (SkillsCapability + SkillsToolset) | ||
| - OpenAI Agents SDK ToolSearchTool | ||
| - Anthropic Tool Search | ||
| - Microsoft Agent Framework SKILL.md spec |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| """Skills capability for progressive tool loading.""" | ||
|
|
||
| from pydantic_harness.skills._capability import Skills | ||
| from pydantic_harness.skills._toolset import Skill, load_skills_from_directory | ||
|
|
||
| __all__ = ['Skill', 'Skills', 'load_skills_from_directory'] |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,212 @@ | ||
| """Skills capability -- progressive skill discovery and loading.""" | ||
|
|
||
| from __future__ import annotations | ||
|
|
||
| from dataclasses import dataclass, field | ||
| from typing import Any | ||
|
|
||
| from pydantic_ai._run_context import AgentDepsT, RunContext | ||
| from pydantic_ai.capabilities.abstract import AbstractCapability | ||
| from pydantic_ai.tools import ToolDefinition | ||
| from pydantic_ai.toolsets.function import FunctionToolset | ||
|
|
||
| from pydantic_harness.skills._toolset import Skill, load_skills_from_directory | ||
|
|
||
|
|
||
| @dataclass | ||
| class Skills(AbstractCapability[AgentDepsT]): | ||
| """Capability for progressive skill discovery and loading. | ||
|
|
||
| Provides ``search_skills``, ``load_skill``, and ``unload_skill`` | ||
| meta-tools. Tools belonging to registered skills are hidden until | ||
| the agent explicitly loads the skill that owns them. | ||
|
|
||
| Per-run state (which skills are loaded) is isolated via | ||
| :meth:`for_run`. | ||
|
|
||
| Example:: | ||
|
|
||
| from pydantic_ai import Agent | ||
| from pydantic_harness.skills import Skill, Skills | ||
|
|
||
| def add(a: int, b: int) -> int: | ||
| \"\"\"Add two numbers.\"\"\" | ||
| return a + b | ||
|
|
||
| math_skill = Skill( | ||
| name='math', | ||
| description='Basic arithmetic operations', | ||
| tools=[add], | ||
| ) | ||
| agent = Agent('openai:gpt-4o', capabilities=[Skills(skills=[math_skill])]) | ||
| """ | ||
|
|
||
| skills: list[Skill] = field(default_factory=lambda: list[Skill]()) | ||
| """Registered skills.""" | ||
|
|
||
| _loaded_skill_names: set[str] = field(default_factory=lambda: set[str](), init=False, repr=False) | ||
| """Names of skills that have been loaded in the current run (per-run state).""" | ||
|
|
||
| @classmethod | ||
| def get_serialization_name(cls) -> str | None: # noqa: D102 | ||
| return 'Skills' | ||
|
|
||
| @classmethod | ||
| def from_spec(cls, *args: Any, **kwargs: Any) -> Skills[Any]: | ||
| """Create from spec arguments. | ||
|
|
||
| Accepts ``dirs`` (list of directory paths) to load markdown skills. | ||
| """ | ||
| dirs: list[str] = kwargs.pop('dirs', []) or list(args) | ||
| all_skills: list[Skill] = [] | ||
| for d in dirs: | ||
| all_skills.extend(load_skills_from_directory(d)) | ||
| return cls(skills=all_skills, **kwargs) | ||
|
|
||
| async def for_run(self, ctx: RunContext[AgentDepsT]) -> Skills[AgentDepsT]: | ||
| """Return a fresh copy with empty loaded-skills state.""" | ||
| clone: Skills[AgentDepsT] = Skills(skills=self.skills) | ||
| return clone | ||
|
|
||
| def get_instructions(self) -> str | None: | ||
| """Provide baseline instructions for skill discovery.""" | ||
| if not self.skills: | ||
| return None | ||
| return ( | ||
| 'You have access to a skill catalog. ' | ||
| 'Use `search_skills` to find relevant skills by keyword, ' | ||
| 'then `load_skill` to activate a skill and make its tools available. ' | ||
| 'Use `unload_skill` when you no longer need a skill, to free context. ' | ||
| "Only loaded skills' tools appear in your tool list." | ||
| ) | ||
|
|
||
| def get_toolset(self) -> FunctionToolset[AgentDepsT] | None: | ||
| """Build the toolset containing meta-tools and all skill tools.""" | ||
| if not self.skills: | ||
| return None | ||
|
|
||
| toolset: FunctionToolset[AgentDepsT] = FunctionToolset() | ||
|
|
||
| # Register meta-tools | ||
| toolset.add_function(self._search_skills, takes_ctx=False, name='search_skills') | ||
| toolset.add_function(self._load_skill, takes_ctx=False, name='load_skill') | ||
| toolset.add_function(self._unload_skill, takes_ctx=False, name='unload_skill') | ||
|
|
||
| # Register each skill's tools (they will be hidden until loaded) | ||
| for skill in self.skills: | ||
| if isinstance(skill.tools, FunctionToolset): | ||
| for tool in skill.tools.tools.values(): | ||
| toolset.add_tool(tool) | ||
| else: | ||
| for fn in skill.tools: | ||
| toolset.add_function(fn, takes_ctx=False) | ||
|
|
||
| return toolset | ||
|
|
||
| async def prepare_tools( | ||
| self, | ||
| ctx: RunContext[AgentDepsT], | ||
| tool_defs: list[ToolDefinition], | ||
| ) -> list[ToolDefinition]: | ||
| """Hide tools belonging to skills that have not been loaded yet.""" | ||
| # Build set of tool names that should be hidden | ||
| hidden: set[str] = set() | ||
| for skill in self.skills: | ||
| if skill.name not in self._loaded_skill_names: | ||
| hidden.update(skill.tool_names()) | ||
|
|
||
| # Always keep meta-tools visible | ||
| meta_tools = {'search_skills', 'load_skill', 'unload_skill'} | ||
|
|
||
| return [td for td in tool_defs if td.name in meta_tools or td.name not in hidden] | ||
|
|
||
| # -- Meta-tool implementations -- | ||
|
|
||
| def _search_skills(self, query: str) -> list[dict[str, str]]: | ||
| """Search available skills by keyword. | ||
|
|
||
| Returns a list of matching skills with their name, description, | ||
| and whether they are currently loaded, ranked by relevance. | ||
|
|
||
| The query is split into words and each word is matched | ||
| case-insensitively against the skill name and description. | ||
| Skills matching at least one word are returned, ordered by the | ||
| number of matching words (most relevant first). | ||
|
|
||
| Args: | ||
| query: A keyword or phrase to search for in skill names and descriptions. | ||
| """ | ||
| words = query.lower().split() | ||
| if not words: | ||
| return [] | ||
|
|
||
| scored: list[tuple[int, Skill]] = [] | ||
| for skill in self.skills: | ||
| haystack = f'{skill.name} {skill.description}'.lower() | ||
| matches = sum(1 for w in words if w in haystack) | ||
| if matches: | ||
| scored.append((matches, skill)) | ||
|
|
||
| # Sort by match count descending, then by name for stability | ||
| scored.sort(key=lambda pair: (-pair[0], pair[1].name)) | ||
|
|
||
| return [ | ||
| { | ||
| 'name': skill.name, | ||
| 'description': skill.description, | ||
| 'loaded': 'yes' if skill.name in self._loaded_skill_names else 'no', | ||
| } | ||
| for _, skill in scored | ||
| ] | ||
|
|
||
| def _load_skill(self, name: str) -> str: | ||
| """Load a skill by name, making its tools available. | ||
|
|
||
| After loading, the skill's tools will appear in subsequent tool | ||
| lists and any associated instructions will be included. | ||
|
|
||
| Args: | ||
| name: The exact name of the skill to load (as returned by ``search_skills``). | ||
| """ | ||
| skill = self._find_skill(name) | ||
| if skill is None: | ||
| available = ', '.join(s.name for s in self.skills) | ||
| return f'Skill {name!r} not found. Available skills: {available}' | ||
|
|
||
| self._loaded_skill_names.add(name) | ||
|
|
||
| parts = [f'Skill {name!r} loaded.'] | ||
| tool_names = skill.tool_names() | ||
| if tool_names: | ||
| parts.append(f'Available tools: {", ".join(tool_names)}') | ||
| if skill.instructions: | ||
| parts.append(f'Instructions:\n{skill.instructions}') | ||
| return '\n'.join(parts) | ||
|
|
||
| def _unload_skill(self, name: str) -> str: | ||
| """Unload a skill by name, removing its tools from the context. | ||
|
|
||
| Use this when you no longer need a skill's tools, to free up | ||
| space in the context window. | ||
|
|
||
| Args: | ||
| name: The exact name of the skill to unload. | ||
| """ | ||
| skill = self._find_skill(name) | ||
| if skill is None: | ||
| available = ', '.join(s.name for s in self.skills) | ||
| return f'Skill {name!r} not found. Available skills: {available}' | ||
|
|
||
| if name not in self._loaded_skill_names: | ||
| return f'Skill {name!r} is not currently loaded.' | ||
|
|
||
| self._loaded_skill_names.discard(name) | ||
| return f'Skill {name!r} unloaded. Its tools are no longer available.' | ||
|
|
||
| # -- Helpers -- | ||
|
|
||
| def _find_skill(self, name: str) -> Skill | None: | ||
| for skill in self.skills: | ||
| if skill.name == name: | ||
| return skill | ||
| return None | ||
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚩 for_run clone shares skills list reference — meta-tool binding depends on get_ re-extraction*
The
for_runmethod atsrc/pydantic_harness/skills.py:200creates a clone viaSkills(skills=self.skills), sharing the sameskillslist by reference. The critical design question is whetherget_toolset()is re-called on the clone afterfor_run. TheAbstractCapability.for_rundocstring says it is "Called once per run, before get_*() re-extraction", which impliesget_toolset()is indeed re-invoked on the clone. If so, the meta-tools (_search_skills,_load_skill,_unload_skill) would be bound methods of the clone, and_loaded_skill_namesmutations during a run would correctly affect the same instance thatprepare_toolschecks. This is correct under the documented lifecycle. However, if any future pydantic-ai version changes the re-extraction behavior, this would silently break — the meta-tools would mutate the original instance's state whileprepare_toolsreads from the clone's state, making skill loading appear to have no effect.Was this helpful? React with 👍 or 👎 to provide feedback.