Add Skills capability for progressive tool loading#183

Draft
DouweM wants to merge 4 commits into main from capability/skills

@DouweM
Contributor

@DouweM DouweM commented Apr 10, 2026

Summary

  • Implements a Skills capability (AbstractCapability subclass) that enables progressive tool loading: agents discover skills via search_skills(query) and activate them via load_skill(name), keeping unloaded tools hidden from the model's context window
  • Skills can be defined in Python (with callable tools or FunctionToolset) or loaded from markdown files with YAML frontmatter (pure knowledge packages)
  • Per-run state isolation via for_run(), spec-serializable via from_spec(dirs=[...]), and tool visibility controlled through the prepare_tools hook
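The progressive-loading idea in the bullets above can be sketched in plain Python. This is a minimal illustration, not the PR's implementation: `SkillSketch`, `SkillsSketch`, and `visible_tools` are hypothetical names, and the real capability filters tools through the `prepare_tools` hook rather than a standalone method.

```python
from dataclasses import dataclass, field


@dataclass
class SkillSketch:
    """Hypothetical stand-in for a skill: a name, a description, and its tool names."""

    name: str
    description: str
    tool_names: list[str] = field(default_factory=list)


@dataclass
class SkillsSketch:
    """Hypothetical registry that hides the tools of skills not yet loaded."""

    skills: list[SkillSketch]
    loaded: set[str] = field(default_factory=set)

    def search_skills(self, query: str) -> list[str]:
        # Case-insensitive substring match on name and description.
        q = query.lower()
        return [
            s.name
            for s in self.skills
            if q in s.name.lower() or q in s.description.lower()
        ]

    def load_skill(self, name: str) -> str:
        if name not in {s.name for s in self.skills}:
            return f'Unknown skill: {name}'
        self.loaded.add(name)
        return f'Loaded skill: {name}'

    def visible_tools(self, all_tool_names: list[str]) -> list[str]:
        # Tools that belong to no skill always stay visible; skill tools are
        # visible only once at least one owning skill has been loaded.
        skill_tools = {t for s in self.skills for t in s.tool_names}
        loaded_tools = {
            t for s in self.skills if s.name in self.loaded for t in s.tool_names
        }
        return [
            t for t in all_tool_names if t not in skill_tools or t in loaded_tools
        ]
```

Before any `load_skill` call, only the meta-tools and non-skill tools would reach the model; each load widens the visible set by that skill's tools.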

Closes #22. Partially addresses #40.

Test plan

  • 47 new tests covering all code paths (48 total with existing placeholder test)
  • Skill dataclass: name validation, tool name extraction, defaults
  • Markdown parsing: frontmatter extraction, error cases, multi-line bodies
  • Directory loading: .md discovery, non-md filtering, empty dirs
  • Skills capability: instructions, toolset assembly, serialization
  • Meta-tools: search matching/case-insensitivity, load success/failure, loaded status
  • prepare_tools: hides unloaded skill tools, shows loaded, preserves non-skill tools
  • for_run: state isolation between runs
  • from_spec: directory-based construction
  • All passing: ruff lint, ruff format, pyright strict, pytest

🤖 Generated with Claude Code

DouweM and others added 3 commits April 2, 2026 05:27
Implements a Skills capability (AbstractCapability subclass) that lets agents
discover and load skill packages on demand, preserving context window by
hiding unloaded tools. Provides search_skills and load_skill meta-tools,
supports both Python-defined and markdown-based skills.

Closes #22. Partially addresses #40.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lerance

- Add `unload_skill(name)` meta-tool to remove a skill's tools from the
  loaded set, freeing context window space
- Improve `search_skills` with word-boundary matching: split query into
  words, match each against name/description, rank results by match count
- Document that unknown frontmatter keys are silently ignored for
  agentskills.io compatibility (already worked, now explicit + tested)
- Add 9 new tests covering all new behavior
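One way to read the search change described above (split the query into words, match each against name and description, rank by match count) is sketched below. The function name `rank_skills`, the `name -> description` dict shape, and the whole-token matching rule are assumptions for illustration; the commit's actual matching logic may differ.

```python
import re


def _tokens(text: str) -> set[str]:
    # Lowercased alphanumeric tokens, so 'pdf-tools' yields {'pdf', 'tools'}.
    return set(re.findall(r'[a-z0-9]+', text.lower()))


def rank_skills(query: str, skills: dict[str, str]) -> list[str]:
    """Return skill names ordered by how many query words they match.

    `skills` maps a skill name to its description (hypothetical shape).
    """
    query_words = _tokens(query)
    scored: list[tuple[int, str]] = []
    for name, description in skills.items():
        haystack = _tokens(f'{name} {description}')
        score = len(query_words & haystack)
        if score:
            scored.append((score, name))
    # Highest match count first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored]
```

Matching whole tokens rather than substrings means a query for `sql` would not hit a description that merely contains `mysqldump`, which is the usual motivation for word-boundary matching.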

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ranch

- Add test for frontmatter lines without colon (line 134)
- Add test for get_toolset with FunctionToolset skills (lines 230-231)
- Mark tool stub functions as `# pragma: no cover` (never called, only registered)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 2 potential issues.

View 4 additional findings in Devin Review.

Comment thread src/pydantic_harness/skills.py Outdated
raise ValueError(f'Missing YAML frontmatter in {source}')

# Find closing delimiter
end = stripped.find('---', 3)
🟡 Frontmatter closing delimiter search matches --- inside values, not just on its own line

_parse_skill_markdown at src/pydantic_harness/skills.py:117 uses stripped.find('---', 3) to locate the closing frontmatter delimiter. This performs a simple substring search, so it will match --- appearing within a frontmatter value (e.g., description: A---B) rather than requiring --- to be on its own line, which is the standard YAML frontmatter convention. When triggered, the frontmatter is truncated at the embedded ---, the description (or other value) is silently cut short, and the remainder is incorrectly treated as the body/instructions.

Example of incorrect parse

Input:

---
name: my-skill
description: Long---description
---
Body text

find('---', 3) matches the --- inside Long---description at index 36 instead of the actual closing delimiter. Result: description is parsed as Long, and body becomes description\n---\nBody text.

Suggested change
- end = stripped.find('---', 3)
+ end = stripped.find('\n---', 3)
+ if end == -1:
+     raise ValueError(f'Unclosed YAML frontmatter in {source}')
+ frontmatter_text = stripped[3:end].strip()
+ body = stripped[end + 4 :].strip() or None
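As a complement to the suggestion above, a line-based split accepts a closing delimiter only when `---` is alone on its line. This is a sketch under assumptions: `split_frontmatter` is a hypothetical helper, not the PR's `_parse_skill_markdown`, and it is slightly stricter than searching for `'\n---'`, which would still match a line that merely starts with `---`.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Split a markdown document into (frontmatter, body).

    The closing delimiter must be a line consisting solely of '---'.
    """
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != '---':
        raise ValueError('Missing YAML frontmatter')
    for i in range(1, len(lines)):
        if lines[i].strip() == '---':
            frontmatter = '\n'.join(lines[1:i])
            body = '\n'.join(lines[i + 1 :]).strip()
            return frontmatter, body
    raise ValueError('Unclosed YAML frontmatter')


# The input from the review comment: the description contains '---'.
doc = '---\nname: my-skill\ndescription: Long---description\n---\nBody text'
frontmatter, body = split_frontmatter(doc)
```

With this input the description line survives intact in `frontmatter`, and `body` holds only the text after the real closing delimiter.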
Comment on lines +198 to +201
async def for_run(self, ctx: RunContext[AgentDepsT]) -> Skills[AgentDepsT]:
    """Return a fresh copy with empty loaded-skills state."""
    clone: Skills[AgentDepsT] = Skills(skills=self.skills)
    return clone
🚩 for_run clone shares skills list reference: meta-tool binding depends on get_*() re-extraction

The for_run method at src/pydantic_harness/skills.py:200 creates a clone via Skills(skills=self.skills), sharing the same skills list by reference. The critical design question is whether get_toolset() is re-called on the clone after for_run. The AbstractCapability.for_run docstring says it is "Called once per run, before get_*() re-extraction", which implies get_toolset() is indeed re-invoked on the clone. If so, the meta-tools (_search_skills, _load_skill, _unload_skill) would be bound methods of the clone, and _loaded_skill_names mutations during a run would correctly affect the same instance that prepare_tools checks. This is correct under the documented lifecycle. However, if any future pydantic-ai version changes the re-extraction behavior, this would silently break — the meta-tools would mutate the original instance's state while prepare_tools reads from the clone's state, making skill loading appear to have no effect.
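The binding concern above can be reproduced with a small stand-alone sketch. `CapabilitySketch` and `get_tools` are hypothetical stand-ins for the `Skills` capability and `get_toolset()`; the sketch only illustrates how bound methods follow the instance they were extracted from, and makes no claim about pydantic-ai's actual lifecycle.

```python
from collections.abc import Callable


class CapabilitySketch:
    """Hypothetical capability with per-instance loaded-skill state."""

    def __init__(self, skills: list[str]) -> None:
        self.skills = skills  # shared by reference across clones
        self.loaded: set[str] = set()  # per-instance state

    def for_run(self) -> 'CapabilitySketch':
        # Fresh instance with empty loaded state, same skills list.
        return CapabilitySketch(skills=self.skills)

    def load_skill(self, name: str) -> None:
        self.loaded.add(name)

    def get_tools(self) -> dict[str, Callable[[str], None]]:
        # The returned callable is a bound method of *this* instance.
        return {'load_skill': self.load_skill}


original = CapabilitySketch(['pdf', 'sql'])
stale_tools = original.get_tools()  # extracted before for_run
clone = original.for_run()
fresh_tools = clone.get_tools()  # re-extracted on the clone

# Calling the re-extracted tool mutates the clone, which is the instance
# that would also decide tool visibility during the run.
fresh_tools['load_skill']('pdf')
```

If a framework kept using `stale_tools` instead, the call would land on `original.loaded` while visibility checks read `clone.loaded`, so loading a skill would appear to do nothing.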

@DouweM
Contributor Author

DouweM commented Apr 10, 2026

Originally posted by @DouweM in #133 comment (PR was recreated)

Audit vs prior art: Skills

Worth adding now:

  • agentskills.io tolerance: ignore unknown frontmatter keys instead of failing
  • unload_skill(name) tool to free context window
  • Fuzzy/word-boundary search in search_skills

Follow-up opportunities:

  • Remote skill registries (git-based, like VStorm)
  • Dependency resolution between skills
  • Resource/script separation (agentskills.io pattern)

@DouweM DouweM marked this pull request as draft April 10, 2026 15:13
@mtessar

mtessar commented Apr 12, 2026

@DouweM I am excited to see skills being added. Thanks for working on it!

…212)

* Split skills module into package to match project conventions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Move skills tests to tests/_skills/ to match project conventions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@adtyavrdhn adtyavrdhn self-assigned this Apr 15, 2026
@dergachoff

Do I understand correctly that dynamic loading/unloading skills breaks cache?

@DouweM
Contributor Author

DouweM commented Apr 23, 2026

It's clear that the people want skills support :)

From discussing on Slack (join, thread), it's also clear that people mean different things when they say that: some are really looking for full filesystem/shell user-provided skills support, but many primarily want "programmatic skills" that are defined code-side (and could be loaded from a file or a DB, but not necessarily the user's own/sandbox FS).

Check out the Slack thread if you have opinions. We're meeting next week with a couple of champions from the Slack thread to make sure we build the right thing, and don't get distracted by agentskills.io if we don't have to.

@DouweM DouweM added this to the 2026-05 milestone Apr 23, 2026

Development

Successfully merging this pull request may close these issues.

Skills capability

4 participants