Add FileSystem and Shell capabilities by DouweM · Pull Request #177 · pydantic/pydantic-ai-harness

DouweM · 2026-04-10T01:02:25Z

Summary

Add FileSystem capability with read_file, write_file, edit_file, list_directory, and search_files tools, scoped to a configurable root directory with path traversal prevention and glob-based allow/deny filtering
Add Shell capability with run_command tool using anyio-based subprocess execution, timeout support, output truncation, and command allow/deny lists
Both are AbstractCapability subclasses that provide tools via get_toolset() using FunctionToolset
66 tests with 100% code coverage across both capabilities

Closes #25, closes #26.

Test plan

All 66 tests pass (pytest)
100% code coverage (coverage run -m pytest && coverage report)
Type checking passes (pyright strict mode, 0 errors)
Lint passes (ruff check and ruff format --check)
Both asyncio and trio backends tested for Shell

🤖 Generated with Claude Code

Implement two AbstractCapability subclasses for agent file system access and shell command execution, with comprehensive configuration, security controls, and 100% test coverage. FileSystem provides read_file, write_file, edit_file, list_directory, and search_files tools scoped to a configurable root directory with path traversal prevention and glob-based allow/deny filtering. Shell provides a run_command tool with anyio-based subprocess execution, timeout support, output truncation, and command allow/deny lists. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…rsist_cwd FileSystem: - Add create_directory(path) tool that creates directories with parents - Add find_files(pattern, path) tool for glob-based file name search Shell: - Separate stdout and stderr in output with [stdout]/[stderr] labels - Add persist_cwd parameter to track cd commands across calls All new features have 100% test coverage (113 tests total). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…GC warnings - Add `await proc.aclose()` in finally block to properly clean up subprocess resources after command execution - Add pytest filterwarnings to suppress Python 3.10-specific asyncio subprocess transport warnings (ResourceWarning, PytestUnraisableExceptionWarning) that occur during GC even after proper cleanup. Fixed in CPython 3.11+. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

devin-ai-integration

Devin Review found 3 potential issues.

View 6 additional findings in Devin Review.

devin-ai-integration · 2026-04-10T01:09:23Z

+        for file_path in files:
+            if not file_path.is_file():
+                continue
+            # Skip hidden files/directories
+            try:
+                rel_parts = file_path.relative_to(self._root).parts
+            except ValueError:  # pragma: no cover
+                continue
+            if any(part.startswith('.') for part in rel_parts):
+                continue
+            try:
+                raw = file_path.read_bytes()
+            except OSError:
+                continue
+            # Skip binary files
+            if b'\x00' in raw[:8192]:
+                continue
+            text = raw.decode('utf-8', errors='replace')
+            rel_path = str(file_path.relative_to(self._root))
+            for line_num, line in enumerate(text.splitlines(), start=1):
+                if compiled.search(line):
+                    results.append(f'{rel_path}:{line_num}:{line}')


🔴 search_files, find_files, and list_directory bypass denied_patterns/allowed_patterns access controls

search_files, find_files, and list_directory only call check_access() on the top-level directory path argument (e.g., '.'), not on the individual files they iterate over. This means an agent can read the contents of files matching denied_patterns (or not matching allowed_patterns) by using search_files to grep through them, find_files to discover them, or list_directory to see them listed.

Concrete exploit with denied_patterns=['*.env']

With FileSystem(root_dir=tmp, denied_patterns=['*.env']), calling read_file('secret.env') is correctly blocked. However:

search_files('API_KEY') reads all files via rglob('*') at filesystem.py:244 without checking access, exposing the content of secret.env

find_files('*.env') lists secret.env via resolved.glob(pattern) at filesystem.py:302 without checking access

list_directory('.') reveals secret.env in the listing at filesystem.py:215 without filtering

Each file's relative path should be checked with check_access() before reading or listing.

Prompt for agents

The search_files, find_files, and list_directory methods all iterate over files without checking each individual file against the denied_patterns/allowed_patterns access controls. Only the top-level directory path is checked via safe_resolve/check_access. To fix this, add a per-file check_access call in each method's iteration loop. For search_files (filesystem.py:246-267), before reading each file, compute the relative path and call self.check_access(rel_path), catching PermissionError to skip denied files. Apply the same pattern to find_files (filesystem.py:302-308) and list_directory (filesystem.py:215-224). The check_access method already works with relative path strings and fnmatch, so the fix is straightforward: compute rel_path = str(file_path.relative_to(self._root)) and wrap the file processing in a try/except PermissionError block, or pre-check with check_access before processing.

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration · 2026-04-10T01:09:26Z

+        try:
+            tokens = shlex.split(command)
+        except ValueError:
+            # If shlex can't parse it, fall through and let the shell handle it
+            return
+        if not tokens:
+            return
+        executable = tokens[0]
+
+        if self.denied_commands and executable in self.denied_commands:
+            raise PermissionError(f'Command {executable!r} is denied.')
+        if self.allowed_commands and executable not in self.allowed_commands:
+            raise PermissionError(f'Command {executable!r} is not in the allowed list.')


🔴 check_command only validates first token, allowing shell metacharacter bypass of allow/deny lists

check_command uses shlex.split() to tokenize the command and only checks the first token against allowed_commands/denied_commands. Since shlex.split does not recognize shell metacharacters (;, &&, ||, |) as command separators, an attacker can chain a denied command after an allowed one. For example, with denied_commands=['rm'], the command echo hello; rm -rf / has first token 'echo' (since shlex produces ['echo', 'hello;', 'rm', '-rf', '/']), bypassing the deny check while the shell executes both echo and rm.

Prompt for agents

The check_command method in shell.py:79-91 only validates the first token from shlex.split, but shlex does not split on shell metacharacters like ;, &&, ||, |, backticks, or $(). This means denied commands can be trivially executed by chaining them: 'echo hi; rm -rf /' passes the check because the first token is 'echo'. Possible approaches to fix: 1. Before shlex parsing, split the command on shell metacharacters (;, &&, ||, |) and validate each sub-command's first token independently. 2. Use shlex with punctuation_chars=True to properly tokenize shell operators, then identify and validate each command segment. 3. Document this as a known limitation and note that allow/deny lists provide best-effort protection only against direct invocation, not shell-level command chaining. Approach 1 or 2 would involve iterating over all command segments in the pipeline/chain and checking each executable against the allow/deny lists.

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration · 2026-04-10T01:09:32Z

+        elif target.startswith('~'):
+            new_cwd = Path.home() / target[2:]  # skip "~/"


🟡 _update_cwd incorrectly handles ~username paths by stripping first 2 chars instead of checking for ~/

In _update_cwd at shell.py:133-134, the elif target.startswith('~') branch handles all tilde paths that aren't exactly '~' by doing target[2:] (comment says "skip ~/"). This is only correct for paths starting with ~/. For a path like ~a, target[2:] yields '', causing Path.home() / '' = Path.home(), so cd ~a incorrectly changes to the home directory if it exists. For ~user/path, target[2:] yields er/path, producing a wrong path. The condition should be target.startswith('~/') to only handle home-relative paths.

Suggested change

elif target.startswith('~'):

new_cwd = Path.home() / target[2:] # skip "~/"

elif target.startswith('~/'):

new_cwd = Path.home() / target[2:] # skip "~/"

Was this helpful? React with 👍 or 👎 to provide feedback.

DouweM · 2026-04-10T15:07:16Z

Originally posted by @DouweM in #139 comment (PR was recreated)

Audit vs prior art: FileSystem + Shell

Worth adding now:

create_directory(path) tool
find_files(pattern, path) glob tool (distinct from content search)
Shell stderr separation (labeled stdout vs stderr in output)
Persistent working directory tracking for Shell

Follow-up opportunities:

Docker/sandbox support (Environment abstraction (protocol-based file/shell/resource ops) #52 Environment), file watching

DEENUU1 · 2026-04-11T10:21:25Z

I see that file editing now relies on text replacement. In pydantic-ai-backend, we also support hashline - https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/hashline.py

Here's an article on the topic: https://blog.can.ac/2026/02/12/the-harness-problem/

DouweM and others added 3 commits April 2, 2026 05:28

DouweM requested review from Kludex, adtyavrdhn, dmontagu, dsfaccini and samuelcolvin as code owners April 10, 2026 01:02

devin-ai-integration Bot reviewed Apr 10, 2026

View reviewed changes

DouweM removed request for Kludex, adtyavrdhn, dmontagu, dsfaccini and samuelcolvin April 10, 2026 15:12

DouweM marked this pull request as draft April 10, 2026 15:12

DouweM added this to the 2026-06 milestone Apr 23, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add FileSystem and Shell capabilities#177

Add FileSystem and Shell capabilities#177
DouweM wants to merge 3 commits intomainfrom
capability/filesystem-shell

DouweM commented Apr 10, 2026 •

edited

Loading

Uh oh!

devin-ai-integration Bot left a comment

Uh oh!

devin-ai-integration Bot Apr 10, 2026

Uh oh!

devin-ai-integration Bot Apr 10, 2026

Uh oh!

devin-ai-integration Bot Apr 10, 2026

Uh oh!

DouweM commented Apr 10, 2026

Uh oh!

DEENUU1 commented Apr 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		elif target.startswith('~'):
		new_cwd = Path.home() / target[2:] # skip "~/"

Conversation

DouweM commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

DouweM commented Apr 10, 2026

Audit vs prior art: FileSystem + Shell

Uh oh!

DEENUU1 commented Apr 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

DouweM commented Apr 10, 2026 •

edited

Loading