Skip to content

Add FileSystem and Shell capabilities#177

Draft
DouweM wants to merge 3 commits intomainfrom
capability/filesystem-shell
Draft

Add FileSystem and Shell capabilities#177
DouweM wants to merge 3 commits intomainfrom
capability/filesystem-shell

Conversation

@DouweM
Copy link
Copy Markdown
Contributor

@DouweM DouweM commented Apr 10, 2026

Summary

  • Add FileSystem capability with read_file, write_file, edit_file, list_directory, and search_files tools, scoped to a configurable root directory with path traversal prevention and glob-based allow/deny filtering
  • Add Shell capability with run_command tool using anyio-based subprocess execution, timeout support, output truncation, and command allow/deny lists
  • Both are AbstractCapability subclasses that provide tools via get_toolset() using FunctionToolset
  • 66 tests with 100% code coverage across both capabilities

Closes #25, closes #26.

Test plan

  • All 66 tests pass (pytest)
  • 100% code coverage (coverage run -m pytest && coverage report)
  • Type checking passes (pyright strict mode, 0 errors)
  • Lint passes (ruff check and ruff format --check)
  • Both asyncio and trio backends tested for Shell

🤖 Generated with Claude Code

DouweM and others added 3 commits April 2, 2026 05:28
Implement two AbstractCapability subclasses for agent file system access
and shell command execution, with comprehensive configuration, security
controls, and 100% test coverage.

FileSystem provides read_file, write_file, edit_file, list_directory, and
search_files tools scoped to a configurable root directory with path
traversal prevention and glob-based allow/deny filtering.

Shell provides a run_command tool with anyio-based subprocess execution,
timeout support, output truncation, and command allow/deny lists.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rsist_cwd

FileSystem:
- Add create_directory(path) tool that creates directories with parents
- Add find_files(pattern, path) tool for glob-based file name search

Shell:
- Separate stdout and stderr in output with [stdout]/[stderr] labels
- Add persist_cwd parameter to track cd commands across calls

All new features have 100% test coverage (113 tests total).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…GC warnings

- Add `await proc.aclose()` in finally block to properly clean up subprocess
  resources after command execution
- Add pytest filterwarnings to suppress Python 3.10-specific asyncio subprocess
  transport warnings (ResourceWarning, PytestUnraisableExceptionWarning) that
  occur during GC even after proper cleanup. Fixed in CPython 3.11+.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@devin-ai-integration devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 3 potential issues.

View 6 additional findings in Devin Review.

Open in Devin Review

Comment on lines +246 to +267
for file_path in files:
if not file_path.is_file():
continue
# Skip hidden files/directories
try:
rel_parts = file_path.relative_to(self._root).parts
except ValueError: # pragma: no cover
continue
if any(part.startswith('.') for part in rel_parts):
continue
try:
raw = file_path.read_bytes()
except OSError:
continue
# Skip binary files
if b'\x00' in raw[:8192]:
continue
text = raw.decode('utf-8', errors='replace')
rel_path = str(file_path.relative_to(self._root))
for line_num, line in enumerate(text.splitlines(), start=1):
if compiled.search(line):
results.append(f'{rel_path}:{line_num}:{line}')
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 search_files, find_files, and list_directory bypass denied_patterns/allowed_patterns access controls

search_files, find_files, and list_directory only call check_access() on the top-level directory path argument (e.g., '.'), not on the individual files they iterate over. This means an agent can read the contents of files matching denied_patterns (or not matching allowed_patterns) by using search_files to grep through them, find_files to discover them, or list_directory to see them listed.

Concrete exploit with denied_patterns=['*.env']

With FileSystem(root_dir=tmp, denied_patterns=['*.env']), calling read_file('secret.env') is correctly blocked. However:

  • search_files('API_KEY') reads all files via rglob('*') at filesystem.py:244 without checking access, exposing the content of secret.env
  • find_files('*.env') lists secret.env via resolved.glob(pattern) at filesystem.py:302 without checking access
  • list_directory('.') reveals secret.env in the listing at filesystem.py:215 without filtering

Each file's relative path should be checked with check_access() before reading or listing.

Prompt for agents
The search_files, find_files, and list_directory methods all iterate over files without checking each individual file against the denied_patterns/allowed_patterns access controls. Only the top-level directory path is checked via safe_resolve/check_access.

To fix this, add a per-file check_access call in each method's iteration loop. For search_files (filesystem.py:246-267), before reading each file, compute the relative path and call self.check_access(rel_path), catching PermissionError to skip denied files. Apply the same pattern to find_files (filesystem.py:302-308) and list_directory (filesystem.py:215-224).

The check_access method already works with relative path strings and fnmatch, so the fix is straightforward: compute rel_path = str(file_path.relative_to(self._root)) and wrap the file processing in a try/except PermissionError block, or pre-check with check_access before processing.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment on lines +79 to +91
try:
tokens = shlex.split(command)
except ValueError:
# If shlex can't parse it, fall through and let the shell handle it
return
if not tokens:
return
executable = tokens[0]

if self.denied_commands and executable in self.denied_commands:
raise PermissionError(f'Command {executable!r} is denied.')
if self.allowed_commands and executable not in self.allowed_commands:
raise PermissionError(f'Command {executable!r} is not in the allowed list.')
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 check_command only validates first token, allowing shell metacharacter bypass of allow/deny lists

check_command uses shlex.split() to tokenize the command and only checks the first token against allowed_commands/denied_commands. Since shlex.split does not recognize shell metacharacters (;, &&, ||, |) as command separators, an attacker can chain a denied command after an allowed one. For example, with denied_commands=['rm'], the command echo hello; rm -rf / has first token 'echo' (since shlex produces ['echo', 'hello;', 'rm', '-rf', '/']), bypassing the deny check while the shell executes both echo and rm.

Prompt for agents
The check_command method in shell.py:79-91 only validates the first token from shlex.split, but shlex does not split on shell metacharacters like ;, &&, ||, |, backticks, or $(). This means denied commands can be trivially executed by chaining them: 'echo hi; rm -rf /' passes the check because the first token is 'echo'.

Possible approaches to fix:
1. Before shlex parsing, split the command on shell metacharacters (;, &&, ||, |) and validate each sub-command's first token independently.
2. Use shlex with punctuation_chars=True to properly tokenize shell operators, then identify and validate each command segment.
3. Document this as a known limitation and note that allow/deny lists provide best-effort protection only against direct invocation, not shell-level command chaining.

Approach 1 or 2 would involve iterating over all command segments in the pipeline/chain and checking each executable against the allow/deny lists.
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

Comment on lines +133 to +134
elif target.startswith('~'):
new_cwd = Path.home() / target[2:] # skip "~/"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 _update_cwd incorrectly handles ~username paths by stripping first 2 chars instead of checking for ~/

In _update_cwd at shell.py:133-134, the elif target.startswith('~') branch handles all tilde paths that aren't exactly '~' by doing target[2:] (comment says "skip ~/"). This is only correct for paths starting with ~/. For a path like ~a, target[2:] yields '', causing Path.home() / '' = Path.home(), so cd ~a incorrectly changes to the home directory if it exists. For ~user/path, target[2:] yields er/path, producing a wrong path. The condition should be target.startswith('~/') to only handle home-relative paths.

Suggested change
elif target.startswith('~'):
new_cwd = Path.home() / target[2:] # skip "~/"
elif target.startswith('~/'):
new_cwd = Path.home() / target[2:] # skip "~/"
Open in Devin Review

Was this helpful? React with 👍 or 👎 to provide feedback.

@DouweM
Copy link
Copy Markdown
Contributor Author

DouweM commented Apr 10, 2026

Originally posted by @DouweM in #139 comment (PR was recreated)

Audit vs prior art: FileSystem + Shell

Worth adding now:

  • create_directory(path) tool
  • find_files(pattern, path) glob tool (distinct from content search)
  • Shell stderr separation (labeled stdout vs stderr in output)
  • Persistent working directory tracking for Shell

Follow-up opportunities:

@DouweM DouweM marked this pull request as draft April 10, 2026 15:12
@DEENUU1
Copy link
Copy Markdown

DEENUU1 commented Apr 11, 2026

I see that file editing now relies on text replacement. In pydantic-ai-backend, we also support hashline - https://github.com/vstorm-co/pydantic-ai-backend/blob/main/src/pydantic_ai_backends/hashline.py

Here's an article on the topic: https://blog.can.ac/2026/02/12/the-harness-problem/

@DouweM DouweM added this to the 2026-06 milestone Apr 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Shell capability FileSystem capability

2 participants