feat: Workspace toolkit + GitHubContextProvider (HITL, edit-via-PR) by ashpreetbedi · Pull Request #7683 · agno-agi/agno

ashpreetbedi · 2026-04-25T16:09:14Z

Summary

This PR ships two related primitives that release together as 2.6.2:

Workspace — a polished local-machine toolkit at libs/agno/agno/tools/workspace.py that gives agents read/list/search/write/edit/move/delete/shell access to a root directory tree, with destructive operations gated by Agno's built-in human-in-the-loop confirmation by default.
GitHubContextProvider — a context provider at libs/agno/agno/context/github/ that gives agents navigation + edit-via-PR access to a Git repository hosted on GitHub. Two-tool surface (query_<id> / update_<id>) with read/write sub-agents bound to different toolsets, mirroring DatabaseContextProvider.

They ship together because the update_<id> path of GitHubContextProvider builds on Workspace for file ops inside per-task worktrees.

1. Workspace toolkit

This closes the polish gap with the Claude Agent SDK tab on the homepage. The new homepage snippet for the Agno SDK tab now reads:

tools=[Workspace(
    ".",
    allowed=["read", "list", "search"],
    confirm=["write", "edit", "delete", "shell"],
)]

allowed= and confirm= are mutually exclusive partitions of short aliases: names in allowed auto-pass, names in confirm require approval. Aliases translate internally to descriptive method names (read_file, write_file, …) so the LLM tool spec stays self-explanatory.

What's in it

8 method pairs (sync + async): read_file, list_files, search_content, write_file, edit_file, move_file, delete_file, run_command.
Line-numbered read_file output (cat -n style) — chunk reads preserve actual file line numbers so the agent can chain into edit_file precisely.
edit_file with replace_all=False — unique-or-fail by default; flip for renames.
Rich list_files entries ({path, type, size}) with optional recursive=True, max_depth=3 for tree-style exploration.
Atomic write_file (writes .tmp then os.replace).
run_command strips ANSI before tailing output (saves tokens on npm/pip/etc.).
Path-scoping enforcement via inherited Toolkit._check_path. Explicitly NOT a process sandbox — the docstring spells this out and points at Daytona for untrusted code.
Opt-in require_read_before_write=True blocks writes/edits/moves/deletes on existing files until the agent has read them this session. Catches the "agent hallucinated the file's contents" bug class.
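The path-scoping and read-before-write items above can be sketched as follows. This is a minimal illustration, not the code in workspace.py: the class name ScopedFiles and the helper _resolve_in_root are hypothetical, and the real check lives in the inherited Toolkit._check_path. Treating a file as "known" after the agent writes it is also an assumption of this sketch.

```python
from pathlib import Path
from typing import Set


class ScopedFiles:
    """Hypothetical helper: path scoping plus an opt-in read-before-write guard."""

    def __init__(self, root: str = ".", require_read_before_write: bool = False) -> None:
        self.root = Path(root).resolve()
        self.require_read_before_write = require_read_before_write
        # Resolved paths the agent has read during this session.
        self._read_paths: Set[Path] = set()

    def _resolve_in_root(self, relative: str) -> Path:
        # Resolve symlinks and ".." first, then require the result to sit under root.
        target = (self.root / relative).resolve()
        if target != self.root and self.root not in target.parents:
            raise PermissionError(f"{relative!r} resolves outside {self.root}")
        return target

    def read_file(self, relative: str) -> str:
        path = self._resolve_in_root(relative)
        content = path.read_text(encoding="utf-8")
        self._read_paths.add(path)
        return content

    def write_file(self, relative: str, content: str) -> None:
        path = self._resolve_in_root(relative)
        # Only existing files are guarded; newly created files skip the check.
        if self.require_read_before_write and path.exists() and path not in self._read_paths:
            raise PermissionError(f"read {relative!r} before overwriting it")
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content, encoding="utf-8")
        # Assumption of this sketch: after writing, the agent knows the contents.
        self._read_paths.add(path)
```

As the PR notes, this kind of check scopes paths only. It does not isolate the process, so shell commands can still reach the network and environment variables.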

Permission model

A name in allowed runs silently.
A name in confirm requires user approval (Agno's requires_confirmation_tools HITL — surfaces as approval cards in the AgentOS UI; pause/resume in code).
A name in neither isn't registered — the LLM doesn't see it.
A name in both raises ValueError.
Default (both None) = reads in allowed, writes in confirm.
Type guard: confirm=True or allowed="read" raise a clear TypeError instead of confusing alias errors.
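The permission rules above could be resolved roughly like this. It is a sketch under assumptions: the function name resolve_partitions, the alias-to-method mapping, and the handling of a single None argument (treated as an empty list) are illustrative guesses, not the toolkit's actual _resolve_partitions.

```python
from typing import Dict, List, Optional, Tuple

# Assumed mapping from short aliases to the method names listed in this PR.
ALIASES: Dict[str, str] = {
    "read": "read_file",
    "list": "list_files",
    "search": "search_content",
    "write": "write_file",
    "edit": "edit_file",
    "move": "move_file",
    "delete": "delete_file",
    "shell": "run_command",
}
READ_ALIASES = ["read", "list", "search"]
WRITE_ALIASES = ["write", "edit", "move", "delete", "shell"]


def resolve_partitions(
    allowed: Optional[List[str]] = None,
    confirm: Optional[List[str]] = None,
) -> Tuple[List[str], List[str]]:
    """Return (auto-pass method names, approval-required method names)."""
    if allowed is None and confirm is None:
        # Default: reads run silently, writes require confirmation.
        allowed, confirm = list(READ_ALIASES), list(WRITE_ALIASES)
    allowed = [] if allowed is None else allowed
    confirm = [] if confirm is None else confirm

    # Type guard first, so confirm=True or allowed="read" fail clearly
    # instead of being iterated character by character.
    for name, value in (("allowed", allowed), ("confirm", confirm)):
        if not isinstance(value, list):
            raise TypeError(f"{name} must be a list of aliases, got {type(value).__name__}")

    unknown = [alias for alias in allowed + confirm if alias not in ALIASES]
    if unknown:
        raise ValueError(f"unknown alias(es): {unknown}")

    overlap = sorted(set(allowed) & set(confirm))
    if overlap:
        raise ValueError(f"alias(es) in both allowed and confirm: {overlap}")

    # Aliases in neither list are simply never registered.
    return [ALIASES[a] for a in allowed], [ALIASES[a] for a in confirm]
```

Running the type check before the alias lookup is what keeps a string argument from producing a per-character "unknown alias" message.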

Cookbook + docs reorg

cookbook/91_tools/workspace_tools/ — basic_usage.py, with_confirmation.py, README, TEST_LOG.
cookbook/99_docs/home/ → cookbook/99_docs/index/ (homepage SDK tabs preserved with git mv); if __name__ block dropped — the runnable path is now fastapi dev <file>.py.
cookbook/99_docs/first-agent/workbench.py: runnable copy of the new "Build Your First Agent" snippet (19 lines, Workspace(".") with default safe partition + enable_agentic_memory=True).

2. GitHubContextProvider

Read + write access to a GitHub repository cloned into a local working directory (typically a Docker volume). Mirrors DatabaseContextProvider's read/write split:

query_<id>(question) — natural-language reads against the checkout. Backed by a sub-agent with read-only Workspace + GitReadTools (log/diff/show/blame/branches).
update_<id>(instruction) — natural-language writes that end in a pull request. Backed by a sub-agent with full Workspace + GitWriteTools (status/add/commit/push/gh pr create/gh pr view), scoped to a per-session worktree.

What's in it

libs/agno/agno/context/github/provider.py — GitHubContextProvider with asetup() (clone or fetch+pull, idempotent), aclose() (best-effort worktree cleanup), status() (<repo>@<branch>:<sha>), and the read/write sub-agent split.
libs/agno/agno/context/github/tools.py — GitReadTools (5 read ops) and GitWriteTools (6 write/PR ops).
libs/agno/agno/context/github/__init__.py — exports the provider, the toolkits, and the default instructions.
libs/agno/tests/unit/context/test_github_provider.py — 32 tests, no network. Uses a local bare git repo as the fake remote and stubs gh via a shell script.
cookbook/12_context/12_github.py — read demo against agno-agi/agno (always runs); write demo opens a PR if GITHUB_WRITE_REPO is set.

Key behaviors

Worktree-per-task (Coda's pattern): each session gets its own <workdir>/worktrees/<task>/ worktree on a <prefix>/<task> branch. Parallel update_<id> calls in different sessions don't collide. Cached by run_context.session_id; ephemeral teardown when no session_id is propagated.
Branch-prefix safety: every git push and gh pr create validates the active branch matches <pr_branch_prefix>/*. Default prefix is agno. The agent cannot push to the default branch — that's the tripwire that keeps this safe to expose.
PAT auth: github_token kwarg or GITHUB_TOKEN env. Embedded into the clone URL so subsequent pushes inherit auth without further setup. gh calls receive the token via GH_TOKEN / GITHUB_TOKEN in the subprocess env.
gh CLI dependency: create_pull_request and pr_status shell out to gh. Missing gh is a clear runtime error, not a silent constructor failure.
mode=tools returns the read-only flat surface only (Workspace(allowed=READ_TOOLS) + GitReadTools). Writes need the per-session worktree, so they require mode=default (two-tool surface).
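The branch-prefix check described above could look something like the sketch below. The function names are hypothetical and this is not the code in GitWriteTools; the only external call is the standard git rev-parse --abbrev-ref HEAD command for reading the active branch.

```python
import subprocess


def current_branch(cwd: str) -> str:
    """Return the branch checked out in cwd (requires git on PATH)."""
    result = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        cwd=cwd,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def ensure_prefixed(branch: str, prefix: str = "agno") -> str:
    """Raise unless branch looks like <prefix>/<something>; return it otherwise."""
    expected = f"{prefix}/"
    # Require the slash so "agnostic/x" does not pass, and a non-empty task part.
    if not branch.startswith(expected) or len(branch) == len(expected):
        raise PermissionError(
            f"refusing to push {branch!r}: branch must match {expected}*"
        )
    return branch
```

A push or PR helper would call ensure_prefixed(current_branch(worktree)) before shelling out, so a checkout on the default branch fails before any network operation.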

Out of scope (deferred)

Multi-repo per provider instance. One provider = one repo.
GitHub App / OAuth auth. PAT only.
Direct push to default branch (refused by branch-prefix safety).
Forking workflow. Assumes push access to the source repo.
Mid-task conflict resolution.
Updating Scout's contexts.py — caller-side, separate.
Updating docs/ — separate doc PR for 2.6.2.

Type of change

FileTools, ShellTools, and LocalFileSystemTools are untouched — no deprecation, no breaking change for existing users.

Checklist

Code complies with style guidelines
Ran format/validation scripts (./scripts/format.sh and ./scripts/validate.sh)
Self-review completed
Documentation updated (comments, docstrings)
Examples and guides: cookbook examples included for both Workspace and GitHubContextProvider; homepage + first-agent docs snippets updated (docs repo PR is separate)
Tested in clean environment (cookbook smoke runs against gpt-5.4 end-to-end)
Tests added/updated (61 new workspace tests + 32 new GitHub provider tests; existing FileTools / context tests still green)

Duplicate and AI-Generated PR Check

I have searched existing open pull requests and confirmed that no other PR already addresses this issue
If a similar PR exists, I have explained below why this PR is a better approach
Check if this PR was entirely AI-generated (by Copilot, Claude Code, Cursor, etc.)

Additional Notes

Verified end-to-end:

All workspace + context unit tests green (164 total: 61 workspace, 32 github, 71 context regression).
./scripts/format.sh and ./scripts/validate.sh clean for new files (pre-existing mypy errors in unrelated slack.py / drive.py / sql.py are out of scope).
cookbook/91_tools/workspace_tools/basic_usage.py ran end-to-end against gpt-5.4 — agent called read_file → write_file → list_files.
cookbook/91_tools/workspace_tools/with_confirmation.py ran end-to-end — read_file ran silent, edit_file paused, requirement.confirm() resumed cleanly, edit applied.

Out of scope (Workspace, captured in .context/workspace_tools_design.md for future sprints):

multi_edit — atomic batched edits to one file (Claude Code parity).
Background processes — run_command(background=True) + command_output(handle) + kill_command(handle).
Dynamic confirm predicate (callable instead of static list) — touches Agno's Function layer.
additional_roots=[...] — extend root scope to multiple dirs.
LSP integration (Mastra has it; heavyweight).
Lifecycle hooks (PreToolUse / PostToolUse / Stop / SessionStart / SessionEnd).

Docs repo: the matching changes to index.mdx (Agno SDK tab) and first-agent.mdx (Build Your First Agent guide) live in the docs repo and will be committed there separately.

Add AgentOS examples used in the docs welcome page, one per framework — Agno SDK, Claude Agent SDK, DSPy, LangGraph. Each file is self-contained and runnable via `python <file>.py`. Also add the required demo deps (claude-agent-sdk, langgraph, langchain-openai, dspy) to libs/agno/pyproject.toml so the demo venv can run all four. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Adds a polished local-machine toolkit alongside FileTools, ShellTools, and LocalFileSystemTools. Combines read/write/edit/delete/search/shell into one cohesive surface, sandboxed to a configurable base_dir, with destructive operations requiring user approval by default through Toolkit's existing requires_confirmation_tools mechanism.
The constructor exposes mutually-exclusive allowed_tools (auto-pass) and confirm_tools (approval-required) lists. The snippet on the docs homepage now mirrors the Claude Agent SDK tab's polish (visible sandbox + visible permission story) in Agno-native form.
- New: libs/agno/agno/tools/workspace.py (7 sync + 7 async methods)
- New: libs/agno/tests/unit/tools/test_workspace.py (38 tests, all passing)
- New: cookbook/91_tools/workspace_tools/ (basic_usage, with_confirmation, README, TEST_LOG)
- Switched: cookbook/99_docs/home/agno_agent.py to WorkspaceTools
- FileTools, ShellTools, LocalFileSystemTools untouched (no deprecation)
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

… init
Snippet-readability pass on the toolkit landed in the previous commit:
- Class: WorkspaceTools -> Workspace
- First param: base_dir -> root (positional, so Workspace(".") works)
- allowed_tools / confirm_tools now accept short aliases instead of full method names. The toolkit translates aliases -> method names internally, so the LLM tool spec keeps the descriptive names (read_file, write_file, list_files, ...); only the developer-facing snippet is shortened.
Aliases: read, list, search, write, edit, delete, shell. Method names (and signatures) are unchanged.
Net effect on the homepage snippet:
tools=[Workspace(
    ".",
    allowed_tools=["read", "list", "search"],
    confirm_tools=["write", "edit", "delete", "shell"],
)]
Tests updated; 41 unit tests green (added 3 covering positional root, default-cwd root, and full-name-rejected-as-alias).
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

…st, recursive, move, atomic, ANSI, read-before-write)
Tier 1: behavior tightening (no API additions):
- read_file output is now line-numbered (cat -n style). Numbers reflect actual file lines so the agent can chain into edit_file precisely. Chunked reads preserve correct numbering relative to the source file.
- edit_file gains replace_all=False. Default behavior (unique-or-fail) is unchanged; replace_all=True replaces every occurrence and reports the count. Multi-match error message now mentions the flag.
- list_files entries are now {path, type, size} dicts instead of bare paths so the LLM can decide what to read without a second call.
- run_command strips ANSI escape sequences (color codes, cursor moves) from output before tailing, which saves tokens on npm/pip/etc. CLI output.
- read_file "file too long" hints now mention search_content as an alternative to start_line/end_line chunking.
Tier 2: small additions:
- list_files gains recursive=False and max_depth=3 params. tree -L semantics: max_depth=1 returns only the immediate children of the search root.
- New move_file / amove_file (alias "move"). Both src and dst sandbox-checked. Refuses to clobber existing dst unless overwrite=True. Added to WRITE_TOOLS.
- write_file is now atomic: writes to <file>.tmp, then os.replace into place. A crash mid-write can't leave a partially-written target.
- New opt-in require_read_before_write=False constructor flag. When True, blocks write/edit/move/delete on existing files until the agent has read them this session. Catches the "agent hallucinated the file's contents" bug class. Newly-created files skip the check.
Tests: 41 → 59 (+18 new, ~6 updated for new output shapes). FileTools tests unchanged (no regressions).
Homepage snippet unchanged at 4 confirm tools: `move` is documented in the toolkit README and discoverable, but not enabled by default in the demo.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
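Several of the behaviors in this commit are small enough to sketch as standalone helpers. The functions below are illustrative only: the names, signatures, and error messages are assumptions rather than the toolkit's methods, and the ANSI pattern covers CSI sequences (colors, cursor moves) but not every escape form.

```python
import os
import re
from pathlib import Path
from typing import Optional, Tuple

# CSI escape sequences: ESC [ parameters intermediates final-byte.
ANSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def number_lines(text: str, start_line: int = 1, end_line: Optional[int] = None) -> str:
    """Render a chunk cat -n style, numbered relative to the source file."""
    lines = text.splitlines()
    start = max(start_line, 1)
    end = len(lines) if end_line is None else min(end_line, len(lines))
    chunk = lines[start - 1 : end]
    return "\n".join(f"{n:>6}\t{line}" for n, line in enumerate(chunk, start=start))


def edit_text(text: str, old: str, new: str, replace_all: bool = False) -> Tuple[str, int]:
    """Replace old with new; fail unless the match is unique or replace_all is set."""
    if not old:
        raise ValueError("old string must not be empty")
    count = text.count(old)
    if count == 0:
        raise ValueError("old string not found")
    if count > 1 and not replace_all:
        raise ValueError(f"{count} matches; pass replace_all=True to replace all")
    return text.replace(old, new), count


def strip_ansi(text: str) -> str:
    """Drop color codes and cursor moves before the output is tailed."""
    return ANSI_RE.sub("", text)


def atomic_write(path: Path, content: str) -> None:
    """Write to <file>.tmp, then swap it into place with os.replace."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(content, encoding="utf-8")
    os.replace(tmp, path)
```

Because os.replace swaps the file in a single step on the same filesystem, a reader sees either the old contents or the new ones, never a half-written file.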

…ed agno_assist
Reorganize cookbook/99_docs/ to mirror the docs site layout:
- cookbook/99_docs/home/ → cookbook/99_docs/index/ (homepage SDK tabs)
- new: cookbook/99_docs/first-agent/agno_assist.py, a runnable copy of the snippet shown in docs/first-agent.mdx, switched from MCPTools to Workspace(".") and trimmed to 18 lines (no markdown=True). Defaults give the agent the full read/list/search/write/edit/move/delete/shell surface with safe-by-default confirmation on destructive ops.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

- Rename file: cookbook/99_docs/first-agent/agno_assist.py → workbench.py
- Rename agent: "Agno Assist" → "Workbench" (pairs with Workspace as the thing that *works in* the workspace)
- Add enable_agentic_memory=True so the agent can remember things across sessions, not just within the conversation history window.
19 lines total. Mirrors the updated docs/first-agent.mdx (in the docs repo).
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

"Sandbox" implies OS-level isolation guarantees (process namespaces, syscall filtering, network blocking) that Workspace doesn't deliver. What it actually does is path-scoping: paths must resolve under root, shell commands run with cwd=root. The agent can still read env vars, hit the network via shell, and use anything else the host process can. Removed "sandbox/sandboxed" wording across the toolkit docstring, README, test comments, design doc, and the cookbook workbench file. Where the word remains, it now appears as an explicit disclaimer pointing to Daytona for real sandboxing. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Removes the trailing main-gate from all four index/ tab files. The runnable path is now 'fastapi dev <file>.py', matching the convention already used in cookbook/99_docs/first-agent/workbench.py and the docs first-agent guide. 3 fewer lines per file, snippet ends cleanly at 'app = agent_os.get_app()'. Mirrors the matching change in docs/index.mdx. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Drops the _tools suffix on the two partition kwargs:
Workspace.allowed_tools=[...] → Workspace.allowed=[...]
Workspace.confirm_tools=[...] → Workspace.confirm=[...]
The strings inside the lists ("read", "write", ...) already make the meaning self-evident, and the shorter names save real estate on the homepage snippet where every character matters. Our partition semantics already differ from Claude SDK's allowed_tools (theirs = whitelist; ours = auto-pass subset mutually exclusive with confirm), so the rename also reduces a false-friend collision.
Adds an isinstance(list) check in _resolve_partitions so confirm=True or allowed="read" raise a clear TypeError instead of a confusing alias error (e.g. "unknown alias 'r', 'e', 'a', 'd'" from set('read')).
Sweep: workspace.py constructor + docstring + error messages, test_workspace (38 occurrences + 2 new TypeError tests, 61 tests total), README, basic_usage, TEST_LOG, design doc, and the homepage cookbook agno_agent.py.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Was bare 'set'; now Set[Path] for proper type checking. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Mirrors DatabaseContextProvider's read/write split with two sub-agents:
- query_<id>: read-only Workspace + GitReadTools over the main checkout
- update_<id>: full Workspace + GitWriteTools scoped to a per-session worktree at <workdir>/worktrees/<task>/
Every write task ends in a PR on a <prefix>/<task> branch the human reviews and merges. Branch-prefix safety on git_push and gh pr create keeps the agent from pushing to the default branch.
- Auth: PAT via github_token kwarg or GITHUB_TOKEN env, embedded into the clone URL so subsequent pushes inherit it; gh receives it via GH_TOKEN/GITHUB_TOKEN in the subprocess env.
- Worktree-per-task (Coda's pattern): parallel update_<id> calls in different sessions don't collide. Cached by run_context.session_id (synthetic ephemeral fallback for stateless callers).
- gh CLI dependency: create_pull_request + pr_status shell out to gh. Missing gh is a clear runtime error, not a constructor failure.
- mode=tools returns the read-only flat surface only; writes need the sub-agent split, so they require mode=default (two-tool surface).
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
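The auth flow in this commit might be sketched as below. All three function names are hypothetical, and the x-access-token username in the URL is an assumption of the sketch; the PR only states that the token is embedded into the clone URL. That gh reads GH_TOKEN and GITHUB_TOKEN from its environment is standard gh CLI behavior.

```python
import os
from typing import Dict, Mapping, Optional


def resolve_token(
    github_token: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Kwarg wins, GITHUB_TOKEN env is the fallback, None when neither is set."""
    env = os.environ if env is None else env
    return github_token or env.get("GITHUB_TOKEN") or None


def authenticated_clone_url(repo: str, token: Optional[str]) -> str:
    """Build an HTTPS clone URL for owner/name, embedding the token if present."""
    base = f"github.com/{repo}.git"
    if not token:
        return f"https://{base}"
    # Assumed username; the token is what authenticates the request.
    return f"https://x-access-token:{token}@{base}"


def gh_env(token: Optional[str], base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Copy the environment and add the token under both names gh reads."""
    env = dict(os.environ if base_env is None else base_env)
    if token:
        env["GH_TOKEN"] = token
        env["GITHUB_TOKEN"] = token
    return env
```

A caller would pass gh_env(token) as the env argument of subprocess.run when invoking gh, so the token never has to be written to global git or gh configuration.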

32 unit tests, no network. The fake remote is a bare git repo seeded with one commit on main; gh is stubbed via a shell script that echoes a fake PR URL.
Coverage:
- repo URL parsing (owner/name, https, .git suffix)
- task name sanitization (path/branch-safe, length cap, uuid fallback)
- asetup: fresh clone, idempotent, dirty-tree warning without failure
- token sourcing: kwarg wins, env fallback, stays None when neither set
- mode resolution: default returns query+update, tools returns read-only
- worktree lifecycle: created on first update, reused per session_id, ephemeral teardown when session_id is absent, cleaned up by aclose
- branch-prefix safety: git_push and create_pull_request both refuse non-prefixed branches
- path-escape: GitWriteTools rejects task_workdir outside workdir
- per-call author identity stamps Agno <[email protected]> without global git config
- gh integration: create_pull_request returns the URL, pr_status returns parsed JSON, missing gh produces a clear error
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
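The task-name sanitization and worktree layout exercised by these tests could be approximated like this. It is a sketch with assumed details: the function names, the 40-character cap, the lowercase-and-hyphen slug rule, and the 8-character uuid fallback are illustrative choices, not values taken from the provider.

```python
import re
import uuid
from pathlib import Path
from typing import Tuple


def sanitize_task_name(raw: str, max_len: int = 40) -> str:
    """Reduce free text to a slug that is safe as a directory and branch name."""
    # Collapse every run of non-alphanumerics (slashes, dots, spaces) to one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", raw.lower()).strip("-")
    slug = slug[:max_len].strip("-")
    # Fall back to a random id when nothing usable survives.
    return slug or uuid.uuid4().hex[:8]


def worktree_location(workdir: Path, task: str, prefix: str = "agno") -> Tuple[Path, str]:
    """Return (<workdir>/worktrees/<task>, <prefix>/<task>) for a sanitized task."""
    name = sanitize_task_name(task)
    return workdir / "worktrees" / name, f"{prefix}/{name}"
```

Stripping separators before building the path means a task string such as "../../etc" cannot place the worktree outside <workdir>/worktrees/.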

Single combined file (matches the per-provider numbered convention in 12_context/, e.g. 05_slack.py): the read prompt always runs against agno-agi/agno; the write prompt runs only when GITHUB_WRITE_REPO is set, so a casual python cookbook/12_context/12_github.py never opens a PR against a real repo. Also updates the cookbook README to list the new provider and demo. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

ashpreetbedi and others added 12 commits April 23, 2026 13:54

WIP: epitaxy pre-switch from feat/docs-snippets

0660bee

Merge branch 'main' into feat/docs-snippets

9584bec

chore: tighten Set[Path] type annotation on Workspace._read_paths

e0cec30

Was bare 'set'; now Set[Path] for proper type checking. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

ashpreetbedi requested a review from a team as a code owner April 25, 2026 16:09

ashpreetbedi and others added 3 commits April 26, 2026 10:56

ashpreetbedi changed the title ~~feat: add Workspace toolkit for local file + shell ops with HITL~~ feat: Workspace toolkit + GitHubContextProvider (HITL, edit-via-PR) Apr 26, 2026

ashpreetbedi wants to merge 15 commits into main from feat/workspace-tools
