feat: Workspace toolkit + GitHubContextProvider (HITL, edit-via-PR)#7683
Open
ashpreetbedi wants to merge 15 commits into main
Add AgentOS examples used in the docs welcome page, one per framework — Agno SDK, Claude Agent SDK, DSPy, LangGraph. Each file is self-contained and runnable via `python <file>.py`. Also add the required demo deps (claude-agent-sdk, langgraph, langchain-openai, dspy) to libs/agno/pyproject.toml so the demo venv can run all four. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Adds a polished local-machine toolkit alongside FileTools, ShellTools, and LocalFileSystemTools. Combines read/write/edit/delete/search/shell into one cohesive surface, sandboxed to a configurable base_dir, with destructive operations requiring user approval by default through Toolkit's existing requires_confirmation_tools mechanism. The constructor exposes mutually-exclusive allowed_tools (auto-pass) and confirm_tools (approval-required) lists. The snippet on the docs homepage now mirrors the Claude Agent SDK tab's polish (visible sandbox + visible permission story) in Agno-native form.
- New: libs/agno/agno/tools/workspace.py (7 sync + 7 async methods)
- New: libs/agno/tests/unit/tools/test_workspace.py (38 tests, all passing)
- New: cookbook/91_tools/workspace_tools/ (basic_usage, with_confirmation, README, TEST_LOG)
- Switched: cookbook/99_docs/home/agno_agent.py to WorkspaceTools
- FileTools, ShellTools, LocalFileSystemTools untouched (no deprecation)
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
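Confinement to base_dir can be pictured as a resolve-then-compare check. The following is a minimal sketch under stated assumptions: `resolve_under_root` is a hypothetical helper written for illustration, not the toolkit's own check, and it only scopes paths (it does nothing to isolate processes or the network).

```python
from pathlib import Path


def resolve_under_root(root: Path, candidate: str) -> Path:
    """Return the absolute path for `candidate`, or raise if it leaves `root`.

    Hypothetical helper for illustration; not the toolkit's actual check.
    """
    base = root.resolve()
    # Joining an absolute candidate discards `base`, so it is caught below too.
    target = (base / candidate).resolve()
    try:
        target.relative_to(base)  # raises ValueError when target is outside base
    except ValueError:
        raise ValueError(f"{candidate!r} resolves outside {base}") from None
    return target
```

Resolving before comparing means `..` segments and symlinks are collapsed first, so a path such as `../outside.txt` is rejected rather than silently followed.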
… init
Snippet-readability pass on the toolkit landed in the previous commit:
- Class: WorkspaceTools -> Workspace
- First param: base_dir -> root (positional, so Workspace(".") works)
- allowed_tools / confirm_tools now accept short aliases instead of full
method names. The toolkit translates aliases -> method names internally,
so the LLM tool spec keeps the descriptive names (read_file, write_file,
list_files, ...) — only the developer-facing snippet is shortened.
Aliases: read, list, search, write, edit, delete, shell.
Method names (and signatures) are unchanged.
Net effect on the homepage snippet:
    tools=[Workspace(
        ".",
        allowed_tools=["read", "list", "search"],
        confirm_tools=["write", "edit", "delete", "shell"],
    )]
Tests updated; 41 unit tests green (added 3 covering positional root,
default-cwd root, and full-name-rejected-as-alias).
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…st, recursive, move, atomic, ANSI, read-before-write)
Tier 1 — behavior tightening (no API additions):
- read_file output is now line-numbered (cat -n style). Numbers reflect actual
file lines so the agent can chain into edit_file precisely. Chunked reads
preserve correct numbering relative to the source file.
- edit_file gains replace_all=False. Default behavior (unique-or-fail) is
unchanged; replace_all=True replaces every occurrence and reports the count.
Multi-match error message now mentions the flag.
- list_files entries are now {path, type, size} dicts instead of bare paths
so the LLM can decide what to read without a second call.
- run_command strips ANSI escape sequences (color codes, cursor moves) from
output before tailing — saves tokens on npm/pip/etc. CLI output.
- read_file "file too long" hints now mention search_content as an
alternative to start_line/end_line chunking.
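The unique-or-fail rule for edit_file can be sketched on plain strings. This is an illustration only: `edit_text` and its error wording are hypothetical, and only the `replace_all` flag name comes from the notes above.

```python
from typing import Tuple


def edit_text(content: str, old: str, new: str, replace_all: bool = False) -> Tuple[str, int]:
    """Replace `old` with `new`; return (new_content, replacements_made).

    Hypothetical helper that mirrors the described behavior on a string.
    """
    if not old:
        raise ValueError("old text must be non-empty")
    count = content.count(old)  # non-overlapping matches, same as str.replace
    if count == 0:
        raise ValueError("old text not found")
    if count > 1 and not replace_all:
        # Unique-or-fail: refuse an ambiguous edit and mention the flag.
        raise ValueError(
            f"old text matches {count} times; pass replace_all=True to replace every occurrence"
        )
    return content.replace(old, new), count
```

Failing on multiple matches by default keeps an edit from landing on the wrong occurrence; the caller opts in to bulk replacement explicitly, for example for a rename.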
Tier 2 — small additions:
- list_files gains recursive=False and max_depth=3 params. tree -L semantics:
max_depth=1 returns only the immediate children of the search root.
- New move_file / amove_file (alias "move"). Both src and dst sandbox-checked.
Refuses to clobber existing dst unless overwrite=True. Added to WRITE_TOOLS.
- write_file is now atomic — writes to <file>.tmp, then os.replace into place.
A crash mid-write can't leave a partially-written target.
- New opt-in require_read_before_write=False constructor flag. When True,
blocks write/edit/move/delete on existing files until the agent has read
them this session. Catches the "agent hallucinated the file's contents"
bug class. Newly-created files skip the check.
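The write-then-rename pattern behind the atomic write_file can be sketched as follows. `atomic_write` is a hypothetical helper, and the `fsync` call and cleanup-on-error are assumptions added for the sketch; only the `<file>.tmp` name and `os.replace` come from the notes above.

```python
import os
from pathlib import Path


def atomic_write(path: Path, data: str) -> None:
    """Write `data` to a sibling temp file, then rename it over `path`."""
    tmp = path.with_name(path.name + ".tmp")  # "<file>.tmp"
    try:
        with open(tmp, "w", encoding="utf-8") as fh:
            fh.write(data)
            fh.flush()
            os.fsync(fh.fileno())  # assumption: push bytes to disk before the rename
        # The rename swaps the target in one step when both paths
        # live on the same filesystem.
        os.replace(tmp, path)
    except BaseException:
        tmp.unlink(missing_ok=True)  # do not leave a stray temp file behind
        raise
```

Readers of `path` see either the old contents or the new contents, never a half-written file, because the target is only touched by the final rename.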
Tests: 41 → 59 (+18 new, ~6 updated for new output shapes). FileTools tests
unchanged (no regressions).
Homepage snippet unchanged at 4 confirm tools — `move` is documented in the
toolkit README and discoverable, but not enabled by default in the demo.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
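The opt-in read-before-write check from this commit can be sketched as a small session-scoped tracker. `ReadBeforeWriteGuard` and its method names are hypothetical; the sketch only shows the rule that existing files must be read first while newly created files pass.

```python
from pathlib import Path
from typing import Set


class ReadBeforeWriteGuard:
    """Tracks which files were read this session (hypothetical helper)."""

    def __init__(self) -> None:
        self._read: Set[Path] = set()

    def mark_read(self, path: Path) -> None:
        # Store resolved paths so "./a.txt" and "a.txt" count as the same file.
        self._read.add(path.resolve())

    def check_write(self, path: Path) -> None:
        resolved = path.resolve()
        # A file that does not exist yet has nothing to read, so it skips the check.
        if resolved.exists() and resolved not in self._read:
            raise PermissionError(f"read {path} before modifying it")
```

A write, edit, move, or delete would call `check_write` first, and a successful read would call `mark_read`.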
…ed agno_assist
Reorganize cookbook/99_docs/ to mirror the docs site layout:
- cookbook/99_docs/home/ → cookbook/99_docs/index/ (homepage SDK tabs)
- new: cookbook/99_docs/first-agent/agno_assist.py — runnable copy of the
snippet shown in docs/first-agent.mdx, switched from MCPTools to
Workspace(".") and trimmed to 18 lines (no markdown=True). Defaults give
the agent the full read/list/search/write/edit/move/delete/shell surface
with safe-by-default confirmation on destructive ops.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
- Rename file: cookbook/99_docs/first-agent/agno_assist.py → workbench.py
- Rename agent: "Agno Assist" → "Workbench" (pairs with Workspace as the thing that *works in* the workspace)
- Add enable_agentic_memory=True so the agent can remember things across sessions, not just within the conversation history window.
19 lines total. Mirrors the updated docs/first-agent.mdx (in the docs repo).
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
"Sandbox" implies OS-level isolation guarantees (process namespaces, syscall filtering, network blocking) that Workspace doesn't deliver. What it actually does is path-scoping: paths must resolve under root, shell commands run with cwd=root. The agent can still read env vars, hit the network via shell, and use anything else the host process can. Removed "sandbox/sandboxed" wording across the toolkit docstring, README, test comments, design doc, and the cookbook workbench file. Where the word remains, it now appears as an explicit disclaimer pointing to Daytona for real sandboxing. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Removes the trailing main-gate from all four index/ tab files. The runnable path is now 'fastapi dev <file>.py', matching the convention already used in cookbook/99_docs/first-agent/workbench.py and the docs first-agent guide. 3 fewer lines per file, snippet ends cleanly at 'app = agent_os.get_app()'. Mirrors the matching change in docs/index.mdx. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Drops the _tools suffix on the two partition kwargs:
Workspace.allowed_tools=[...] → Workspace.allowed=[...]
Workspace.confirm_tools=[...] → Workspace.confirm=[...]
The strings inside the lists ("read", "write", ...) already make the meaning
self-evident, and the shorter names save real estate on the homepage snippet
where every character matters. Our partition semantics already differ from
Claude SDK's allowed_tools (theirs = whitelist; ours = auto-pass subset
mutually exclusive with confirm), so the rename also reduces a false-friend
collision.
Adds an isinstance(list) check in _resolve_partitions so confirm=True or
allowed="read" raise a clear TypeError instead of a confusing alias error
(e.g. "unknown alias 'r', 'e', 'a', 'd'" from set('read')).
Sweep: workspace.py constructor + docstring + error messages, test_workspace
(38 occurrences + 2 new TypeError tests, 61 tests total), README, basic_usage,
TEST_LOG, design doc, and the homepage cookbook agno_agent.py.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
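The partition rules described in this commit (list-only arguments, short aliases, no overlap) can be sketched as below. This is not the toolkit's `_resolve_partitions`: the alias-to-method pairing is inferred from the names listed in this PR, and the default split and the handling of a single `None` side are assumptions.

```python
from typing import Dict, List, Optional, Set, Tuple

# Alias -> method name; pairing inferred from the names listed in this PR.
ALIASES: Dict[str, str] = {
    "read": "read_file",
    "list": "list_files",
    "search": "search_content",
    "write": "write_file",
    "edit": "edit_file",
    "move": "move_file",
    "delete": "delete_file",
    "shell": "run_command",
}
READ_ALIASES = ["read", "list", "search"]
WRITE_ALIASES = ["write", "edit", "move", "delete", "shell"]  # assumed default split


def resolve_partitions(
    allowed: Optional[List[str]] = None,
    confirm: Optional[List[str]] = None,
) -> Tuple[Set[str], Set[str]]:
    """Return (auto-pass method names, approval-required method names)."""
    for name, value in (("allowed", allowed), ("confirm", confirm)):
        # Reject strings and booleans early: iterating "read" would otherwise
        # produce per-character "unknown alias" errors.
        if value is not None and not isinstance(value, list):
            raise TypeError(f"{name} must be a list of aliases, got {type(value).__name__}")
    if allowed is None and confirm is None:
        allowed, confirm = READ_ALIASES, WRITE_ALIASES
    allowed = allowed or []  # assumption: a lone None side means "empty"
    confirm = confirm or []
    unknown = [a for a in allowed + confirm if a not in ALIASES]
    if unknown:
        raise ValueError(f"unknown alias(es): {unknown}")
    overlap = set(allowed) & set(confirm)
    if overlap:
        raise ValueError(f"aliases in both allowed and confirm: {sorted(overlap)}")
    return {ALIASES[a] for a in allowed}, {ALIASES[a] for a in confirm}
```

Checking the type before anything else is what turns `allowed="read"` into one clear `TypeError`, and full method names such as `read_file` are rejected as unknown aliases.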
Was bare 'set'; now Set[Path] for proper type checking. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Mirrors DatabaseContextProvider's read/write split with two sub-agents:
- query_<id>: read-only Workspace + GitReadTools over the main checkout
- update_<id>: full Workspace + GitWriteTools scoped to a per-session worktree at <workdir>/worktrees/<task>/
Every write task ends in a PR on a <prefix>/<task> branch the human reviews and merges. Branch-prefix safety on git_push and gh pr create keeps the agent from pushing to the default branch.
- Auth: PAT via github_token kwarg or GITHUB_TOKEN env, embedded into the clone URL so subsequent pushes inherit it; gh receives it via GH_TOKEN/GITHUB_TOKEN in the subprocess env.
- Worktree-per-task (Coda's pattern): parallel update_<id> calls in different sessions don't collide. Cached by run_context.session_id (synthetic ephemeral fallback for stateless callers).
- gh CLI dependency: create_pull_request + pr_status shell out to gh. Missing gh is a clear runtime error, not a constructor failure.
- mode=tools returns the read-only flat surface only; writes need the sub-agent split, so they require mode=default (two-tool surface).
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
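The branch-prefix safety check can be sketched as a guard that runs before any push or PR creation. `check_branch_allowed` is a hypothetical name and the error type is an assumption; the `agno` default prefix is the one named in the PR summary.

```python
def check_branch_allowed(branch: str, prefix: str = "agno") -> None:
    """Raise unless `branch` looks like "<prefix>/<task>".

    Hypothetical guard for illustration; not the provider's actual code.
    """
    expected = prefix + "/"
    # The trailing slash matters: "agnostic/x" must not pass for prefix "agno",
    # and a bare "agno/" has no task name.
    if not branch.startswith(expected) or branch == expected:
        raise PermissionError(
            f"refusing to push {branch!r}: branch must match {expected}<task>"
        )
```

Because the default branch never carries the prefix, the same check that enforces the naming convention also blocks direct pushes to `main`.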
32 unit tests, no network. The fake remote is a bare git repo seeded with one commit on main; gh is stubbed via a shell script that echoes a fake PR URL. Coverage:
- repo URL parsing (owner/name, https, .git suffix)
- task name sanitization (path/branch-safe, length cap, uuid fallback)
- asetup: fresh clone, idempotent, dirty-tree warning without failure
- token sourcing: kwarg wins, env fallback, stays None when neither set
- mode resolution: default returns query+update, tools returns read-only
- worktree lifecycle: created on first update, reused per session_id, ephemeral teardown when session_id is absent, cleaned up by aclose
- branch-prefix safety: git_push and create_pull_request both refuse non-prefixed branches
- path-escape: GitWriteTools rejects task_workdir outside workdir
- per-call author identity stamps Agno <[email protected]> without global git config
- gh integration: create_pull_request returns the URL, pr_status returns parsed JSON, missing gh produces a clear error
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
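The task-name sanitization covered by these tests (path/branch-safe, length cap, uuid fallback) could look roughly like this. The helper name, the allowed character set, and the 48-character cap are all assumptions; the PR text does not state the real rules.

```python
import re
import uuid

MAX_TASK_LEN = 48  # assumption: the real cap is not stated in the PR


def sanitize_task_name(raw: str) -> str:
    """Turn free text into a slug usable as a directory and branch segment."""
    # Collapse every run of characters outside [a-z0-9] into a single dash,
    # which also removes "/" and ".." path tricks.
    slug = re.sub(r"[^a-z0-9]+", "-", raw.lower()).strip("-")
    slug = slug[:MAX_TASK_LEN].rstrip("-")
    # Fall back to a random name when nothing usable is left.
    return slug or f"task-{uuid.uuid4().hex[:8]}"
```

For example, `sanitize_task_name("Fix README typos!")` yields `fix-readme-typos`, and an empty or punctuation-only input yields a `task-` name with a random suffix.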
Single combined file (matches the per-provider numbered convention in 12_context/, e.g. 05_slack.py): the read prompt always runs against agno-agi/agno; the write prompt runs only when GITHUB_WRITE_REPO is set, so a casual python cookbook/12_context/12_github.py never opens a PR against a real repo. Also updates the cookbook README to list the new provider and demo. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Summary
This PR ships two related primitives that release together as 2.6.2:
- `Workspace`: a polished local-machine toolkit at `libs/agno/agno/tools/workspace.py` that gives agents read/list/search/write/edit/move/delete/shell access to a `root` directory tree, with destructive operations gated by Agno's built-in human-in-the-loop confirmation by default.
- `GitHubContextProvider`: a context provider at `libs/agno/agno/context/github/` that gives agents navigation + edit-via-PR access to a Git repository hosted on GitHub. Two-tool surface (`query_<id>` / `update_<id>`) with read/write sub-agents bound to different toolsets, mirroring `DatabaseContextProvider`.
They ship together because the `update_<id>` path of `GitHubContextProvider` builds on `Workspace` for file ops inside per-task worktrees.
1. Workspace toolkit
This closes the polish gap with the Claude Agent SDK tab on the homepage. The new homepage snippet for the Agno SDK tab now passes `Workspace(".", allowed=["read", "list", "search"], confirm=["write", "edit", "delete", "shell"])` as the agent's tools.
Mutually exclusive `allowed=` / `confirm=` partitions of short aliases: auto-pass vs. approval-required. Aliases translate internally to descriptive method names (`read_file`, `write_file`, …) so the LLM tool spec stays self-explanatory.
What's in it
- Eight tools: `read_file`, `list_files`, `search_content`, `write_file`, `edit_file`, `move_file`, `delete_file`, `run_command`.
- Line-numbered `read_file` output (`cat -n` style): chunked reads preserve actual file line numbers so the agent can chain into `edit_file` precisely.
- `edit_file` with `replace_all=False`: unique-or-fail by default; flip for renames.
- Structured `list_files` entries (`{path, type, size}`) with optional `recursive=True, max_depth=3` for tree-style exploration.
- Atomic `write_file` (writes `.tmp` then `os.replace`).
- `run_command` strips ANSI before tailing output (saves tokens on `npm`/`pip`/etc.).
- Path scoping via `Toolkit._check_path`. Explicitly NOT a process sandbox: the docstring spells this out and points at Daytona for untrusted code.
- Opt-in `require_read_before_write=True` blocks writes/edits/moves/deletes on existing files until the agent has read them this session. Catches the "agent hallucinated the file's contents" bug class.
Permission model
- `allowed` runs silently.
- `confirm` requires user approval (Agno's `requires_confirmation_tools` HITL: surfaces as approval cards in the AgentOS UI; pause/resume in code).
- An alias listed in both partitions raises `ValueError`.
- Default (both `None`) = reads in `allowed`, writes in `confirm`.
- `confirm=True` or `allowed="read"` raise a clear `TypeError` instead of confusing alias errors.
Cookbook + docs reorg
- `cookbook/91_tools/workspace_tools/`: `basic_usage.py`, `with_confirmation.py`, README, TEST_LOG.
- `cookbook/99_docs/home/` → `cookbook/99_docs/index/` (homepage SDK tabs preserved with `git mv`); `if __name__` block dropped, so the runnable path is now `fastapi dev <file>.py`.
- `cookbook/99_docs/first-agent/workbench.py`: runnable copy of the new "Build Your First Agent" snippet (18 lines, `Workspace(".")` with default safe partition + `enable_agentic_memory=True`).
2. GitHubContextProvider
Read + write access to a GitHub repository cloned into a local working directory (typically a Docker volume). Mirrors `DatabaseContextProvider`'s read/write split:
- `query_<id>(question)`: natural-language reads against the checkout. Backed by a sub-agent with read-only `Workspace` + `GitReadTools` (log/diff/show/blame/branches).
- `update_<id>(instruction)`: natural-language writes that end in a pull request. Backed by a sub-agent with full `Workspace` + `GitWriteTools` (status/add/commit/push/`gh pr create`/`gh pr view`), scoped to a per-session worktree.
What's in it
- `libs/agno/agno/context/github/provider.py`: `GitHubContextProvider` with `asetup()` (clone or fetch+pull, idempotent), `aclose()` (best-effort worktree cleanup), `status()` (`<repo>@<branch>:<sha>`), and the read/write sub-agent split.
- `libs/agno/agno/context/github/tools.py`: `GitReadTools` (5 read ops) and `GitWriteTools` (6 write/PR ops).
- `libs/agno/agno/context/github/__init__.py`: exports the provider, the toolkits, and the default instructions.
- `libs/agno/tests/unit/context/test_github_provider.py`: 32 tests, no network. Uses a local bare git repo as the fake remote and stubs `gh` via a shell script.
- `cookbook/12_context/12_github.py`: read demo against `agno-agi/agno` (always runs); write demo opens a PR if `GITHUB_WRITE_REPO` is set.
Key behaviors
- Worktree-per-task: each write task gets a `<workdir>/worktrees/<task>/` worktree on a `<prefix>/<task>` branch. Parallel `update_<id>` calls in different sessions don't collide. Cached by `run_context.session_id`; ephemeral teardown when no `session_id` is propagated.
- Branch-prefix safety: `git push` and `gh pr create` validate that the active branch matches `<pr_branch_prefix>/*`. Default prefix is `agno`. The agent cannot push to the default branch; that's the tripwire that keeps this safe to expose.
- Auth: PAT via `github_token` kwarg or `GITHUB_TOKEN` env. Embedded into the clone URL so subsequent pushes inherit auth without further setup. `gh` calls receive the token via `GH_TOKEN`/`GITHUB_TOKEN` in the subprocess env.
- `gh` CLI dependency: `create_pull_request` and `pr_status` shell out to `gh`. Missing `gh` is a clear runtime error, not a silent constructor failure.
- `mode=tools` returns the read-only flat surface only (`Workspace(allowed=READ_TOOLS)` + `GitReadTools`). Writes need the per-session worktree, so they require `mode=default` (two-tool surface).
Out of scope (deferred)
- `contexts.py`: caller-side, separate.
- `docs/`: separate doc PR for 2.6.2.
Type of change
- `FileTools`, `ShellTools`, and `LocalFileSystemTools` are untouched: no deprecation, no breaking change for existing users.
Checklist
- `./scripts/format.sh` and `./scripts/validate.sh`
- `gpt-5.4` end-to-end
Duplicate and AI-Generated PR Check
Additional Notes
Verified end-to-end:
- `./scripts/format.sh` and `./scripts/validate.sh` clean for new files (pre-existing mypy errors in unrelated `slack.py`/`drive.py`/`sql.py` are out of scope).
- `cookbook/91_tools/workspace_tools/basic_usage.py` ran end-to-end against `gpt-5.4`: agent called `read_file → write_file → list_files`.
- `cookbook/91_tools/workspace_tools/with_confirmation.py` ran end-to-end: `read_file` ran silent, `edit_file` paused, `requirement.confirm()` resumed cleanly, edit applied.
Out of scope (Workspace, captured in `.context/workspace_tools_design.md` for future sprints):
- `multi_edit`: atomic batched edits to one file (Claude Code parity).
- `run_command(background=True)` + `command_output(handle)` + `kill_command(handle)`.
- `confirm` predicate (callable instead of static list): touches Agno's `Function` layer.
- `additional_roots=[...]`: extend root scope to multiple dirs.
Docs repo: the matching changes to `index.mdx` (Agno SDK tab) and `first-agent.mdx` (Build Your First Agent guide) live in the docs repo and will be committed there separately.