Browser-resident AI agent. Lives in your Chromium side panel, uses your real session — not a headless cloud VM.
- Open-source alternative to OpenAI Operator and Claude Computer-use. Apache-2.0, no $200/month paywall, no waitlist, no cloud-side queue. Install the extension, point it at any provider you already pay for.
- Runs in your own browser, with your own sessions. Proprietary browser agents drive a remote Chromium against fresh, logged-out sessions — you exchange credentials, paste cookies, or do without authenticated tasks entirely. Browd runs the agent loop inside Chrome / Edge / Brave on your machine, against the sessions you're already logged into. GitHub, Gmail, LinkedIn, your dashboards — Browd sees them the same way you do.
- Bring your own keys, any provider. Models are configured per role (Planner / Navigator / Judge) and routed through OpenRouter or any OpenAI-compatible endpoint. Anthropic, Google, Meta, local — one extension, no vendor lock.
- Adapts the plan after every step, not once at the start. A replanner node decides continue-or-finish on each turn — if the page is different from what the planner expected, the plan is rewritten before the next tool call. Most browser-agent flows commit to a static plan and recover poorly when reality diverges.
You type something into the side panel — "apply to the first AI Engineer job on hh.ru with my resume", "open the page for the Shutterstock image of the dog", "check the LM Arena leaderboard for the top open-source model right now" — and a LangGraph.js Plan-and-Execute agent runs the task inside one of your own browser tabs, using whatever sessions you're already logged into. No headless replay, no cloud VM, no copy-paste of credentials.
Forked from Nanobrowser (Apache-2.0) and reshaped into a LangGraph.js Plan-and-Execute runtime:
- Unified LangGraph.js ReAct + replanner loop (default
agentMode='unified') - Plan-and-Execute StateGraph — planner emits structured
taskParameters(URLs / queries / names), each subgoal runs a focused ReAct step, replanner decides continue-or-finish - Tab isolation: agent works in its own
[Browd]-prefixed tab; user tabs visible as metadata only; cross-over only via the explicittake_over_user_tabaction - Coordinate clicking via grid-overlay screenshots, with a
hitl_click_atescape hatch forisTrusted=falseantibot walls - Untrusted-content wrap on every third-party page text reaching the LLM
- Provider-agnostic STT (Gemini / OpenRouter / Grok)
- Multi-provider Planner / Navigator / Judge routing via OpenRouter
Legacy Planner+Navigator pipeline is still selectable via Options → Agent Mode for fallback.
- Open the Browd listing on the Chrome Web Store.
- Click Add to Chrome (works in Chrome / Edge / Brave / Arc).
- Pin Browd to your toolbar, click the icon to open the side panel.
- Add your provider keys in Options → Models — any OpenAI-compatible endpoint works; OpenRouter is convenient for routing Anthropic / Google / Meta / local through one key.
- Download the latest zip from the releases page.
- Unzip it anywhere.
- Open the extensions page (
chrome://extensionsin Chrome/Edge/Brave). - Toggle Developer mode.
- Load unpacked → pick the unzipped folder.
- Pin Browd to your toolbar, click the icon to open the side panel.
- Add your provider keys in Options → Models — any OpenAI-compatible endpoint works; OpenRouter is convenient for routing Anthropic / Google / Meta / local through one key.
git clone https://github.com/wyddy7/browd
cd browd
pnpm install
pnpm buildThen load the dist/ directory as an unpacked extension (steps 3–7 above). pnpm dev runs a watch build; background and content-script changes still need an extension-card reload after each rebuild.
These are the constraints currently shipping. They are documented up front rather than buried, because the fixes are larger than the value:
- Token usage is high. A non-trivial multi-site task can run 400–700k input tokens under
visionMode='always'— each turn re-attaches a fresh screenshot at ~10–14k tokens, so usage compounds with turn count. The exact cost depends on your provider and model; Browd shows a live token ring so you can watch it as the task runs. - Hard
isTrusted=falseantibot walls. Any CDP / extension-driven click generatesisTrusted=falseMouseEvents. Hard-gated sites (LinkedIn/jobsfilters, some Cloudflare gates, Google Images result tiles) silently no-op those clicks. Browd detects the loop within three attempts and offershitl_click_at— the agent pauses and asks you to click the blocked element yourself, then continues. - Heavy-vision steps are slow. Processing a screenshot plus the state message can take 20–30 s per turn on vision-capable models. The side-panel TRACE row shows the in-flight LLM call with elapsed time, so you can see it working — but it is genuinely slow, not instant.
- Not a research tool. Browd is a browser-resident agent for concrete tasks on concrete pages, not a Deep Research / scraper substitute. For "synthesise information across N sites" Tavily + Playwright on the backend is typically cheaper and better. The positioning matters — using Browd for ten consecutive web searches is the expensive way to get a mediocre answer.
- Modal overlays on first page load. Cookie banners and sign-in prompts that cover fresh content can stall progress for a turn or two before the agent decides to dismiss them. The replanner usually recovers by trying a different URL or invoking a direct nav, but for first-impression sites it can add latency.
Local setup, repo layout, testing commands, and pull-request expectations live in CONTRIBUTING.md. The AI-agent contract for working in chrome-extension/src/background/agent/ is in CLAUDE.md.
Browd is derived from Nanobrowser, released under Apache-2.0. The Plan-and-Execute agent topology, LangGraph.js integration, and Chrome Web Store packaging path here diverge from upstream, but the inherited foundation — side-panel architecture, Planner+Navigator pipeline, untrusted-content wrap, and i18n scaffolding — is theirs.
The unified agent runtime uses LangGraph.js's createReactAgent and StateGraph. URL → markdown extraction goes through Jina Reader by default with a linkedom fallback.
Apache-2.0 — see LICENSE. Upstream copyright and license notices are preserved in this repository.
* The demo GIF at the top is sped up — the actual run took roughly three minutes. Agentic browser tasks are not instant; see Known limits for the honest performance picture.
