Browd

Browser-resident AI agent. Lives in your Chromium side panel, uses your real session — not a headless cloud VM.

Demo GIF*: Browd opening the LM Arena leaderboard, filtering to open-source models, and returning the top three.


Why Browd

  • Open-source alternative to OpenAI Operator and Claude Computer-use. Apache-2.0, no $200/month paywall, no waitlist, no cloud-side queue. Install the extension, point it at any provider you already pay for.
  • Runs in your own browser, with your own sessions. Proprietary browser agents drive a remote Chromium against fresh, logged-out sessions, so you either hand over credentials, paste cookies, or give up on authenticated tasks entirely. Browd runs the agent loop inside Chrome / Edge / Brave on your machine, against the sessions you're already logged into. GitHub, Gmail, LinkedIn, your dashboards: Browd sees them the same way you do.
  • Bring your own keys, any provider. Models are configured per role (Planner / Navigator / Judge) and routed through OpenRouter or any OpenAI-compatible endpoint. Anthropic, Google, Meta, local: one extension, no vendor lock-in.
  • Adapts the plan after every step, not once at the start. A replanner node decides continue-or-finish on each turn — if the page is different from what the planner expected, the plan is rewritten before the next tool call. Most browser-agent flows commit to a static plan and recover poorly when reality diverges.
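The continue-or-finish decision described in the last bullet can be pictured with a small loop. This is a minimal sketch in plain TypeScript, not Browd's LangGraph.js graph: every type and function name here (Decision, Observation, runPlanAndExecute) is hypothetical, and the real runtime is asynchronous and tool-driven.

```typescript
// Hypothetical types for illustration; none of these names come from the Browd codebase.
type Decision =
  | { kind: "continue"; plan: string[] } // possibly rewritten plan
  | { kind: "finish"; answer: string };

interface Observation {
  step: string;
  result: string;
}

type Executor = (step: string) => Observation;
type Replanner = (remaining: string[], history: Observation[]) => Decision;

function runPlanAndExecute(
  initialPlan: string[],
  execute: Executor,
  replan: Replanner,
  maxTurns = 20,
): { answer: string; history: Observation[] } {
  let plan = [...initialPlan];
  const history: Observation[] = [];

  for (let turn = 0; turn < maxTurns; turn++) {
    const next = plan[0];
    if (next === undefined) break; // nothing left to execute
    const rest = plan.slice(1);

    // Run exactly one step, then let the replanner look at what happened.
    history.push(execute(next));
    const decision = replan(rest, history);
    if (decision.kind === "finish") {
      return { answer: decision.answer, history };
    }
    // The plan is replaced before the next step, not only at the start.
    plan = decision.plan;
  }
  return { answer: "stopped without a final answer", history };
}
```

The point of the shape is that the replanner sees the latest observation before any further step runs, so an unexpected page (for example, a cookie banner) can insert or drop steps immediately.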

What Browd actually does

You type something into the side panel — "apply to the first AI Engineer job on hh.ru with my resume", "open the page for the Shutterstock image of the dog", "check the LM Arena leaderboard for the top open-source model right now" — and a LangGraph.js Plan-and-Execute agent runs the task inside one of your own browser tabs, using whatever sessions you're already logged into. No headless replay, no cloud VM, no copy-paste of credentials.

Forked from Nanobrowser (Apache-2.0) and reshaped into a LangGraph.js Plan-and-Execute runtime:

  • Unified LangGraph.js ReAct + replanner loop (default agentMode='unified')
  • Plan-and-Execute StateGraph — planner emits structured taskParameters (URLs / queries / names), each subgoal runs a focused ReAct step, replanner decides continue-or-finish
  • Tab isolation: agent works in its own [Browd]-prefixed tab; user tabs visible as metadata only; cross-over only via the explicit take_over_user_tab action
  • Coordinate clicking via grid-overlay screenshots, with a hitl_click_at escape hatch for isTrusted=false antibot walls
  • Untrusted-content wrap on every third-party page text reaching the LLM
  • Provider-agnostic STT (Gemini / OpenRouter / Grok)
  • Multi-provider Planner / Navigator / Judge routing via OpenRouter
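To make the coordinate-clicking bullet concrete, here is one way a grid label could be turned into a click point. It is a sketch under assumptions: the label format ("C4" meaning column C, row 4), the GridSpec and Viewport types, and the cellToPoint function are all invented for illustration and may not match how Browd draws or reads its overlay.

```typescript
interface Viewport {
  width: number; // CSS pixels
  height: number;
}

interface GridSpec {
  cols: number; // assumed to be at most 26 so one letter names a column
  rows: number;
}

// Convert a label such as "C4" into the centre of that grid cell.
function cellToPoint(
  label: string,
  grid: GridSpec,
  viewport: Viewport,
): { x: number; y: number } {
  const match = /^([A-Z])(\d+)$/.exec(label.trim().toUpperCase());
  if (!match) throw new Error(`Unrecognised grid label: ${label}`);

  const colLetter = match[1] ?? "";
  const rowDigits = match[2] ?? "";
  const col = colLetter.charCodeAt(0) - "A".charCodeAt(0); // zero-based column
  const row = Number(rowDigits) - 1; // zero-based row

  if (col >= grid.cols || row < 0 || row >= grid.rows) {
    throw new Error(`Grid label out of range: ${label}`);
  }

  const cellWidth = viewport.width / grid.cols;
  const cellHeight = viewport.height / grid.rows;
  return {
    x: Math.round((col + 0.5) * cellWidth),
    y: Math.round((row + 0.5) * cellHeight),
  };
}
```

On a 1200×800 viewport with a 12×8 grid, each cell is 100×100 pixels, so "C4" maps to the point (250, 350).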

Legacy Planner+Navigator pipeline is still selectable via Options → Agent Mode for fallback.
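The untrusted-content wrap in the feature list above can be sketched in a few lines. The delimiter strings and the wrapUntrusted function below are assumptions made for illustration; the actual markers Browd inherits from Nanobrowser may differ.

```typescript
// Hypothetical delimiters; the real wrapper tags are not specified in this README.
const UNTRUSTED_OPEN = "<untrusted_page_content>";
const UNTRUSTED_CLOSE = "</untrusted_page_content>";

function wrapUntrusted(pageText: string): string {
  // Remove any copies of the delimiters from the page text so a hostile page
  // cannot close the wrapper early and smuggle text outside it.
  const sanitised = pageText
    .split(UNTRUSTED_OPEN)
    .join("[removed tag]")
    .split(UNTRUSTED_CLOSE)
    .join("[removed tag]");
  return `${UNTRUSTED_OPEN}\n${sanitised}\n${UNTRUSTED_CLOSE}`;
}
```

The system prompt can then tell the model to treat anything between the two markers as data, never as instructions. This sketch only matches the exact delimiter strings; a production version would likely also handle case and whitespace variants.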

Install

From the Chrome Web Store (recommended)

  1. Open the Browd listing on the Chrome Web Store.
  2. Click Add to Chrome (works in Chrome / Edge / Brave / Arc).
  3. Pin Browd to your toolbar, click the icon to open the side panel.
  4. Add your provider keys in Options → Models — any OpenAI-compatible endpoint works; OpenRouter is convenient for routing Anthropic / Google / Meta / local through one key.

From a release (no build step)

  1. Download the latest zip from the releases page.
  2. Unzip it anywhere.
  3. Open the extensions page (chrome://extensions in Chrome/Edge/Brave).
  4. Toggle Developer mode.
  5. Load unpacked → pick the unzipped folder.
  6. Pin Browd to your toolbar, click the icon to open the side panel.
  7. Add your provider keys in Options → Models — any OpenAI-compatible endpoint works; OpenRouter is convenient for routing Anthropic / Google / Meta / local through one key.

From source (for development)

git clone https://github.com/wyddy7/browd
cd browd
pnpm install
pnpm build

Then load the dist/ directory as an unpacked extension (steps 3–7 of the release install above). pnpm dev runs a watch build; background and content-script changes still need an extension-card reload after each rebuild.
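For the provider-keys step in the install lists above, it may help to see what per-role routing to an OpenAI-compatible endpoint amounts to. The sketch below is illustrative only: the Role, ModelConfig, and buildChatRequest names, the model identifiers, and the local endpoint URL are placeholders, not Browd's settings schema. The only external facts relied on are OpenRouter's base URL and the standard /chat/completions request shape.

```typescript
type Role = "planner" | "navigator" | "judge";

interface ModelConfig {
  baseUrl: string; // any OpenAI-compatible endpoint
  apiKey: string;
  model: string; // placeholder identifiers below, not recommendations
}

// Hypothetical routing table: two roles via OpenRouter, one via a local server.
const exampleRouting: Record<Role, ModelConfig> = {
  planner: {
    baseUrl: "https://openrouter.ai/api/v1",
    apiKey: "YOUR_OPENROUTER_KEY",
    model: "provider-a/planner-model",
  },
  navigator: {
    baseUrl: "https://openrouter.ai/api/v1",
    apiKey: "YOUR_OPENROUTER_KEY",
    model: "provider-b/navigator-model",
  },
  judge: {
    baseUrl: "http://localhost:8080/v1", // assumed local OpenAI-compatible server
    apiKey: "not-needed-locally",
    model: "local-judge-model",
  },
};

// Build the URL and fetch() options for one chat-completion call.
function buildChatRequest(
  role: Role,
  routing: Record<Role, ModelConfig>,
  prompt: string,
) {
  const cfg = routing[role];
  return {
    url: `${cfg.baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${cfg.apiKey}`,
      },
      body: JSON.stringify({
        model: cfg.model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}
```

Because every role goes through the same request shape, swapping a provider is a change to one table entry rather than to the calling code.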

Known limits

These are the constraints currently shipping. They are documented up front rather than buried, because fixing them would currently cost more effort than the fixes would return:

  • Token usage is high. A non-trivial multi-site task can run 400–700k input tokens under visionMode='always' — each turn re-attaches a fresh screenshot at ~10–14k tokens, so usage compounds with turn count. The exact cost depends on your provider and model; Browd shows a live token ring so you can watch it as the task runs.
  • Hard isTrusted=false antibot walls. Any CDP / extension-driven click generates isTrusted=false MouseEvents. Hard-gated sites (LinkedIn /jobs filters, some Cloudflare gates, Google Images result tiles) silently no-op those clicks. Browd detects the loop within three attempts and offers hitl_click_at — the agent pauses and asks you to click the blocked element yourself, then continues.
  • Heavy-vision steps are slow. Processing a screenshot plus the state message can take 20–30 s per turn on vision-capable models. The side-panel TRACE row shows the in-flight LLM call with elapsed time, so you can see it working — but it is genuinely slow, not instant.
  • Not a research tool. Browd is a browser-resident agent for concrete tasks on concrete pages, not a Deep Research / scraper substitute. For "synthesise information across N sites" Tavily + Playwright on the backend is typically cheaper and better. The positioning matters — using Browd for ten consecutive web searches is the expensive way to get a mediocre answer.
  • Modal overlays on first page load. Cookie banners and sign-in prompts that cover fresh content can stall progress for a turn or two before the agent decides to dismiss them. The replanner usually recovers by trying a different URL or invoking a direct nav, but for first-impression sites it can add latency.
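The token figures in the first limit above can be sanity-checked with rough arithmetic. The model below is an assumption, not Browd's accounting: it supposes each turn sends a fixed base prompt, one fresh screenshot, and a text history that grows by a fixed amount per completed turn. Only the screenshot size is taken from the range quoted above; the other numbers and all names are hypothetical.

```typescript
interface TurnCostModel {
  screenshotTokens: number; // ~10–14k per the figure quoted above
  baseTokens: number; // system prompt plus task text (assumed)
  historyTokensPerTurn: number; // text each completed turn adds (assumed)
}

// Sum the input tokens sent over a whole task, one LLM call per turn.
function estimateInputTokens(turns: number, model: TurnCostModel): number {
  let total = 0;
  for (let turn = 0; turn < turns; turn++) {
    total +=
      model.baseTokens +
      model.screenshotTokens +
      turn * model.historyTokensPerTurn; // history grows with every turn
  }
  return total;
}
```

With 30 turns, a 12,000-token screenshot, a 3,000-token base prompt, and 500 tokens of history per turn, the estimate is 30 × 15,000 + 500 × 435 = 667,500 input tokens, which lands inside the 400–700k range. The screenshot term dominates, which is why turn count matters more than prompt length.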

Contributing

Local setup, repo layout, testing commands, and pull-request expectations live in CONTRIBUTING.md. The AI-agent contract for working in chrome-extension/src/background/agent/ is in CLAUDE.md.

Acknowledgments

Browd is derived from Nanobrowser, released under Apache-2.0. The Plan-and-Execute agent topology, LangGraph.js integration, and Chrome Web Store packaging path here diverge from upstream, but the inherited foundation — side-panel architecture, Planner+Navigator pipeline, untrusted-content wrap, and i18n scaffolding — is theirs.

The unified agent runtime uses LangGraph.js's createReactAgent and StateGraph. URL → markdown extraction goes through Jina Reader by default with a linkedom fallback.

License

Apache-2.0 — see LICENSE. Upstream copyright and license notices are preserved in this repository.


* The demo GIF at the top is sped up — the actual run took roughly three minutes. Agentic browser tasks are not instant; see Known limits for the honest performance picture.
