Releases: sh3ll3x3c/native-devtools-mcp

v0.10.1

31 May 09:44
eedc38a

Security

  • rustls-webpki 0.103.12 → 0.103.13 — fixes a denial of service via panic on a malformed CRL BIT STRING (GHSA-82j2-j2ch-gfr8, high). Reached at runtime through rustls (TLS used by ureq and adb_client).
  • rmcp 0.2.1 → 1.7.0 — clears the Streamable HTTP server transport DNS-rebinding advisory (GHSA-89vp-x53w-74fx, high) and moves onto the supported 1.x line. This server uses only the stdio transport, so the vulnerable path was never reachable, but the upgrade resolves the alert. Internal API migration only — the tool surface is unchanged (same 48 tools, same protocol version).
  • rand 0.8.5 → 0.8.6 — resolves an unsoundness when using a custom logger with rand::rng() (GHSA-cq8v-f236-94qc, low).

v0.10.0

02 May 10:48
23bd8c7

Full Changelog: v0.9.3...v0.10.0

v0.9.3

23 Apr 21:05
89458a9

CDP

  • Collapsed the CDP snapshot surface to DOM-only. Removed cdp_take_ax_snapshot and the paired a<N> UID namespace. The native macOS take_ax_snapshot (still a<N>) and browser-side cdp_take_dom_snapshot / cdp_find_elements (always d<N>) now split cleanly, so CDP callers no longer have to choose between two overlapping snapshot tools. Breaking: existing callers that invoked cdp_take_ax_snapshot must switch to cdp_find_elements (for targeted lookups) or cdp_take_dom_snapshot (for the full page).
  • Parallelized DOM.describeNode resolution. cdp_take_dom_snapshot and cdp_find_elements previously did three sequential CDP round trips per element (get ref, describe, release) — for 500 elements that was ~1500 serial round trips. The per-element chain now runs through futures::join_all, pipelining over the single CDP WebSocket.
  • include_snapshot auto-appends capped at 100 nodes. cdp_click / cdp_hover / cdp_fill / cdp_press_key with include_snapshot=true previously appended a full 500-node DOM snapshot; they now append a 100-node snapshot. The user-facing cdp_take_dom_snapshot(max_nodes=500) default is unchanged.
  • cdp_wait_for snapshot is now opt-in. Added an include_snapshot flag (default false). On success the response is now a one-line "Text appeared after Xms: [...]" header unless include_snapshot=true, in which case a 100-node DOM snapshot is appended after the header. Breaking: callers that relied on cdp_wait_for implicitly returning a snapshot must pass include_snapshot=true.
  • cdp_element_at_point description corrected. Now accurately documents that the tool always returns backend_node_id and only carries a d-prefixed UID / role / name when the current DOM snapshot already contains the hit-tested node.
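
The round-trip arithmetic behind the DOM.describeNode parallelization can be illustrated with a small simulation. This is a Python sketch rather than the server's Rust code: fake_round_trip is a hypothetical stand-in for one CDP request with an assumed fixed latency, and asyncio.gather plays the role that futures::join_all plays in the note above.

```python
import asyncio
import time

LATENCY_S = 0.01  # assumed per-request latency, purely illustrative


async def fake_round_trip(step: str, element: int) -> str:
    """Hypothetical stand-in for one CDP request/response pair."""
    await asyncio.sleep(LATENCY_S)
    return f"{step}:{element}"


async def resolve_element(element: int) -> list[str]:
    """The per-element chain stays ordered: get ref, describe, release."""
    return [
        await fake_round_trip("get_ref", element),
        await fake_round_trip("describe", element),
        await fake_round_trip("release", element),
    ]


async def resolve_serial(count: int) -> list[list[str]]:
    # One element at a time: 3 * count round trips back to back.
    return [await resolve_element(i) for i in range(count)]


async def resolve_pipelined(count: int) -> list[list[str]]:
    # All chains are in flight at once, analogous to futures::join_all.
    return await asyncio.gather(*(resolve_element(i) for i in range(count)))


def timed(coro) -> tuple[float, list[list[str]]]:
    """Run a coroutine to completion and report elapsed wall-clock time."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return time.perf_counter() - start, result


serial_s, serial = timed(resolve_serial(20))
pipelined_s, pipelined = timed(resolve_pipelined(20))
```

Both passes return the same results in the same order. Only the waiting overlaps, so the pipelined pass should take roughly the time of one three-step chain instead of twenty.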

v0.9.2

22 Apr 20:12
cebc9f3

macOS

  • launch_app gains a background flag. When true, the app is launched via open -g -a, so it starts without being brought to the foreground. Useful when the next step uses CDP or AX dispatch (both focus-preserving) and you don't want the target window stealing focus. Default is false; Windows ignores the flag.

CDP

  • Label fallback prefers the element's own text nodes. The v0.9.1 DOM walker still concatenated sibling descendant text when those descendants had no aria/title/alt/role hints, producing composite labels like "Note to Self 1 week Verified" on wrapper buttons. getLabel() now first concatenates only the element's direct Text-node children and returns immediately on a non-empty result; the prior recursive walk remains as a secondary fallback for wrappers whose visible text lives inside an inner span. Elements with role or data-testid are also treated as self-contained semantic units so the recursive fallback no longer swallows badge text.
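
The lookup order described above can be sketched on a toy DOM model. This is a Python illustration, not the walker's actual JavaScript: the Element class and all helper names are hypothetical, and the final tag-name fallback is carried over from the v0.9.1 note on the assumption that it still applies.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass
class Element:
    """Toy DOM node: children are either nested elements or text strings."""
    tag: str
    children: list[Union["Element", str]] = field(default_factory=list)
    attrs: dict[str, str] = field(default_factory=dict)


def direct_text(el: Element) -> str:
    """Join only the element's own text-node children."""
    parts = [c.strip() for c in el.children if isinstance(c, str)]
    return " ".join(p for p in parts if p)


def is_semantic_unit(el: Element) -> bool:
    # Per the note above, role or data-testid marks a self-contained unit.
    return "role" in el.attrs or "data-testid" in el.attrs


def recursive_text(el: Element) -> str:
    """Secondary fallback: descend, skipping self-contained children."""
    parts = []
    for child in el.children:
        if isinstance(child, str):
            text = child.strip()
        elif is_semantic_unit(child):
            text = ""
        else:
            text = recursive_text(child)
        if text:
            parts.append(text)
    return " ".join(parts)


def get_label(el: Element) -> str:
    # Direct text wins; the recursive walk and the tag name are fallbacks.
    return direct_text(el) or recursive_text(el) or el.tag


badge = Element("span", ["Verified"], {"data-testid": "badge"})
wrapper = Element("button", [Element("span", ["Note to Self"]), badge])
labelled = Element("button", ["Note to Self", badge])
```

In this model both buttons resolve to the chat name alone: the badge is skipped because it carries data-testid, whether the name is direct text or lives in an inner span.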

v0.9.1

21 Apr 16:37
98fc0c8

CDP

  • DOM walker no longer returns composite labels. getLabel() previously fell through to el.textContent when an element had no aria-label / aria-labelledby / title / alt, concatenating all descendant text. A header button wrapping avatar + chat name + badges produced labels like "Note to Self1 weekVerified", which misled agents into clicking the wrong element. Replaced with a direct-text collector that walks only direct text nodes plus descendant subtrees that do not carry their own label and are not themselves interactive; falls back to the tag name when no direct text exists.
  • DOM snapshot now renders parent context. Each line shows (in <role> "<name>") at the end, using the parentRole / parentName already captured by the walker. Lets a reader disambiguate, for example, a sidebar list item from a chat-header button that would otherwise print the same label.

v0.9.0

19 Apr 20:43
d54c2b8

Element-precise AX dispatch (macOS)

Three macOS-only tools that dispatch against accessibility-tree elements by uid, without moving the cursor or stealing focus. They complement, rather than replace, coordinate-based click / type_text.

  • ax_click — press a button, menu item, checkbox, or toolbar item by AX uid via AXPress.
  • ax_set_value — write to a text field's kAXValueAttribute. Value assignment, not keystroke typing: no keydown/keyup, no IME composition, no undo-stack entry. Fall back to click + type_text when key-event semantics are required.
  • ax_select — select a row inside NSOutlineView / NSTableView by writing AXSelectedRows on the enclosing outline/table. Use for sidebars (System Settings, Mail, Xcode, Finder) and rule lists where rows refuse AXPress.

All three return { ok, dispatched_via, bbox } on success; on failure, a typed error (snapshot_expired, uid_not_found, not_dispatchable, no_row_ancestor, no_outline_container, ax_error) with an optional fallback: {x, y} coordinate for coordinate-based retry.
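
A caller might branch on these responses roughly as follows. This is a hedged Python sketch: the "error" and "fallback" key names are assumptions about the JSON shape (the notes name the fields but not the exact layout), and the returned action labels are hypothetical hints for the calling agent, not tool names or argument schemas.

```python
def next_step(response: dict) -> tuple[str, dict]:
    """Map an ax_click / ax_set_value / ax_select response to a next action."""
    if response.get("ok"):
        return ("done", {})

    error = response.get("error")  # assumed key holding the typed error name
    if error == "snapshot_expired":
        # Stale uid: take a fresh snapshot and look the element up again.
        return ("resnapshot", {})

    fallback = response.get("fallback")  # assumed key, optional {x, y}
    if fallback is not None:
        # Retry through the coordinate-based path with the suggested point.
        return ("coordinate_click", {"x": fallback["x"], "y": fallback["y"]})

    # No recovery hint was supplied; surface the typed error to the caller.
    return ("report", {"error": error})
```

Checking snapshot_expired before the fallback coordinate is a design choice in this sketch: a stale uid may point at the wrong element, so a fresh snapshot is the safer retry.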

Session-stateful take_ax_snapshot (macOS)

take_ax_snapshot on macOS is now session-backed: each call bumps a monotonic generation and emits uids as a<N>g<gen> (e.g. a42g3). Uids from prior snapshots are rejected by ax_click / ax_set_value / ax_select with snapshot_expired, eliminating the silent wrong-element-clicked failure mode. Snapshot immediately before each dispatch; every branch or retry starts with a fresh snapshot. Windows behavior is unchanged — bare a<N> uids, no session.
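
The generation check can be modelled in a few lines. This is a minimal Python sketch of the idea, not the server's implementation: the AxSession class is hypothetical, numbering nodes from 1 is an assumption, and Python exceptions stand in for the typed snapshot_expired and uid_not_found errors.

```python
import re

UID_PATTERN = re.compile(r"a(\d+)g(\d+)")  # matches uids such as a42g3


class SnapshotExpired(Exception):
    """Stands in for the snapshot_expired error named in the notes."""


class AxSession:
    """Hypothetical session that tracks the current snapshot generation."""

    def __init__(self) -> None:
        self.generation = 0
        self.node_count = 0

    def take_snapshot(self, node_count: int) -> list[str]:
        # Each snapshot bumps the generation and re-issues every uid.
        self.generation += 1
        self.node_count = node_count
        return [f"a{n}g{self.generation}" for n in range(1, node_count + 1)]

    def resolve(self, uid: str) -> int:
        match = UID_PATTERN.fullmatch(uid)
        if match is None:
            raise ValueError(f"malformed uid: {uid!r}")
        index, generation = int(match.group(1)), int(match.group(2))
        if generation != self.generation:
            # The uid came from an older snapshot: reject instead of guessing.
            raise SnapshotExpired(uid)
        if not 1 <= index <= self.node_count:
            raise KeyError(uid)  # stands in for uid_not_found
        return index


session = AxSession()
first = session.take_snapshot(3)   # uids carry generation 1
second = session.take_snapshot(3)  # generation 2; the first batch is stale
```

After the second snapshot, session.resolve(second[1]) succeeds while session.resolve(first[1]) raises SnapshotExpired, even though both name node 2.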

MCP tool metadata

  • ToolAnnotations on every tool. The readOnlyHint, destructiveHint, idempotentHint, and openWorldHint safety hints let MCP clients surface the right permission prompts and defaults.
  • click coordinate variants are mutually exclusive — schema uses oneOf (screen / window / screenshot), enforced at runtime. Mixing variants now produces a clear validation error instead of silent coordinate misinterpretation.
  • focus_window returns structured JSON ({ app_name, pid, kind }) instead of free-form text.
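
The runtime side of the oneOf rule for click amounts to an exactly-one check. The Python sketch below assumes each variant appears as a top-level key in the arguments; the real schema's field layout is not given in these notes, so treat the shape and the function name as hypothetical.

```python
VARIANTS = ("screen", "window", "screenshot")


def pick_coordinate_variant(args: dict) -> str:
    """Return the single coordinate variant present in args, oneOf-style."""
    present = [name for name in VARIANTS if name in args]
    if len(present) != 1:
        # Zero or several variants: fail loudly instead of guessing a space.
        raise ValueError(
            f"expected exactly one of {VARIANTS}, got {present or 'none'}"
        )
    return present[0]
```

For example, pick_coordinate_variant({"window": {"x": 10, "y": 20}}) returns "window", while passing both "screen" and "window" raises ValueError.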

CDP

  • CDP tools are listed unconditionally. Previously they appeared only after cdp_connect; they now appear at session start and return a stable "not connected" error until connected, so callers can discover the API up front.

Dependencies

  • rmcp bumped to 0.2 to unlock ToolAnnotations.
  • rand and rustls-webpki bumped for low-severity advisories.

v0.8.0

02 Apr 19:39
983d2a0

New tools

  • cdp_element_at_point — resolve the CDP accessibility snapshot UID of the DOM element at given screen coordinates. Returns the element's UID, role, name, and backend_node_id. Bridges native screen coordinates with CDP's DOM model.
  • probe_app — classify an app's automation capabilities (native AX, CDP debug port, embedded debug server) to help agents pick the right tool strategy.

Fixes

  • Screen recorder — add Drop cleanup and reduce default max_duration from 5 minutes to 1 minute to prevent runaway recordings.
  • cdp_element_at_point — validate coordinates and check URL staleness before snapshot lookup to avoid stale results.

v0.7.1

22 Mar 08:28
a99e023

Windows fixes

  • Implement take_ax_snapshot on Windows — added collect_uia_tree using UI Automation, enabling accessibility tree snapshots on Windows (previously macOS-only)
  • Map all 41 UIA control types: take_ax_snapshot now correctly identifies all standard Windows control types (buttons, tabs, menus, data grids, semantic elements, etc.) instead of falling back to "Unknown"

Other

  • Shorten server.json description to meet MCP registry 100-char limit

v0.7.0

21 Mar 22:47
d3e5e92

Chrome DevTools Protocol support

native-devtools-mcp now supports the Chrome DevTools Protocol (CDP) — the same protocol that powers Puppeteer, Playwright, and chrome-devtools-mcp. Connect to any Chrome, Chromium, or Electron app and automate it with 16 new tools, all from a single native binary with zero Node.js dependencies.

This means you can now automate Chrome browsers and Electron apps (Signal, Discord, VS Code, Slack) with DOM-level precision — clicking elements by accessibility UID, filling forms, navigating pages, and evaluating JavaScript — alongside the existing native desktop and Android automation.

16 new cdp_* tools

  • cdp_connect / cdp_disconnect — connect to a running Chrome/Electron instance on a given port
  • cdp_take_snapshot — accessibility tree snapshot of the browser page (element UIDs, roles, names)
  • cdp_evaluate_script — evaluate JavaScript in the page, with optional element references from the snapshot
  • cdp_click — click a DOM element by UID (scroll-into-view, more reliable than screen coordinates for web content)
  • cdp_hover — hover over a DOM element by UID
  • cdp_fill — type text into an input/textarea or select an option from a <select> element
  • cdp_press_key — press a key or key combination (e.g., Enter, Control+A, Control+Shift+R)
  • cdp_type_text — character-by-character keyboard input into a focused element, with optional submit key
  • cdp_handle_dialog — accept or dismiss JavaScript dialogs (alert, confirm, prompt)
  • cdp_navigate — navigate to a URL, or go back/forward/reload (configurable timeout, handles slow-loading pages)
  • cdp_new_page — create a new browser tab and navigate to a URL
  • cdp_close_page — close a browser tab by index
  • cdp_wait_for — wait for any of multiple texts to appear on the page (lightweight JS polling with timeout)
  • cdp_list_pages / cdp_select_page — tab management

cdp_click, cdp_hover, cdp_fill, and cdp_press_key support include_snapshot to return a fresh snapshot with the action result, saving a round-trip.
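
The "any of multiple texts, with timeout" behaviour of cdp_wait_for follows a common polling pattern, sketched below in Python. The real tool polls from JavaScript inside the page; here read_page_text is a hypothetical callback standing in for that read, and the timeout and interval defaults are illustrative assumptions.

```python
import time
from typing import Callable, Optional, Sequence


def wait_for_any(
    read_page_text: Callable[[], str],
    texts: Sequence[str],
    timeout_s: float = 5.0,
    interval_s: float = 0.05,
) -> Optional[str]:
    """Poll until any of texts appears; return the match, or None on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        page = read_page_text()
        for text in texts:
            if text in page:
                return text
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval_s)


# Fake page whose content changes on the third read.
reads = {"count": 0}


def fake_page() -> str:
    reads["count"] += 1
    return "Loading..." if reads["count"] < 3 else "Welcome back"


found = wait_for_any(fake_page, ["Error", "Welcome"], timeout_s=1.0, interval_s=0.01)
```

Reading the page once before checking the deadline means even a zero timeout performs one check, which is a deliberate choice in this sketch.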

Getting started

# Launch Chrome with remote debugging
launch_app(app_name="Google Chrome", args=["--remote-debugging-port=9222", "--user-data-dir=/tmp/chrome-profile"])

# Connect and automate
cdp_connect(port=9222)
cdp_navigate(url="https://example.com")
cdp_take_snapshot()
cdp_fill(uid="10", value="search query")
cdp_press_key(key="Enter")

Chrome 136+ requires --user-data-dir alongside --remote-debugging-port. Electron apps only need --remote-debugging-port.

Accessibility tree snapshot

New take_ax_snapshot tool that serializes the full macOS Accessibility (AX) tree into a structured text format with unique element IDs, roles, and names. Works for any app without requiring a debug port.

v0.6.0

21 Mar 12:11
51cff12

Screen recording

New start_recording / stop_recording tools for capturing screen activity as MP4 video. Useful for recording UI flows, repro steps, and demo clips.

  • Configurable FPS (default 5), region cropping, and max duration
  • Supported on macOS (CGWindowListCreateImage) and Windows (BitBlt)

Windows feature parity

Windows now supports all tools that were previously macOS-only:

  • Hover tracking: start_hover_tracking / get_hover_events / stop_hover_tracking via UI Automation and GetCursorPos
  • Screen recording: start_recording / stop_recording via a BitBlt capture loop
  • element_at_point — added app_name scoping and container fallback
  • find_text — now searches UIA Value and HelpText properties in addition to Name
  • get_cursor_position — new Windows implementation via GetCursorPos

Fixes

  • Drag pre-move cursor — cursor now moves to the start position before initiating a drag, ensuring correct start coordinates (Windows)
  • Hover dwell accuracy — fixed dwell time calculation to use arrival/departure timestamps correctly, preventing inflated dwell values from pass-through elements
  • Frontmost app detection — fixed macOS frontmost app resolution to use CGWindowList stacking order instead of NSWorkspace

Other

  • Windows code refactored: deduplicated PID resolution, extracted text property helper, simplified capture_window_jpeg and UIA element search
  • Updated rustls-webpki to 0.103.10 (CVE fix)