feat(server): add wait and custom code tools #980

Closed
Nikhil (shadowfax92) wants to merge 1 commit into dev from polecat/topaz/bosmain-6fd@moxnj9op

@shadowfax92
Contributor

Fixes #422

Verification:

  • Passed: bunx biome check apps/server/src/agent/prompt.ts apps/server/src/browser/browser.ts apps/server/src/tools/navigation.ts apps/server/src/tools/registry.ts apps/server/src/tools/snapshot.ts apps/server/src/tools/tool-label-registry.ts apps/server/tests/tools/navigation.test.ts apps/server/tests/tools/observation.test.ts apps/server/tests/tools/registry.test.ts
  • Local blocked: bun --env-file=.env.development test ./tests/tools/registry.test.ts cannot resolve @browseros/shared/constants/limits from server tests; this reproduces from origin/dev imports and is tracked as bosmain-bjx.
  • Local blocked: bun run typecheck cannot find tsc; tracked as bosmain-woi.
  • Local blocked: bun scripts/build/server.ts --target=darwin-arm64 --ci cannot resolve @smithy/core/endpoints; tracked as bosmain-vbu.

@github-actions
Contributor

github-actions Bot commented May 9, 2026

✅ Tests passed — 1232/1236

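| Suite | Passed | Failed | Skipped |
| --- | --- | --- | --- |
| agent | 80/80 | 0 | 0 |
| build | 9/9 | 0 | 0 |
| eval | 93/93 | 0 | 0 |
| server-agent | 261/261 | 0 | 0 |
| server-api | 203/203 | 0 | 0 |
| server-browser | 4/4 | 0 | 0 |
| server-integration | 9/10 | 0 | 1 |
| server-lib | 242/242 | 0 | 0 |
| server-root | 60/63 | 0 | 3 |
| server-skills | 31/31 | 0 | 0 |
| server-tools | 240/240 | 0 | 0 |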

@greptile-apps
Contributor

greptile-apps Bot commented May 9, 2026

Greptile Summary

This PR adds two new agent tools — wait_for (re-enabled and extended with textGone, selectorGone, and a fixed time pause) and browser_run_code (arbitrary async JS execution in the page context) — along with Browser.runCode, updated prompt instructions, and comprehensive unit + integration tests.

  • wait_for gains three new conditions (textGone, selectorGone, time) and a timing fix that prevents overshooting the deadline; the time-only path bypasses the browser entirely.
  • browser_run_code wraps a new Browser.runCode method that evaluates an async (args) => { … } expression via the CDP Runtime API, serialising args through JSON.stringify.
  • Both tools are registered in the registry and the agent prompt, with display labels added to tool-label-registry.ts.
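
The diff for the timing fix is not shown in this thread, so the following is only a minimal sketch of one way a polling loop can avoid overshooting its deadline: clamp each sleep to the time remaining. The names `nextSleepMs` and `pollUntil` and the 100 ms default interval are assumptions for illustration, not identifiers from the PR.

```typescript
/** Hypothetical helper: how long to sleep before the next poll, never past the deadline. */
export function nextSleepMs(
  nowMs: number,
  deadlineMs: number,
  intervalMs: number,
): number {
  return Math.max(0, Math.min(intervalMs, deadlineMs - nowMs))
}

/** Hypothetical polling loop: resolves true on a match, false once the deadline passes. */
export async function pollUntil(
  check: () => Promise<boolean>,
  timeoutMs: number,
  intervalMs = 100,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs
  while (true) {
    if (await check()) return true
    const sleep = nextSleepMs(Date.now(), deadline, intervalMs)
    // No time left: report a timeout instead of sleeping a full interval.
    if (sleep === 0) return false
    await new Promise<void>((resolve) => setTimeout(resolve, sleep))
  }
}
```

With a 1000 ms timeout and a 100 ms interval, a poll that finishes at 950 ms sleeps 50 ms rather than 100 ms, so the last check lands at roughly the deadline.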

Confidence Score: 3/5

The two new tools are broadly correct, but the wait_for handler has a logic gap that silently discards the time field whenever it is combined with any other condition, which could confuse the LLM agent driving the browser.

The wait_for handler accepts time alongside text/textGone/selector/selectorGone without error, then silently ignores it — an LLM caller following the schema will get no feedback that its time constraint was dropped. This is an observable wrong-result path on a core navigation tool that is now being re-enabled in production. The selectorGone false-positive on pages that never rendered the target element is a secondary concern worth addressing before the tool is widely used.

packages/browseros-agent/apps/server/src/tools/navigation.ts and packages/browseros-agent/apps/server/src/browser/browser.ts warrant a closer look before merging.

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/browseros-agent/apps/server/src/tools/navigation.ts | Extended wait_for with textGone, selectorGone, and time; time is silently discarded when combined with other conditions, and the target field misreports which condition triggered a match when multiple are provided. |
| packages/browseros-agent/apps/server/src/browser/browser.ts | Added textGone/selectorGone support to waitFor and a new runCode method; selectorGone returns true immediately when the selector never existed on the page, which may give false positives. |
| packages/browseros-agent/apps/server/src/tools/snapshot.ts | Added browser_run_code tool wrapping the new runCode browser method; implementation is clean and consistent with evaluate_script. |
| packages/browseros-agent/apps/server/src/tools/registry.ts | Re-enabled wait_for (was temporarily disabled) and registered browser_run_code; counts updated correctly. |
| packages/browseros-agent/apps/server/src/agent/prompt.ts | Added browser_run_code and wait_for to the agent prompt, and corrected a stale wait_for call signature example. |
| packages/browseros-agent/apps/server/src/tools/tool-label-registry.ts | Added display labels for wait_for and browser_run_code; refactored humanizeToolName to use destructuring, with equivalent behaviour. |
| packages/browseros-agent/apps/server/tests/tools/navigation.test.ts | Added integration tests for textGone and fixed-time wait; coverage is solid for the happy path. |
| packages/browseros-agent/apps/server/tests/tools/observation.test.ts | Added integration tests for browser_run_code including async code and error propagation. |
| packages/browseros-agent/apps/server/tests/tools/registry.test.ts | New unit test file covering tool registration, fixed-delay path, disappearance conditions, and browser_run_code; well-structured with mock browser context. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[wait_for called] --> B{any condition\nprovided?}
    B -- no --> ERR1[error: provide a condition]
    B -- yes --> C{time provided AND\nother condition?}
    C -- yes --> SILENTDROP["⚠️ time silently dropped\n(bug: should error)"]
    SILENTDROP --> F
    C -- no --> D{time only?}
    D -- yes --> E[setTimeout for time ms\nreturn found:true]
    D -- no --> F[browser.waitFor loop]
    F --> G{poll: text / textGone\nselector / selectorGone}
    G -- any condition true --> H[return true]
    G -- deadline exceeded --> I[return false]
    H --> J{found?}
    J -- yes --> K[response.text + snapshot]
    J -- no --> L[response.error timeout]

    M[browser_run_code called] --> N["wrap code in\nasync(args)=>{ code }"]
    N --> O[CDP Runtime.evaluate\nawaitPromise:true]
    O --> P{exceptionDetails?}
    P -- yes --> Q[response.error]
    P -- no --> R[response.text + data]
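
The browser_run_code half of the flowchart (wrap the code in an async function, then evaluate it with awaitPromise) can be illustrated roughly as follows. The body of Browser.runCode is not shown in this thread, so `buildRunCodeExpression` is a hypothetical helper and the exact wrapping is an assumption based on the summary above.

```typescript
// Hypothetical helper: the real Browser.runCode body is not shown in this thread.
export function buildRunCodeExpression(
  code: string,
  args: Record<string, unknown> = {},
): string {
  // JSON.stringify output is a valid JavaScript literal, so it can be spliced
  // into the source text as the argument of the immediately invoked function.
  const serializedArgs = JSON.stringify(args)
  return `(async (args) => {\n${code}\n})(${serializedArgs})`
}

// The resulting string would then be handed to CDP, for example:
// session.Runtime.evaluate({ expression, awaitPromise: true, returnByValue: true })
const expression = buildRunCodeExpression('return args.a + 1', { a: 2 })
console.log(expression)
```

Serialising through JSON means only JSON-compatible argument values survive the trip into the page; functions and undefined properties are dropped.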
Prompt To Fix All With AI
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 3
packages/browseros-agent/apps/server/src/tools/navigation.ts:349-368
When `time` is provided alongside any of `text`, `textGone`, `selector`, or `selectorGone`, the fixed-wait branch is skipped and `time` is silently discarded — only the timeout-based wait runs. There is no validation or feedback to the caller. An LLM agent that passes `{ time: 2000, text: "hello" }` (intending a minimum-pause) will have the 2 000 ms wait entirely ignored and the tool will proceed straight to the `waitFor` loop.

```suggestion
    if (
      !args.text &&
      !args.textGone &&
      !args.selector &&
      !args.selectorGone &&
      args.time === undefined
    ) {
      response.error(
        'Provide text, textGone, selector, selectorGone, or time to wait for.',
      )
      return
    }

    if (
      args.time !== undefined &&
      (args.text || args.textGone || args.selector || args.selectorGone)
    ) {
      response.error(
        'time cannot be combined with text, textGone, selector, or selectorGone. Use time alone for a fixed wait.',
      )
      return
    }

    if (
      args.time !== undefined &&
      !args.text &&
      !args.textGone &&
      !args.selector &&
      !args.selectorGone
    ) {
```

### Issue 2 of 3
packages/browseros-agent/apps/server/src/tools/navigation.ts:336-347
**Misleading `target` when multiple conditions are provided**

When multiple conditions are supplied (e.g., `textGone: "Loading"` and `selectorGone: ".spinner"`), `target` is resolved via a priority chain that always picks the first non-undefined field — even if a lower-priority field triggered the actual match. If `.spinner` disappears first but `textGone` is the higher-priority field, the response says `target: 'text "Loading" to disappear'`, which misreports what was actually matched. The success message sent to the agent is equally misleading ("Found text 'Loading' to disappear on page" when it was the selector that disappeared).

### Issue 3 of 3
packages/browseros-agent/apps/server/src/browser/browser.ts:697-703
**`selectorGone` returns `true` immediately on a page where the selector never existed**

The expression `!document.querySelector(selector)` returns `true` whenever the selector is absent, including on blank pages or pages that never had the element. On a freshly navigated page that hasn't yet injected a loading spinner, `selectorGone: ".spinner"` would resolve to `true` on the very first poll rather than waiting for the element to appear and then disappear. If the intent is to confirm that a previously-visible element has been removed, callers relying on `selectorGone` as "wait until element appears then disappears" will get a misleading success on pages that never rendered the element at all.

Reviews (1): Last reviewed commit: "feat: add wait and custom code browser t..."

Comment on lines +349 to +368
    if (
      !args.text &&
      !args.textGone &&
      !args.selector &&
      !args.selectorGone &&
      args.time === undefined
    ) {
      response.error(
        'Provide text, textGone, selector, selectorGone, or time to wait for.',
      )
      return
    }

    if (
      args.time !== undefined &&
      !args.text &&
      !args.textGone &&
      !args.selector &&
      !args.selectorGone
    ) {

P1 When time is provided alongside any of text, textGone, selector, or selectorGone, the fixed-wait branch is skipped and time is silently discarded — only the timeout-based wait runs. There is no validation or feedback to the caller. An LLM agent that passes { time: 2000, text: "hello" } (intending a minimum-pause) will have the 2 000 ms wait entirely ignored and the tool will proceed straight to the waitFor loop.

Suggested change

    if (
      !args.text &&
      !args.textGone &&
      !args.selector &&
      !args.selectorGone &&
      args.time === undefined
    ) {
      response.error(
        'Provide text, textGone, selector, selectorGone, or time to wait for.',
      )
      return
    }

    if (
      args.time !== undefined &&
      (args.text || args.textGone || args.selector || args.selectorGone)
    ) {
      response.error(
        'time cannot be combined with text, textGone, selector, or selectorGone. Use time alone for a fixed wait.',
      )
      return
    }

    if (
      args.time !== undefined &&
      !args.text &&
      !args.textGone &&
      !args.selector &&
      !args.selectorGone
    ) {
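
As an alternative shape for the same validation, the three checks could be folded into one pure function that classifies the arguments before any waiting starts. This is only a sketch: `WaitForArgs`, `WaitPlan`, and `planWait` are hypothetical names, and the field list is taken from the conditions named in this review rather than from the actual schema.

```typescript
// Hypothetical argument shape, based on the fields named in this review.
interface WaitForArgs {
  text?: string
  textGone?: string
  selector?: string
  selectorGone?: string
  time?: number
}

type WaitPlan =
  | { kind: 'error'; message: string }
  | { kind: 'fixed'; ms: number }
  | { kind: 'condition' }

export function planWait(args: WaitForArgs): WaitPlan {
  // Same truthiness test as the handler: an empty string counts as absent.
  const hasCondition = Boolean(
    args.text || args.textGone || args.selector || args.selectorGone,
  )
  if (!hasCondition && args.time === undefined) {
    return {
      kind: 'error',
      message:
        'Provide text, textGone, selector, selectorGone, or time to wait for.',
    }
  }
  if (hasCondition && args.time !== undefined) {
    return {
      kind: 'error',
      message:
        'time cannot be combined with text, textGone, selector, or selectorGone. Use time alone for a fixed wait.',
    }
  }
  if (args.time !== undefined) return { kind: 'fixed', ms: args.time }
  return { kind: 'condition' }
}
```

A handler built this way would switch on `kind`, so the combined case `{ time: 2000, text: "hello" }` yields an explicit error instead of a silently dropped pause.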

Comment on lines +336 to +347
    const target =
      args.text !== undefined
        ? `text "${args.text}"`
        : args.textGone !== undefined
          ? `text "${args.textGone}" to disappear`
          : args.selector !== undefined
            ? `selector "${args.selector}"`
            : args.selectorGone !== undefined
              ? `selector "${args.selectorGone}" to disappear`
              : args.time !== undefined
                ? `${args.time}ms`
                : ''

P2 Misleading target when multiple conditions are provided

When multiple conditions are supplied (e.g., textGone: "Loading" and selectorGone: ".spinner"), target is resolved via a priority chain that always picks the first non-undefined field — even if a lower-priority field triggered the actual match. If .spinner disappears first but textGone is the higher-priority field, the response says target: 'text "Loading" to disappear', which misreports what was actually matched. The success message sent to the agent is equally misleading ("Found text 'Loading' to disappear on page" when it was the selector that disappeared).
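
One possible remedy, sketched under the assumption that the poll can report which individual checks passed, is to build the description from the condition that actually matched rather than from a fixed priority chain. `ConditionKey`, `describeCondition`, and `describeMatch` are hypothetical names; the current browser.waitFor is described in this review as returning only a boolean.

```typescript
type ConditionKey = 'text' | 'textGone' | 'selector' | 'selectorGone'

// Same wording as the existing target strings in navigation.ts.
export function describeCondition(key: ConditionKey, value: string): string {
  switch (key) {
    case 'text':
      return `text "${value}"`
    case 'textGone':
      return `text "${value}" to disappear`
    case 'selector':
      return `selector "${value}"`
    case 'selectorGone':
      return `selector "${value}" to disappear`
  }
}

/** Describes the first requested condition that the latest poll found satisfied. */
export function describeMatch(
  args: Partial<Record<ConditionKey, string>>,
  satisfied: Partial<Record<ConditionKey, boolean>>,
): string | null {
  const order: ConditionKey[] = ['text', 'textGone', 'selector', 'selectorGone']
  for (const key of order) {
    const value = args[key]
    if (value !== undefined && satisfied[key]) {
      return describeCondition(key, value)
    }
  }
  return null
}
```

In the example from this comment, a poll where only the spinner has gone would be described as the selector disappearing, even though textGone was also requested.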

Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/browseros-agent/apps/server/src/tools/navigation.ts
Line: 336-347

Comment:
**Misleading `target` when multiple conditions are provided**

When multiple conditions are supplied (e.g., `textGone: "Loading"` and `selectorGone: ".spinner"`), `target` is resolved via a priority chain that always picks the first non-undefined field — even if a lower-priority field triggered the actual match. If `.spinner` disappears first but `textGone` is the higher-priority field, the response says `target: 'text "Loading" to disappear'`, which misreports what was actually matched. The success message sent to the agent is equally misleading ("Found text 'Loading' to disappear on page" when it was the selector that disappeared).

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +697 to +703
if (opts.selectorGone) {
  const result = await session.Runtime.evaluate({
    expression: `!document.querySelector(${JSON.stringify(opts.selectorGone)})`,
    returnByValue: true,
  })
  if (result.result?.value === true) return true
}

P2 selectorGone returns true immediately on a page where the selector never existed

The expression !document.querySelector(selector) returns true whenever the selector is absent, including on blank pages or pages that never had the element. On a freshly navigated page that hasn't yet injected a loading spinner, selectorGone: ".spinner" would resolve to true on the very first poll rather than waiting for the element to appear and then disappear. If the intent is to confirm that a previously-visible element has been removed, callers relying on selectorGone as "wait until element appears then disappears" will get a misleading success on pages that never rendered the element at all.
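
If the "appear, then disappear" reading is the intended one (the comment above leaves that open), the poll needs to remember whether the element was ever present. A minimal sketch of that state, using a hypothetical `createGoneTracker` helper that is not part of the PR:

```typescript
/**
 * Hypothetical helper: returns a function to call once per poll with the
 * current presence of the element. It answers true when the element counts
 * as "gone".
 *
 * requireSeen = true:  gone only after the element has been observed at least once.
 * requireSeen = false: behaviour described in this review (absent means gone).
 */
export function createGoneTracker(
  requireSeen: boolean,
): (present: boolean) => boolean {
  let seen = false
  return (present) => {
    if (present) {
      seen = true
      return false
    }
    return seen || !requireSeen
  }
}

const isGone = createGoneTracker(true)
console.log(isGone(false)) // element not rendered yet
console.log(isGone(true)) // element visible
console.log(isGone(false)) // element removed after having been seen
```

The trade-off is that a strict tracker times out on pages that never render the element, so a caller would still need some way to choose between the two behaviours.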

@shadowfax92
Contributor Author

Refinery rejected this merge request after review gate. Greptile found branch-caused wait_for behavior defects: time is silently ignored when combined with text/textGone/selector/selectorGone, and selectorGone can report success before the selector ever existed. Source issue bosmain-6fd has been reopened for rework; this branch is not merged.

@shadowfax92 Nikhil (shadowfax92) deleted the polecat/topaz/bosmain-6fd@moxnj9op branch May 9, 2026 01:52

Development

Successfully merging this pull request may close these issues.

Add wait_for and browser_run_code tools
