5 changes: 4 additions & 1 deletion packages/browseros-agent/apps/server/src/agent/prompt.ts
@@ -138,6 +138,7 @@ You control a Chromium browser. Key tool categories:
- \`get_dom\` / \`search_dom\` → raw HTML (use for precise CSS/XPath queries)
- \`take_screenshot\` → visual capture (use for verification or saving)
- \`evaluate_script\` → run JS on the page (use for dynamic data extraction)
- \`browser_run_code\` → run custom async page code when predefined tools are insufficient
- \`get_console_logs\` → browser console output (use for debugging)

**Interaction** — act on page elements:
@@ -152,6 +153,7 @@ You control a Chromium browser. Key tool categories:
**Navigation**:
- \`navigate_page\` → go to URL, back, forward, reload
- \`new_page\` → open new tab (only when user explicitly asks)
- \`wait_for\` → wait for text/selectors to appear or disappear, or pause briefly
- \`close_page\` → close a tab

**Bookmarks**: \`get_bookmarks\`, \`create_bookmark\`, \`remove_bookmark\`, \`update_bookmark\`, \`move_bookmark\`, \`search_bookmarks\`
@@ -315,6 +317,7 @@ function getToolSelection(
| Looking for specific links | \`get_page_links\` |
| Need exact HTML or CSS selectors | \`get_dom\` or \`search_dom\` |
| Need runtime data (JS variables, computed values) | \`evaluate_script\` |
| Need custom async page logic | \`browser_run_code\` |
| Something isn't working, need to debug | \`get_console_logs\` |
| Need visual proof or to save an image | \`take_screenshot\` or \`save_screenshot\` |

@@ -417,7 +420,7 @@ function getErrorRecovery(
## Error Recovery

### Browser interaction errors
- Element not found → \`scroll(page, "down")\`, \`wait_for(page, text)\`, then \`take_snapshot(page)\` to re-fetch elements
- Element not found → \`scroll(page, "down")\`, \`wait_for(page, { text })\`, then \`take_snapshot(page)\` to re-fetch elements
- Click/fill failed → \`scroll(page, "down", element)\` into view, retry once
- Page didn't load → check URL, try \`navigate_page\` with reload
- After 2 failed attempts → describe the blocking issue, request guidance
76 changes: 73 additions & 3 deletions packages/browseros-agent/apps/server/src/browser/browser.ts
@@ -657,13 +657,21 @@ export class Browser {

async waitFor(
page: number,
opts: { text?: string; selector?: string; timeout: number },
opts: {
text?: string
textGone?: string
selector?: string
selectorGone?: string
timeout: number
},
): Promise<boolean> {
const session = await this.resolveSession(page)
const deadline = Date.now() + opts.timeout
const interval = 500
let textGoneWasPresent = false
let selectorGoneWasPresent = false

while (Date.now() < deadline) {
while (Date.now() <= deadline) {
if (opts.text) {
const result = await session.Runtime.evaluate({
expression: `document.body?.innerText?.includes(${JSON.stringify(opts.text)}) ?? false`,
@@ -672,6 +680,18 @@ export class Browser {
if (result.result?.value === true) return true
}

if (opts.textGone) {
const result = await session.Runtime.evaluate({
expression: `document.body?.innerText?.includes(${JSON.stringify(opts.textGone)}) ?? false`,
returnByValue: true,
})
if (result.result?.value === true) {
textGoneWasPresent = true
} else if (textGoneWasPresent) {
return true
}
}
Comment on lines +683 to +693
P1 `textGone` returns true prematurely on an unloaded page

The expression `!(document.body?.innerText?.includes(...) ?? false)` evaluates to `true` when `document.body` is `null` or `undefined` (page still loading). The optional chain short-circuits to `undefined`, `undefined ?? false` becomes `false`, and `!false` is `true`, so the condition reports the text as "gone" even before any content has rendered. Contrast this with the `text` check, which uses the same pattern affirmatively and correctly returns `false` (not found) when the body is absent.
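
The evaluation order described in this comment can be reproduced outside the browser. The following is a minimal sketch, assuming a hypothetical `DocLike` type that stands in for the page's `document`; `textPresent` and `textGoneNaive` are illustrative names and are not part of the PR.

```typescript
// Hypothetical stand-in for `document`; only the fields the check touches.
type DocLike = { body?: { innerText?: string } | null }

// Affirmative check, shaped like the `text` branch: a missing body yields false.
function textPresent(doc: DocLike, needle: string): boolean {
  return doc.body?.innerText?.includes(needle) ?? false
}

// Negated check as described in the comment: a missing body yields true,
// so the text is reported as "gone" before anything has rendered.
function textGoneNaive(doc: DocLike, needle: string): boolean {
  return !textPresent(doc, needle)
}

const unloaded: DocLike = { body: null }
const loading: DocLike = { body: { innerText: 'Loading' } }
const ready: DocLike = { body: { innerText: 'Done' } }

const goneOnUnloaded = textGoneNaive(unloaded, 'Loading') // premature success
const goneWhileLoading = textGoneNaive(loading, 'Loading')
const goneWhenReady = textGoneNaive(ready, 'Loading')
```

The unloaded and ready cases are indistinguishable to the negated check, which is the false positive the comment points out.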


Comment on lines +683 to +693
P1 `textGone` lacks the "was-present" guard that `selectorGone` has

`selectorGone` only reports success after the element was observed at least once (via `selectorGoneWasPresent`), preventing a false positive when the selector simply never existed. `textGone` has no equivalent guard and returns `true` on the very first poll if the text is already absent. A caller doing `waitFor({ textGone: 'Loading…' })` before the page has rendered any text would receive `found: true` immediately, even though the text never appeared.
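
The effect of the guard can be shown on a recorded sequence of poll results. This is a rough sketch under stated assumptions: `firstGoneIndex` and its `requireSeen` switch are hypothetical and exist only to compare the guarded and unguarded behaviours side by side.

```typescript
// Each entry is one poll: true means the text (or selector) was present.
// Returns the index of the poll at which a "gone" wait would succeed, or -1
// if it never succeeds within the recorded polls.
function firstGoneIndex(observations: boolean[], requireSeen: boolean): number {
  let wasPresent = false
  for (let i = 0; i < observations.length; i++) {
    if (observations[i]) {
      wasPresent = true
    } else if (wasPresent || !requireSeen) {
      return i
    }
  }
  return -1
}

// Page renders nothing for two polls, shows the text, then removes it.
const polls = [false, false, true, false]

const unguarded = firstGoneIndex(polls, false) // succeeds on the first poll
const guarded = firstGoneIndex(polls, true) // waits until the text has come and gone
const neverShown = firstGoneIndex([false, false], true) // guard never satisfied
```

One consequence of the guard is that a wait on text that never appears runs until the timeout rather than succeeding immediately.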



if (opts.selector) {
const result = await session.Runtime.evaluate({
expression: `!!document.querySelector(${JSON.stringify(opts.selector)})`,
@@ -680,7 +700,21 @@ export class Browser {
if (result.result?.value === true) return true
}

await new Promise((r) => setTimeout(r, interval))
if (opts.selectorGone) {
const result = await session.Runtime.evaluate({
expression: `!!document.querySelector(${JSON.stringify(opts.selectorGone)})`,
returnByValue: true,
})
if (result.result?.value === true) {
selectorGoneWasPresent = true
} else if (selectorGoneWasPresent) {
return true
}
}

const remaining = deadline - Date.now()
if (remaining <= 0) break
await new Promise((r) => setTimeout(r, Math.min(interval, remaining)))
}

return false
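
The timing of the revised loop (inclusive deadline, sleep clipped to the remaining time) can be sketched without timers. This is an illustration only: `nextSleep` and `pollTimes` are hypothetical helpers, and the simulation assumes each check takes no time and that `interval` is greater than zero.

```typescript
// How long to sleep before the next poll, or null when the deadline has passed.
function nextSleep(now: number, deadline: number, interval = 500): number | null {
  const remaining = deadline - now
  if (remaining <= 0) return null
  return Math.min(interval, remaining)
}

// Simulated poll timestamps (ms since start) for a given timeout.
// Assumes interval > 0; checks themselves are treated as instantaneous.
function pollTimes(timeout: number, interval = 500): number[] {
  const times: number[] = []
  const deadline = timeout
  let now = 0
  while (now <= deadline) {
    times.push(now)
    const sleep = nextSleep(now, deadline, interval)
    if (sleep === null) break
    now += sleep
  }
  return times
}

const schedule = pollTimes(1200) // last sleep is shortened to land on the deadline
const zeroTimeout = pollTimes(0) // inclusive deadline still allows one poll
```

Under these assumptions, a timeout of 0 still performs a single check, and the final poll lands on the deadline instead of overshooting it by up to one interval.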
@@ -941,6 +975,42 @@ export class Browser {
}
}

async runCode(
page: number,
code: string,
args?: Record<string, unknown>,
): Promise<{
value?: unknown
error?: string
description?: string
}> {
const session = await this.resolveSession(page)
const expression = `(
async (args) => {
${code}
}
)(${JSON.stringify(args ?? {})})`

const result = await session.Runtime.evaluate({
expression,
returnByValue: true,
awaitPromise: true,
})

if (result.exceptionDetails) {
return {
error:
result.exceptionDetails.exception?.description ??
result.exceptionDetails.text,
}
}

return {
value: result.result?.value,
description: result.result?.description,
}
}
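
How `runCode` turns `code` and `args` into a single expression can be sketched in isolation. `buildExpression` below is a hypothetical helper that mirrors the template string in the diff; it only builds the string and does not call `Runtime.evaluate`.

```typescript
// Hypothetical helper: wraps the caller's code as the body of an async
// function and immediately invokes it with the JSON-encoded args.
function buildExpression(code: string, args?: Record<string, unknown>): string {
  return `(
async (args) => {
${code}
}
)(${JSON.stringify(args ?? {})})`
}

const expression = buildExpression('return args.a + args.b', { a: 1, b: 2 })

// JSON.stringify omits undefined values and functions from objects, so such
// entries never reach the page.
const dropped = buildExpression('return Object.keys(args)', {
  keep: 1,
  skip: undefined,
  fn: () => 1,
})

const noArgs = buildExpression('return document.title')
```

Because the arguments travel as JSON, only serializable values survive, which matches the "Serializable arguments" wording in the tool schema.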

async getDom(page: number, opts?: { selector?: string }): Promise<string> {
const session = await this.resolveSession(page)
const doc = await session.DOM.getDocument({ depth: 0 })
87 changes: 73 additions & 14 deletions packages/browseros-agent/apps/server/src/tools/navigation.ts
@@ -299,13 +299,29 @@ export const close_page = defineNavigationTool({
export const wait_for = defineNavigationTool({
name: 'wait_for',
description:
'Wait for text or a CSS selector to appear on the page. Polls periodically up to a timeout.',
'Wait for text or a CSS selector to appear or disappear on the page, or wait for a fixed time. Polls periodically up to a timeout.',
input: z.object({
page: pageParam,
text: z.string().optional().describe('Text to wait for on the page'),
textGone: z
.string()
.optional()
.describe('Text to wait for to disappear from the page'),
selector: z.string().optional().describe('CSS selector to wait for'),
selectorGone: z
.string()
.optional()
.describe('CSS selector to wait for to disappear from the page'),
time: z
.number()
.min(0)
.max(120000)
.optional()
.describe('Fixed wait time in milliseconds'),
timeout: z
.number()
.min(0)
.max(120000)
.default(10000)
.describe('Maximum wait time in milliseconds'),
}),
@@ -316,40 +332,83 @@ export const wait_for = defineNavigationTool({
timeout: z.number(),
}),
handler: async (args, ctx, response) => {
if (!args.text && !args.selector) {
response.error('Provide either text or selector to wait for.')
const timeout = args.timeout ?? 10_000
const conditionCount = [
args.text !== undefined,
args.textGone !== undefined,
args.selector !== undefined,
args.selectorGone !== undefined,
args.time !== undefined,
].filter(Boolean).length

if (conditionCount === 0) {
response.error(
'Provide text, textGone, selector, selectorGone, or time to wait for.',
)
return
}

if (conditionCount > 1) {
response.error(
'Provide exactly one wait condition. time cannot be combined with text, textGone, selector, or selectorGone.',
)
return
}

const target =
args.text !== undefined
? `text "${args.text}"`
: args.textGone !== undefined
? `text "${args.textGone}" to disappear`
: args.selector !== undefined
? `selector "${args.selector}"`
: args.selectorGone !== undefined
? `selector "${args.selectorGone}" to disappear`
: args.time !== undefined
? `${args.time}ms`
: ''

if (args.time !== undefined) {
await new Promise((resolve) => setTimeout(resolve, args.time))
response.text(`Waited ${args.time}ms.`)
response.data({
page: args.page,
found: true,
target,
timeout: args.time,
})
return
}

const found = await ctx.browser.waitFor(args.page, {
text: args.text,
textGone: args.textGone,
selector: args.selector,
timeout: args.timeout,
selectorGone: args.selectorGone,
timeout,
})

if (found) {
const target = args.text
? `text "${args.text}"`
: `selector "${args.selector}"`
response.text(`Found ${target} on page.`)
const foundMessage =
args.textGone !== undefined || args.selectorGone !== undefined
? `Condition met: ${target}.`
: `Found ${target} on page.`
response.text(foundMessage)
response.data({
page: args.page,
found,
target,
timeout: args.timeout,
timeout,
})
response.includeSnapshot(args.page)
} else {
const target = args.text
? `text "${args.text}"`
: `selector "${args.selector}"`
response.data({
page: args.page,
found,
target,
timeout: args.timeout,
timeout,
})
response.error(`Timed out after ${args.timeout}ms waiting for ${target}.`)
response.error(`Timed out after ${timeout}ms waiting for ${target}.`)
}
},
})
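
The "exactly one condition" rule in the handler can be isolated as a small pure function. This is a sketch, not the handler itself: `WaitArgs`, `countConditions`, and `validate` are illustrative names, and the returned strings are placeholders rather than the tool's real error messages.

```typescript
// Illustrative subset of the wait_for input; field names follow the schema above.
type WaitArgs = {
  text?: string
  textGone?: string
  selector?: string
  selectorGone?: string
  time?: number
}

// Counts fields that are present, using the same `!== undefined` test as the handler.
function countConditions(args: WaitArgs): number {
  return [args.text, args.textGone, args.selector, args.selectorGone, args.time]
    .filter((v) => v !== undefined).length
}

// Returns null when the input is acceptable, otherwise a placeholder reason.
function validate(args: WaitArgs): string | null {
  const n = countConditions(args)
  if (n === 0) return 'no condition'
  if (n > 1) return 'more than one condition'
  return null
}

const none = validate({})
const mixed = validate({ text: 'Done', time: 1000 })
const single = validate({ selectorGone: '.spinner' })
const emptyText = validate({ text: '' }) // an empty string still counts as provided
```

Note that the presence test is `!== undefined`, while `Browser.waitFor` in this diff branches on truthiness (`if (opts.text)`). An empty string therefore appears to pass validation yet be skipped by the polling loop, which would end in a timeout.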
9 changes: 5 additions & 4 deletions packages/browseros-agent/apps/server/src/tools/registry.ts
@@ -43,12 +43,12 @@ import {
new_hidden_page,
new_page,
show_page,
// biome-ignore lint/correctness/noUnusedImports: temporarily disabled
wait_for,
} from './navigation'
import { suggest_app_connection, suggest_schedule } from './nudges'
import { download_file, save_pdf, save_screenshot } from './page-actions'
import {
browser_run_code,
evaluate_script,
get_page_content,
get_page_links,
@@ -73,7 +73,7 @@ import {
} from './windows'

export const registry = createRegistry([
// Navigation (8)
// Navigation (9)
get_active_page,
list_pages,
navigate_page,
@@ -82,9 +82,9 @@ export const registry = createRegistry([
show_page,
move_page,
close_page,
// wait_for, // temporarily disabled
wait_for,

// Observation (9)
// Observation (10)
take_snapshot,
take_enhanced_snapshot,
get_page_content,
@@ -93,6 +93,7 @@ export const registry = createRegistry([
search_dom,
take_screenshot,
evaluate_script,
browser_run_code,
get_console_logs,

// Input (17)
49 changes: 49 additions & 0 deletions packages/browseros-agent/apps/server/src/tools/snapshot.ts
@@ -246,3 +246,52 @@ export const evaluate_script = defineScriptTool({
})
},
})

export const browser_run_code = defineScriptTool({
name: 'browser_run_code',
description:
'Execute async custom JavaScript code in the page context. The code runs as an async function body with a serializable args object available and may use return to produce a result.',
input: z.object({
page: pageParam,
code: z
.string()
.describe(
'JavaScript function body to run in the page context. Use return to provide output.',
),
args: z
.record(z.unknown())
.optional()
.describe('Serializable arguments available to the code as args'),
}),
output: z.object({
text: z.string(),
value: z.unknown().optional(),
description: z.string().optional(),
}),
handler: async (args, ctx, response) => {
const result = await ctx.browser.runCode(args.page, args.code, args.args)

if (result.error) {
response.error(`Code error: ${result.error}`)
return
}

const val = result.value
let text: string
if (val === undefined) {
text = result.description ?? 'undefined'
response.text(text)
} else if (typeof val === 'string') {
text = val
response.text(text)
} else {
text = JSON.stringify(val, null, 2)
response.text(text)
}
response.data({
text,
value: result.value,
description: result.description,
})
},
})
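
The three branches that turn the evaluation result into response text can be summarised as one function. `formatResult` is a hypothetical name; the logic follows the handler above, with the `response` calls left out.

```typescript
// Mirrors the handler's branching: undefined falls back to the remote
// description, strings pass through, everything else is pretty-printed JSON.
function formatResult(value: unknown, description?: string): string {
  if (value === undefined) return description ?? 'undefined'
  if (typeof value === 'string') return value
  return JSON.stringify(value, null, 2)
}

const fromUndefined = formatResult(undefined)
const fromDescription = formatResult(undefined, 'some description')
const fromString = formatResult('hello')
const fromObject = formatResult({ a: 1 })
const fromNull = formatResult(null)
```

A `null` result takes the JSON branch and is rendered as the string `null`, whereas `undefined` is the only value that falls back to `description`.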
Original file line number Diff line number Diff line change
@@ -20,6 +20,7 @@ const VERB_OVERRIDES: Record<string, string> = {
get_active_page: 'Got active tab',
move_page: 'Moved tab',
group_tabs: 'Grouped tabs',
wait_for: 'Waited for page state',

// Page reading
take_snapshot: 'Captured page snapshot',
@@ -47,6 +48,7 @@ const VERB_OVERRIDES: Record<string, string> = {

// Console / scripts
evaluate_script: 'Ran script',
browser_run_code: 'Ran custom page code',
get_console_logs: 'Read console logs',

// History / bookmarks
@@ -292,12 +294,9 @@ function canonicalName(rawName: string): string {
function humanizeToolName(rawName: string): string {
const stripped = canonicalName(rawName)
const words = stripped.split(/[_-]/).filter((w) => w.length > 0)
if (words.length === 0) return rawName
const first = words[0]!
return [
first.charAt(0).toUpperCase() + first.slice(1),
...words.slice(1),
].join(' ')
const [first, ...rest] = words
if (!first) return rawName
return [first.charAt(0).toUpperCase() + first.slice(1), ...rest].join(' ')
}
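
The refactored `humanizeToolName` can be exercised on its own. In this sketch the call to `canonicalName` is omitted because its body is not shown in the diff, so `humanize` below is a hypothetical variant that works on the raw name directly.

```typescript
// Splits on underscores and hyphens, drops empty segments, and capitalises
// only the first word. Falls back to the input when no words remain.
function humanize(name: string): string {
  const words = name.split(/[_-]/).filter((w) => w.length > 0)
  const [first, ...rest] = words
  if (!first) return name
  return [first.charAt(0).toUpperCase() + first.slice(1), ...rest].join(' ')
}

const runCode = humanize('browser_run_code')
const hyphenated = humanize('wait-for')
const separatorsOnly = humanize('__')
```

Destructuring makes the empty case explicit: when `first` is missing, the function returns its input, so the non-null assertion from the earlier version is no longer needed.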

/**