
feat: extend OPENCLAW_GATEWAY_PRIVATE_INGRESS_NO_AUTH to handshake + role + call paths #6

Merged
shivammittal274 merged 1 commit into browseros from feat/no-auth-handshake-bypass
May 4, 2026

@shivammittal274

Summary

Follow-up to #1. That PR relaxed startup-time bind validation; this PR extends the same trust boundary to the three remaining gates that were still rejecting BrowserOS connections: the live handshake, the role policy, and the in-process backend client.

Without this, BACKEND clients (the agent's cron-tool, channels, anything that uses `callGatewayTool`) hit `"device identity required"` / WS close 1008 even on loopback when running with `OPENCLAW_GATEWAY_PRIVATE_INGRESS_NO_AUTH=1`.

Repro (before this PR)

  1. Build a BrowserOS no-auth image off `browseros` HEAD.
  2. Run an agent against it. Plain text turns work.
  3. Ask the agent to schedule a cron job (e.g. "run every minute for 10 minutes").
  4. Agent reports back: `Error: gateway closed (1008): device identity required`.

The agent's `cron-tool` calls `callGateway` (`src/gateway/call.ts`) with `mode=BACKEND`, `clientName=GATEWAY_CLIENT`, and no token or password. Because `shouldOmitDeviceIdentityForGatewayCall` is gated on `hasSharedAuth`, it returns false, so the call tries to load `device.json` and connect with a device. The connect request then reaches the gateway without a device in one of two ways: (a) the device file isn't accessible in the in-process backend context, or (b) some path bypasses the device altogether. Either way, `message-handler.ts:713` closes the socket with 1008.

What's in this PR

Three changes, each gated identically on `isLocalClient && allowsGatewayPrivateIngressNoAuth()`:

1. `server/ws-connection/connect-policy.ts` — handshake bypass

`evaluateMissingDeviceIdentity` now returns `{ kind: "allow" }` for loopback connections when the env flag is set. Closes the 1008 path.
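
To make the new branch concrete, here is a minimal sketch of the decision, not the actual code in `connect-policy.ts`. The `Env` and `MissingDeviceDecision` shapes, the `noAuthFlagEnabled` helper (a stand-in for `allowsGatewayPrivateIngressNoAuth()`), and the explicit `env` parameter are all assumptions made for illustration. Other pre-existing allow paths are left out.

```typescript
// Hypothetical shapes for illustration; the real types in connect-policy.ts may differ.
type Env = Record<string, string | undefined>;

type MissingDeviceDecision =
  | { kind: "allow" }
  | { kind: "reject"; closeCode: number; reason: string };

// Stand-in for allowsGatewayPrivateIngressNoAuth(). Assumption: only the
// literal value "1" enables the flag.
function noAuthFlagEnabled(env: Env): boolean {
  return env["OPENCLAW_GATEWAY_PRIVATE_INGRESS_NO_AUTH"] === "1";
}

// Decide what to do with a connect request that carries no device identity.
function evaluateMissingDeviceIdentitySketch(
  isLocalClient: boolean,
  env: Env,
): MissingDeviceDecision {
  // Branch described in this PR: loopback client AND env flag set.
  if (isLocalClient && noAuthFlagEnabled(env)) {
    return { kind: "allow" };
  }
  // Everything else in this sketch falls through to the 1008 close
  // (other allow paths in the real function are omitted here).
  return { kind: "reject", closeCode: 1008, reason: "device identity required" };
}
```

Both conditions have to hold: a loopback client without the flag, or a remote client with the flag, still falls through to the rejection in this sketch.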

2. `role-policy.ts` — role bypass

`roleCanSkipDeviceIdentity` accepts an optional `isLocalClient` and treats no-auth-loopback as shared-auth-equivalent for operator role. Keeps role policy and missing-device decisions in sync.
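
A rough sketch of how the role check could fold in the new condition. The options object, the `noAuthFlagSet` field, and the plain `string` role type are assumptions; the source only confirms that `isLocalClient` is optional and that the relaxation applies to the operator role.

```typescript
// Hypothetical options bag; the real function signature may differ.
interface RoleSkipOptions {
  hasSharedAuth: boolean;  // a token or password was presented and accepted
  noAuthFlagSet: boolean;  // result of the env-flag check, passed in for testability
  isLocalClient?: boolean; // optional, so callers that omit it keep the old behaviour
}

function roleCanSkipDeviceIdentitySketch(role: string, opts: RoleSkipOptions): boolean {
  // Only the operator role is eligible to skip device identity.
  if (role !== "operator") return false;
  // No-auth loopback is treated as equivalent to shared auth.
  const noAuthLoopback = opts.isLocalClient === true && opts.noAuthFlagSet;
  return opts.hasSharedAuth || noAuthLoopback;
}
```

Because `isLocalClient` defaults to "not local" when omitted, a caller that never passes it cannot pick up the relaxation by accident.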

3. `gateway/call.ts` — client-side device omission

`shouldOmitDeviceIdentityForGatewayCall` no longer gates only on `hasSharedAuth`; under no-auth + loopback the backend gateway-client skips device identity entirely. Defense in depth so an in-process backend tool call succeeds even if `device.json` can't be read.
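
The client-side effect can be illustrated with the sketch below. `GatewayCallContext`, `targetsLoopback`, `DeviceIdentity`, and `buildConnectParamsSketch` are hypothetical names, and the assumption that shared auth alone is enough to omit the device is inferred from the description rather than confirmed by the diff.

```typescript
// Hypothetical types; the real call.ts structures are not shown in this PR text.
interface GatewayCallContext {
  hasSharedAuth: boolean;   // token or password configured for the call
  targetsLoopback: boolean; // assumption: the gateway URL points at loopback
  noAuthFlagSet: boolean;   // OPENCLAW_GATEWAY_PRIVATE_INGRESS_NO_AUTH=1
}

interface DeviceIdentity {
  deviceId: string;
}

function shouldOmitDeviceIdentitySketch(ctx: GatewayCallContext): boolean {
  // Assumed pre-PR behaviour: shared auth lets the client skip the device.
  if (ctx.hasSharedAuth) return true;
  // Addition described in this PR: no-auth + loopback also skips it.
  return ctx.noAuthFlagSet && ctx.targetsLoopback;
}

// The loader is only invoked when a device is actually needed, so an
// unreadable device.json cannot break a no-auth loopback call.
function buildConnectParamsSketch(
  ctx: GatewayCallContext,
  loadDevice: () => DeviceIdentity,
): { device: DeviceIdentity | undefined } {
  if (shouldOmitDeviceIdentitySketch(ctx)) {
    return { device: undefined };
  }
  return { device: loadDevice() };
}
```

In this sketch the omission check runs before any file access, which is what lets an in-process backend call succeed when the device file is missing or unreadable.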

Trust boundary

Unchanged from #1. Every relaxation in this PR requires:

  • `OPENCLAW_GATEWAY_PRIVATE_INGRESS_NO_AUTH=1` is set (env flag)
  • AND `isLocalClient` is true (loopback remote address, no proxy headers)

Same posture as your existing `server-runtime-config.ts:149` and `gateway-cli/run.ts:813` bypasses. No remote client gains anything.
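
For readers unfamiliar with the `isLocalClient` condition, the following is one way "loopback remote address, no proxy headers" could be checked. It is a sketch under assumptions: the list of proxy headers and the helper names are hypothetical, and the gateway's real detection logic may be stricter or differ in detail.

```typescript
// Assumption: these are the headers treated as evidence of a proxy hop.
// Node lowercases incoming header names, so lowercase keys are used here.
const PROXY_HEADERS = ["forwarded", "x-forwarded-for", "x-real-ip"];

type Headers = Record<string, string | string[] | undefined>;

function isLoopbackAddress(addr: string | undefined): boolean {
  if (!addr) return false;
  // Strip the IPv4-mapped IPv6 prefix, e.g. "::ffff:127.0.0.1".
  const a = addr.startsWith("::ffff:") ? addr.slice(7) : addr;
  if (a === "::1") return true;
  // Any address in 127.0.0.0/8 is loopback.
  return /^127\.\d{1,3}\.\d{1,3}\.\d{1,3}$/.test(a);
}

function isLocalClientSketch(remoteAddress: string | undefined, headers: Headers): boolean {
  const viaProxy = PROXY_HEADERS.some((name) => headers[name] !== undefined);
  return isLoopbackAddress(remoteAddress) && !viaProxy;
}
```

The proxy-header check matters because a reverse proxy running on the same host makes every remote client appear to connect from loopback; refusing requests that carry forwarding headers keeps those clients outside the relaxed path.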

Validation

  • `pnpm tsgo:core` clean
  • `pnpm vitest run src/gateway/role-policy.test.ts src/gateway/server/ws-connection/connect-policy.test.ts` — 30 / 30 existing tests pass
  • I held off adding new test cases — happy to add them in a follow-up if you want them inline; figured you'd have an opinion on test placement (unit in those files vs an integration suite under `gateway/server.auth.modes.suite.ts`).

Stats

3 files changed, +34 / −4

```
src/gateway/call.ts | 11 ++++++++++-
src/gateway/role-policy.ts | 18 ++++++++++++++++--
src/gateway/server/ws-connection/connect-policy.ts | 9 ++++++++-
```

Audit context

I audited the codebase for every place the no-auth flag should reach. Findings:

| # | Site | Severity | This PR |
|---|------|----------|---------|
| 1 | `message-handler.ts` handshake | CRITICAL | ✅ via connect-policy |
| 2 | `role-policy.ts` role check | HIGH | ✅ |
| 3 | `call.ts` client omission | MEDIUM | ✅ |
| 4 | ControlUI `dangerouslyDisableDeviceAuth` | | separate path; not touched |
| 5 | HTTP / MCP endpoints | MAYBE | not touched (they use shared-token auth, separate concern) |
| 6 | CLI `--gateway-auth=none` | | already in #1 |
| 7 | Dockerfile env passthrough | | works via `-e` |

So this closes the active set. Row 5 (HTTP / MCP endpoints) may need follow-up if HTTP-tool flows surface similar issues, but there's no known regression there yet.

Test plan

  • Build a fresh `browseros-ai/openclaw:browseros` image off this branch.
  • Repro the cron scenario in BrowserOS — should now succeed.
  • Confirm chat turns still work (no regression).
  • Confirm a remote (non-loopback) client without auth still gets rejected.

…role + call paths

Follow-up to #1 (`feat: allow BrowserOS private no-auth gateway images`).
That PR relaxed the startup-time bind validation in
`server-runtime-config.ts:149` and `gateway-cli/run.ts:813`. The same
trust boundary needs to extend to the live handshake and the
in-process backend client paths, otherwise BACKEND/GATEWAY_CLIENT calls
(e.g. the agent's cron-tool) hit "device identity required" / 1008 even
on loopback.

This commit closes three remaining gaps, each gated identically on
`isLocalClient && allowsGatewayPrivateIngressNoAuth()`:

1. `server/ws-connection/connect-policy.ts` — `evaluateMissingDeviceIdentity`
   now returns `{ kind: "allow" }` when the connection is loopback and the
   env flag is set. Closes the 1008 close at `message-handler.ts:713`
   that the agent's cron tool, channels, and any future backend tool
   without explicit shared-auth would hit.

2. `role-policy.ts` — `roleCanSkipDeviceIdentity` accepts an optional
   `isLocalClient` parameter and treats no-auth-loopback as
   shared-auth-equivalent for operator role. Single call site updated
   in `connect-policy.ts`. Keeps role and missing-device decisions in
   sync (otherwise the handshake permits a connection that role policy
   then blocks downstream).

3. `gateway/call.ts` — `shouldOmitDeviceIdentityForGatewayCall` no longer
   gates only on `hasSharedAuth`; under no-auth + loopback it also lets
   the client skip device identity entirely. Defense in depth so that
   in-process backend tool calls succeed even if `device.json` can't be
   read (perms, missing dir, subprocess context).

Trust boundary unchanged: every relaxation requires
`OPENCLAW_GATEWAY_PRIVATE_INGRESS_NO_AUTH=1` AND `isLocalClient` (loopback
remote address with no proxy headers). Same posture as the existing
patches in #1.

Verified: `pnpm tsgo:core` clean. Existing connect-policy and
role-policy unit tests (30 cases) all still pass.

@shivammittal274 merged commit 708a072 into browseros on May 4, 2026
65 of 76 checks passed