feat(container): richen ManagedContainer — isImageCurrent + logs + sibling-exec by DaniAkash · Pull Request #968 · browseros-ai/BrowserOS

Dani Akash (DaniAkash) · 2026-05-08T08:49:21Z

Summary

Adds four primitives to the ManagedContainer abstract base ahead of the OpenClaw runtime migration. Without these, the upcoming OpenClawContainerRuntime would need to re-implement them locally; with them, the migration consumes shared base behaviour and Hermes silently inherits the new surface (unused but free).

isImageCurrent(): Promise<boolean> — pure predicate comparing the existing container's image ref to descriptor.defaultImage. Treats SHA-pinned variants (<ref>@sha256:…) as matches. start() is unchanged — today's "always recreate from a fresh buildContainerSpec()" semantics stays in place; subclasses and service layers compose the predicate where they want a short-circuit.
getLogs(tail = 50): Promise<string[]> and tailLogs(onLine: LogFn): () => void — generic log primitives, thin pass-throughs to the underlying ContainerCli.
runOneShot(argv, opts?): Promise<ExecResult> — sibling-container helper. Spawns a <container-name>-setup container with the same image / mounts / add-hosts / base env (no ports / health / restart), runs argv, force-removes after. Includes the retry-on-name-collision behaviour previously bespoke to OpenClaw's gateway-setup path.

Net diff: +160 LOC source, +165 LOC tests.

Why

OpenClaw's pre-ManagedContainer ContainerRuntime class carries five auxiliary methods that aren't OpenClaw-specific — image-ref comparison, logs, sibling-container exec — they're container-generic. Lifting them onto the base means the upcoming Phase 4a runtime migration is a smaller diff and Hermes (plus future container adapters) get a richer surface for free. The discussion that motivated this lives in the team plan files; the short version: scope the migration onto a cleaner foundation rather than dragging the auxiliary surface onto a single subclass.

Test plan

bun run typecheck clean across the server package
biome check clean on touched sources + tests
10 new test cases in managed-container.test.ts:
- isImageCurrent: ref match, SHA-pinned match, ref differs, container missing
- getLogs: collects lines from cli.runCommand(['logs', '-n', N, name])
- tailLogs: returns the unsubscribe handle from cli.tailLogs
- runOneShot: creates the -setup sibling with no ports/health, runs argv, force-removes; cleans up even on inner throw; retries on ContainerNameInUseError
Full server test sweep — 1037 pass, 0 fail (one pre-existing ContainerCli flake also reproduces on plain origin/dev)
No Hermes-side changes needed — the new methods are unused on the Hermes runtime; existing tests stay green

… to ManagedContainer Four base-class additions ahead of the OpenClaw runtime migration so the upcoming subclass doesn't have to re-implement them: - isImageCurrent() — pure predicate comparing the existing container's image ref to descriptor.defaultImage. Treats SHA-pinned variants as matches. start() is unchanged; subclasses + service layers compose the predicate where they want short-circuit behaviour. - getLogs(tail) and tailLogs(onLine) — generic log primitives, thin pass-throughs to ContainerCli. - runOneShot(argv, opts) — sibling-container helper that spawns a <name>-setup container with the same image+mounts+env (no ports/ health/restart), runs argv, force-removes after. Includes the retry-on-name-collision behaviour previously bespoke to OpenClaw. Hermes inherits unused surface only — no behavioural change. The in-flight base-class tests cover all four primitives.

github-actions · 2026-05-08T08:51:28Z

✅ Tests passed — 1219/1223

Suite	Passed	Skipped
✅ `agent`	76/76	0
✅ `build`	9/9	0
✅ `eval`	93/93	0
✅ `server-agent`	261/261	0
✅ `server-api`	203/203	0
✅ `server-browser`	4/4	0
✅ `server-integration`	9/10	1
✅ `server-lib`	242/242	0
✅ `server-root`	60/63	3
✅ `server-skills`	31/31	0
✅ `server-tools`	231/231	0

View workflow run

greptile-apps · 2026-05-08T08:54:26Z

Greptile Summary

This PR enriches ManagedContainer with four new primitives — isImageCurrent, getLogs, tailLogs, and runOneShot — that are container-generic and will be consumed by the upcoming OpenClawContainerRuntime migration rather than being re-implemented per subclass.

isImageCurrent compares the running container's image ref to descriptor.defaultImage, treating SHA-pinned variants as matches; it is a pure predicate and never called by the base machinery.
getLogs / tailLogs are thin pass-throughs to ContainerCli.runCommand and ContainerCli.tailLogs respectively, with getLogs collecting interleaved stdout+stderr lines.
runOneShot spawns a -setup sibling container with no ports/health/restart, runs the supplied argv, and force-removes the sibling in a finally block; includes a retry-on-ContainerNameInUseError loop in createOneShotContainer and an optional process timeout via Promise.race.

Confidence Score: 4/5

Safe to merge with awareness of two minor behavioural gaps in the new surface area.

The implementation is well-structured and the retry/cleanup logic in runOneShot is robust. getLogs does not check the CLI exit code so nerdctl error messages are silently collected into the returned log lines rather than triggering a distinct error path; and runWithOptionalTimeout abandons the underlying runCommand promise when the timeout fires, leaving a brief window where opts.onLog can still be called after the caller has received the timeout error. Neither affects the live container or existing Hermes code, but callers building on these new primitives should be aware before the OpenClaw migration consumes them.

The getLogs and runWithOptionalTimeout helpers in managed-container.ts are the spots most worth a second look before this surface is consumed by OpenClawContainerRuntime.

Important Files Changed

Filename	Overview
packages/browseros-agent/apps/server/src/lib/container/managed/managed-container.ts	Adds isImageCurrent, getLogs, tailLogs, and runOneShot to ManagedContainer; getLogs doesn't check the CLI exit code so error text can leak into the returned lines array, and runWithOptionalTimeout leaves the underlying runCommand promise floating after a timeout fires.
packages/browseros-agent/apps/server/tests/lib/container/managed/managed-container.test.ts	Adds 10 new test cases covering isImageCurrent, getLogs, tailLogs, and runOneShot (including retry and cleanup-on-throw paths); all tests mock cli methods directly, so the getLogs exit-code gap and the timeout callback-leak scenario are not exercised.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant ManagedContainer
    participant ContainerCli

    Note over Caller,ContainerCli: isImageCurrent
    Caller->>ManagedContainer: isImageCurrent()
    ManagedContainer->>ContainerCli: containerImageRef(containerName)
    ContainerCli-->>ManagedContainer: "actual ref | null"
    ManagedContainer-->>Caller: boolean (exact match or SHA-pinned variant)

    Note over Caller,ContainerCli: getLogs
    Caller->>ManagedContainer: getLogs(tail?)
    ManagedContainer->>ContainerCli: runCommand(['logs','-n',N,name], onLog)
    ContainerCli-->>ManagedContainer: lines via onLog callbacks
    ManagedContainer-->>Caller: string[]

    Note over Caller,ContainerCli: runOneShot
    Caller->>ManagedContainer: runOneShot(argv, opts?)
    ManagedContainer->>ManagedContainer: withLifecycleLock('run-one-shot')
    ManagedContainer->>ManagedContainer: buildContainerSpec()
    ManagedContainer->>ContainerCli: removeContainer(setupName, force)
    ManagedContainer->>ContainerCli: waitForContainerNameRelease(setupName)
    ManagedContainer->>ContainerCli: createContainer(setupSpec)
    alt ContainerNameInUseError (up to 3 retries)
        ManagedContainer->>ContainerCli: removeContainer + waitForRelease + createContainer
    end
    ManagedContainer->>ContainerCli: runCommand(['start','-a',setupName])
    alt processTimeoutMs set
        Note over ManagedContainer: Promise.race vs timeout
    end
    ContainerCli-->>ManagedContainer: ExecResult
    ManagedContainer->>ContainerCli: removeContainer(setupName, force) [finally]
    ManagedContainer-->>Caller: ExecResult

Prompt To Fix All With AI

Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
packages/browseros-agent/apps/server/src/lib/container/managed/managed-container.ts:293-300
**`getLogs` silently surfaces error text as log lines**

`runCommand` passes both stdout and stderr through the `onLog` callback, and `getLogs` collects everything indiscriminately without checking the returned `exitCode`. When the container doesn't exist, `nerdctl logs` exits with code 1 and writes a message like `Error: no such container: …` to stderr — which ends up in the returned `lines[]`. Callers get back an array that looks like valid log output but is actually an nerdctl error string, with no way to distinguish the two cases. Consider checking `result.exitCode` and returning `[]` (or throwing) on failure.

### Issue 2 of 2
packages/browseros-agent/apps/server/src/lib/container/managed/managed-container.ts:395-419
**Abandoned `runCommand` promise may deliver stale `onLog` callbacks after timeout**

When the timeout fires, `Promise.race` rejects and `runWithOptionalTimeout` returns, but the `cli.runCommand(['start', '-a', setupName], opts.onLog)` promise is abandoned — it continues executing in the background. The `opts.onLog` callback will still be called by that floating promise until `removeContainer(force: true)` in the `finally` block kills the container (which causes `nerdctl start -a` to exit). During that brief window, the caller's `onLog` handler can fire after `runOneShot` has already thrown the timeout error. If the handler closes over any state that the caller tears down on the timeout error (e.g. a streaming buffer or a WebSocket), those stale calls could be problematic. Capturing the `runCommand` promise and awaiting it in the `finally` path would close the gap.

_{Reviews (1): Last reviewed commit: "feat(container): add isImageCurrent + ge..." | Re-trigger Greptile}

…-onLog leak; trim docstrings - getLogs now distinguishes a missing container (returns []) from other CLI failures (throws). Previously nerdctl's stderr ("Error: no such container: …") leaked into the lines array as if it were log output. isNoSuchContainer is exported from container-cli to share the predicate. - runWithOptionalTimeout wraps the caller's onLog so post-timeout lines from the abandoned runCommand promise become no-ops; before this, callers could see onLog fire after runOneShot had already rejected, hitting state the caller may have torn down on the timeout error. - Tightens the new docstrings to one short line per the project convention; drops a restating comment in the test file.

github-actions Bot added the Feature label May 8, 2026

greptile-apps Bot reviewed May 8, 2026

View reviewed changes

Comment thread packages/browseros-agent/apps/server/src/lib/container/managed/managed-container.ts

Comment thread packages/browseros-agent/apps/server/src/lib/container/managed/managed-container.ts

Dani Akash (DaniAkash) merged commit 4e40568 into dev May 8, 2026
20 checks passed

Dani Akash (DaniAkash) deleted the feat/managed-container-richen branch May 8, 2026 10:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(container): richen ManagedContainer — isImageCurrent + logs + sibling-exec#968

feat(container): richen ManagedContainer — isImageCurrent + logs + sibling-exec#968
Dani Akash (DaniAkash) merged 2 commits into
devfrom
feat/managed-container-richen

Dani Akash (DaniAkash) commented May 8, 2026

Uh oh!

github-actions Bot commented May 8, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented May 8, 2026

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Dani Akash (DaniAkash) commented May 8, 2026

Summary

Why

Test plan

Uh oh!

github-actions Bot commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Tests passed — 1219/1223

Uh oh!

greptile-apps Bot commented May 8, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

github-actions Bot commented May 8, 2026 •

edited

Loading