fix(onboard): recreate sandbox on model/provider drift for same-name … #1521
ChunkyMonkey11 wants to merge 5 commits into NVIDIA:main
Conversation
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID:
📒 Files selected for processing (3)
✅ Files skipped from review due to trivial changes (2)
📝 Walkthrough
Downloads and robustly parses a sandbox's persisted config.json.
Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant Onboard as Onboarding Process
    participant OpenShell as OpenShell CLI
    participant FS as Temp FS
    participant Drift as Drift Detector
    participant Confirm as Confirmation Dialog
    participant Sandbox as Sandbox Manager
    User->>Onboard: start onboarding (sandboxName, provider, model)
    Onboard->>OpenShell: run "sandbox download" for /.nemoclaw/config.json
    OpenShell->>FS: extract archive into temp dir
    Onboard->>FS: locate config.json (recursive) & robust-parse JSON
    Onboard->>Drift: compare stored provider/model -> requested provider/model
    alt drift detected
        Drift->>Confirm: require recreate confirmation
        alt interactive
            Confirm->>User: prompt "Recreate sandbox to apply provider/model change?"
            User-->>Confirm: accept / decline
            alt accept
                Confirm->>Sandbox: delete sandbox
                Sandbox->>Sandbox: create sandbox with requested selection
            else decline
                Confirm->>Onboard: abort onboarding (exit 1)
            end
        else non-interactive
            Drift->>Sandbox: auto-recreate sandbox (log reason)
        end
    else no drift
        Onboard->>Sandbox: reuse existing sandbox (upsert messaging providers if needed)
    end
    Sandbox->>FS: cleanup temp dir
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
@coderabbitai please re-review. The branch head changed after I merged.

✅ Actions performed: Full review triggered.
Actionable comments posted: 1
🧹 Nitpick comments (1)
test/onboard.test.js (1)
1977-1999: These drift tests grep source text instead of exercising the behavior. They can miss runtime regressions and also fail on harmless helper/log renames. Please reuse the existing spawn-based harness patterns in this file to assert actual createSandbox() outcomes for provider-only drift, model drift, and unreadable config.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/onboard.test.js` around lines 1977-1999, replace the static source-text grep assertions with real integration tests that spawn the onboarding binary and assert createSandbox() behavior: use the existing spawn-based harness pattern already present in this test suite to run scenarios that trigger provider-only drift, model drift, and unreadable config; invoke the code paths that call getSelectionDrift and confirmRecreateForSelectionDrift, then assert the process output/exit codes and that createSandbox was actually recreated or left unchanged as expected (including interactive prompt behavior for confirmRecreateForSelectionDrift), rather than matching log helper text.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bin/lib/onboard.js`:
- Around line 924-967: readSandboxSelectionConfig can return null on
download/parse failures, but getSelectionDrift currently treats null as "no
drift"; change getSelectionDrift so that when
readSandboxSelectionConfig(sandboxName) returns null (unknown state) it does not
silently report changed: false — instead set changed: true to force
reconfiguration (or, if you prefer to surface unknown explicitly, set changed:
true and set providerChanged/modelChanged to null to indicate unknown rather
than false). Update the getSelectionDrift function to detect existing === null,
return changed: true, keep existingProvider/existingModel as null, and set
providerChanged/modelChanged to null (or true if you choose force-recreate
semantics) so callers can distinguish an actual mismatch from an
unreadable/unknown config.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 57e767ad-4022-4702-a0e2-9a7d62fb935c
📒 Files selected for processing (2)
bin/lib/onboard.js, test/onboard.test.js
♻️ Duplicate comments (1)
bin/lib/onboard.js (1)
950-963: ⚠️ Potential issue | 🟠 Major: Unreadable selection config is still treated as "no drift."
At line 951, readSandboxSelectionConfig() can return null; at line 962 this currently resolves to changed: false, which can silently reuse a sandbox when provider/model state is actually unknown.

🔧 Proposed fix
```diff
 function getSelectionDrift(sandboxName, requestedProvider, requestedModel) {
   const existing = readSandboxSelectionConfig(sandboxName);
+  if (existing === null) {
+    return {
+      changed: true,
+      providerChanged: false,
+      modelChanged: false,
+      existingProvider: null,
+      existingModel: null,
+      unknown: true,
+    };
+  }
   const existingProvider = existing && typeof existing.provider === "string" ? existing.provider : null;
   const existingModel = existing && typeof existing.model === "string" ? existing.model : null;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bin/lib/onboard.js` around lines 950 - 963, getSelectionDrift currently treats a missing or unreadable selection config as "no drift"; change it so that when readSandboxSelectionConfig(sandboxName) returns null/invalid, the function treats the state as drifted. Specifically, in getSelectionDrift (and using the existing variables existing, existingProvider, existingModel, providerChanged, modelChanged), detect if existing is falsy or has non-string provider/model and set the returned changed flag to true (and/or set providerChanged/modelChanged appropriately) so we don't silently reuse a sandbox when the selection config is unknown.
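Under the reviewer's proposed semantics, a minimal standalone version of the drift check could look like the following. It takes the already-parsed config object instead of a sandbox name (so it can run without the real readSandboxSelectionConfig), making it a sketch rather than the file's actual function.

```javascript
// Sketch of the reviewer's proposed semantics: a null (unreadable) config
// is reported as drift with unknown: true, never as "no drift".
function getSelectionDrift(existing, requestedProvider, requestedModel) {
  if (existing === null) {
    return {
      changed: true,
      providerChanged: false,
      modelChanged: false,
      existingProvider: null,
      existingModel: null,
      unknown: true,
    };
  }
  const existingProvider = typeof existing.provider === "string" ? existing.provider : null;
  const existingModel = typeof existing.model === "string" ? existing.model : null;
  const providerChanged = existingProvider !== requestedProvider;
  const modelChanged = existingModel !== requestedModel;
  return {
    changed: providerChanged || modelChanged,
    providerChanged,
    modelChanged,
    existingProvider,
    existingModel,
    unknown: false,
  };
}
```

This lets callers distinguish a real provider/model mismatch (`changed: true, unknown: false`) from an unreadable config (`changed: true, unknown: true`).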
🧹 Nitpick comments (1)
test/onboard.test.js (1)
1977-1999: Prefer behavior-driven assertions over source-text regex for drift logic. These checks can pass even if runtime drift handling regresses but strings remain. Consider adding one execution-path test that stubs sandbox download + prompt result and asserts reuse vs recreate/abort outcomes directly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/onboard.test.js` around lines 1977 - 1999, Replace the brittle source-text regex tests with an execution-path test that stubs the runtime behavior: import the onboard flow (the module that calls getSelectionDrift), stub sandbox.download (or the sandbox object used by onboard.js) to simulate an existing sandbox, stub getSelectionDrift to return changed=true and changed=false cases, and stub the interactive prompt used by confirmRecreateForSelectionDrift to return yes/no; then assert the actual outcome (that the code calls the recreate path vs the reuse/abort path) by spying on the functions that perform recreate (e.g., sandbox.recreate or the recreate handler) and reuse (e.g., sandbox.connect) — exercise both changed=true with prompt=confirm (expect recreate) and changed=true with prompt=abort (expect abort/no recreate) and changed=false (expect reuse), so tests validate runtime behavior rather than matching strings.
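One minimal shape for such an execution-path test, with hypothetical names (`handleDrift`, `recreate`, `reuse`, `abort`) standing in for the real onboarding code paths — the point is to spy on outcomes instead of grepping source text:

```javascript
// Hypothetical drift handler mirroring the PR's described flow; the real
// logic lives in bin/lib/onboard.js and is not reproduced here.
function handleDrift(drift, { interactive, confirm, recreate, reuse, abort }) {
  if (!drift.changed) return reuse();
  if (!interactive) return recreate("non-interactive drift");
  return confirm() ? recreate("user confirmed") : abort();
}

// Behavior-driven checks: stub the prompt, spy on the outcome.
const calls = [];
handleDrift({ changed: false }, {
  interactive: true,
  reuse: () => calls.push("reuse"),
});
handleDrift({ changed: true }, {
  interactive: true,
  confirm: () => true,
  recreate: () => calls.push("recreate"),
});
handleDrift({ changed: true }, {
  interactive: true,
  confirm: () => false,
  abort: () => calls.push("abort"),
});
```

These assertions survive log-message and helper renames, because they exercise the branch taken rather than the strings emitted.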
All good to go past the test cases.
✨ Thanks for submitting this fix, which proposes a way to detect provider/model drift during re-onboarding and prevent silent reuse of stale sandboxes. Possibly related open issues:
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@test/onboard.test.js`:
- Around line 2039-2134: The test currently only asserts a recorded "'sandbox'
'delete' 'my-assistant'" command but doesn't verify the recreate step because
childProcess.spawn isn't recorded; update the in-test fixture (the script string
that requires createSandbox) to record spawn invocations (override
childProcess.spawn to push the command into the same commands array as
runner.run does) and then assert payload.commands contains a spawn/create entry
(e.g., command string includes "'sandbox' 'create' 'my-assistant'") in addition
to the existing delete assertion so the test ensures both delete and recreate
occur; reference the createSandbox invocation, the childProcess.spawn override,
and payload.commands when making the changes.
📒 Files selected for processing (3)
bin/lib/onboard.js, docs/reference/commands.md, test/onboard.test.js
✅ Files skipped from review due to trivial changes (1)
- docs/reference/commands.md
🚧 Files skipped from review as they are similar to previous changes (1)
- bin/lib/onboard.js
|
All required checks pass; docstring coverage warning noted, can be handled in a follow-up cleanup PR if needed. |
|
@coderabbitai resume |
✅ Actions performedReviews resumed. |
♻️ Duplicate comments (1)
test/onboard.test.js (1)
2209-2218: ⚠️ Potential issue | 🟡 Minor: Test may pass spuriously when NEMOCLAW_NON_INTERACTIVE is set in CI.
The spawn env spreads process.env without explicitly clearing NEMOCLAW_NON_INTERACTIVE. If the test runner sets this variable (common in CI), the spawned script inherits it, skips the prompt entirely, and the test passes without exercising the interactive confirmation path.

🛡️ Proposed fix: explicitly clear the env variable
```diff
 const result = spawnSync(process.execPath, [scriptPath], {
   cwd: repoRoot,
   encoding: "utf-8",
   env: {
     ...process.env,
     HOME: tmpDir,
     PATH: `${fakeBin}:${process.env.PATH || ""}`,
+    NEMOCLAW_NON_INTERACTIVE: "",
   },
   timeout: 12000,
 });
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/onboard.test.js` around lines 2209 - 2218, The spawned child inherits process.env which can contain NEMOCLAW_NON_INTERACTIVE from CI, causing the interactive branch to be skipped; modify the test to explicitly clear that variable when building the env for spawnSync (change the env passed to spawnSync for the invocation that uses spawnSync/process.execPath/scriptPath/tmpDir/fakeBin to set NEMOCLAW_NON_INTERACTIVE to undefined or remove it from the copied env before adding HOME and PATH) so the child always runs in interactive mode for this test.
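A small helper capturing this fix: copy process.env, then drop the flags that would skip the prompt. The NEMOCLAW_* variable names come from the PR; the helper itself is an illustration, not code from the repository.

```javascript
// Build a child env for interactive tests: inherit the parent env but
// explicitly drop flags that would short-circuit the confirmation prompt.
// NEMOCLAW_* names are taken from the PR; the helper is illustrative.
function interactiveTestEnv(extra = {}) {
  const env = { ...process.env, ...extra };
  delete env.NEMOCLAW_NON_INTERACTIVE;
  delete env.NEMOCLAW_RECREATE_SANDBOX;
  return env;
}
```

A test would then pass `env: interactiveTestEnv({ HOME: tmpDir, PATH: ... })` to spawnSync, so the child always sees an interactive environment regardless of what CI exports.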
Branch updated: 30b365f to 4f5eecf.
Actionable comments posted: 1
♻️ Duplicate comments (1)
test/onboard.test.js (1)
2784-2793: ⚠️ Potential issue | 🟡 Minor: Clear the recreate/non-interactive env flags for this interactive subprocess.
This case inherits process.env verbatim, unlike the nearby interactive tests. If the runner already has NEMOCLAW_NON_INTERACTIVE or NEMOCLAW_RECREATE_SANDBOX set, the confirmation path is skipped and this test no longer covers the decline flow.

🔧 Suggested fix
```diff
-const result = spawnSync(process.execPath, [scriptPath], {
-  cwd: repoRoot,
-  encoding: "utf-8",
-  env: {
-    ...process.env,
-    HOME: tmpDir,
-    PATH: `${fakeBin}:${process.env.PATH || ""}`,
-  },
-  timeout: 12000,
-});
+const env = {
+  ...process.env,
+  HOME: tmpDir,
+  PATH: `${fakeBin}:${process.env.PATH || ""}`,
+};
+delete env.NEMOCLAW_NON_INTERACTIVE;
+delete env.NEMOCLAW_RECREATE_SANDBOX;
+const result = spawnSync(process.execPath, [scriptPath], {
+  cwd: repoRoot,
+  encoding: "utf-8",
+  env,
+  timeout: 12000,
+});
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/onboard.test.js` around lines 2784 - 2793, The spawned subprocess in the test (the spawnSync call that runs scriptPath with cwd repoRoot and env built from ...process.env, HOME: tmpDir and PATH: `${fakeBin}:${process.env.PATH || ""}`) unintentionally inherits NEMOCLAW_NON_INTERACTIVE and NEMOCLAW_RECREATE_SANDBOX from the parent; explicitly clear those flags in the env you pass to spawnSync (e.g., set NEMOCLAW_NON_INTERACTIVE and NEMOCLAW_RECREATE_SANDBOX to undefined or an empty string in the env object) so the interactive confirmation path is exercised and the decline flow is tested.
```js
it("detects provider/model drift and avoids silent reuse", () => {
  const source = fs.readFileSync(
    path.join(import.meta.dirname, "..", "bin", "lib", "onboard.js"),
    "utf-8",
  );
  assert.match(source, /const selectionDrift = getSelectionDrift\(sandboxName, provider, model\);/);
  assert.match(source, /unknown:\s*true/);
  assert.match(source, /else if \(selectionDrift\.changed\)/);
  assert.match(source, /Recreating sandbox due to provider\/model drift/);
  assert.match(
    source,
    /Sandbox '\$\{sandboxName\}' exists — recreating to apply model\/provider change\./,
  );
  assert.ok(
    payload.commands.some((entry) => entry.command.includes("'sandbox' 'create'")),
    "expected sandbox recreate when selection config cannot be read",
  );
});
```
payload is undefined here, so this test will fail before it verifies drift handling.
Lines 2713-2716 dereference payload.commands, but this test never creates a payload. That turns the new coverage into a hard ReferenceError and leaves the recreate assertion disconnected from the end-to-end case above.
💡 Suggested fix

```diff
 it("detects provider/model drift and avoids silent reuse", () => {
   const source = fs.readFileSync(
     path.join(import.meta.dirname, "..", "bin", "lib", "onboard.js"),
     "utf-8",
   );
   assert.match(source, /const selectionDrift = getSelectionDrift\(sandboxName, provider, model\);/);
   assert.match(source, /unknown:\s*true/);
   assert.match(source, /else if \(selectionDrift\.changed\)/);
   assert.match(source, /Recreating sandbox due to provider\/model drift/);
   assert.match(
     source,
     /Sandbox '\$\{sandboxName\}' exists — recreating to apply model\/provider change\./,
   );
-  assert.ok(
-    payload.commands.some((entry) => entry.command.includes("'sandbox' 'create'")),
-    "expected sandbox recreate when selection config cannot be read",
-  );
 });
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/onboard.test.js` around lines 2700 - 2717, The test "detects
provider/model drift and avoids silent reuse" references payload.commands but
never defines payload, causing a ReferenceError; fix by creating or reusing a
payload object with a commands array before the assertion (e.g., initialize
payload = { commands: [] } or obtain the same payload used in the prior
end-to-end case) so the assertion on payload.commands.some(...) can run; update
the test setup near the assertions that check for "'sandbox' 'create'" so it
uses the defined payload variable used elsewhere in the test suite (or mock the
command payload) before the final assert.ok call.
cv left a comment
Security Review — WARNING (3 issues)
The drift-detection approach is sound and the interactive confirmation is good. But three issues need fixing.
Required changes
1. Unknown config should not auto-destroy (HIGH):
When the config file can't be read (drift.unknown === true), non-interactive mode unconditionally returns true (recreate). A transient download failure or unreachable sandbox would destroy a perfectly good sandbox. Default to reuse when config is unreadable in non-interactive mode, or require NEMOCLAW_RECREATE_SANDBOX=1 explicitly.
2. Not-ready sandbox guard dropped (HIGH):
The old code had explicit handling for existingSandboxState !== "ready" — it prompted in interactive mode or exited in non-interactive mode. The new code falls straight through to delete-and-recreate with no prompt. This is a behavioral regression. Restore the not-ready guard.
3. Broken test — payload undefined (MEDIUM):
The "detects provider/model drift" test references payload.commands.some(...) but payload is never defined in that scope. Likely copy-paste from the "unreadable config" test. Will throw ReferenceError.
What's good
- Interactive confirmation guard is correct
- Config file only stores provider/model (no secrets)
- Temp directory handling with `finally` cleanup is clean
- `stdio: ["ignore", ...]` suppresses download output
Summary
Fixes same-name re-onboard behavior where OpenClaw UI could keep showing a stale model/provider.
On onboarding with an existing ready sandbox, we now detect provider/model drift and avoid silent reuse by triggering recreate flow (with an interactive safety confirmation).
Related Issue
Fixes #1457
Changes
- openshell sandbox download (/sandbox/.nemoclaw/config.json).
- test/onboard.test.js covering:
Type of Change
Testing
- npx prek run --all-files passes (or equivalently make check).
- npm test passes.
- make docs builds without warnings. (for doc-only changes)
Checklist
General
Code Changes
npx prek run --all-files auto-fixes formatting (or make format for targeted runs).
update-docs agent skill to draft changes while complying with the style guide. For example, prompt your agent with "/update-docs catch up the docs for the new changes I made in this PR."
Signed-off-by: Revant Patel revant.h.patel@gmail.com
Summary by CodeRabbit
New Features
Tests
Documentation