fix(cli): show inference health in sandbox status output by ericksoa · Pull Request #2002 · NVIDIA/NemoClaw

ericksoa · 2026-04-17T04:10:49Z

Summary

- Adds remote provider health probing to `nemoclaw <name> status` so all providers (not just local) show an Inference line
- Local probing (vllm-local, ollama-local) already worked; this fills the gap for remote providers (nvidia-prod, openai-api, anthropic-prod, gemini-api)
- Creates a unified `probeProviderHealth()` dispatcher in a new `inference-health.ts` module that handles both local and remote providers
- Remote probes use lightweight reachability checks (any HTTP response, including 401/403, counts as reachable; no API keys are sent)
- compatible-* providers show "not probed" since their endpoint URLs aren't known
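
The reachability rule described in the summary above can be illustrated with a small TypeScript sketch. The `ProbeOutcome` shape and the `isReachable` name are hypothetical and chosen only for illustration; the PR's actual `CurlProbeResult` type may carry different fields.

```typescript
// Hypothetical result shape for illustration only; not the PR's CurlProbeResult.
interface ProbeOutcome {
  // HTTP status code, or null when the request failed before any response arrived.
  httpStatus: number | null;
}

// Any HTTP response counts as reachable, including 401/403: the server answered,
// it just rejected the unauthenticated request. A status of 0 is treated as
// "no response", on the assumption that the prober reports 0 in that case.
function isReachable(outcome: ProbeOutcome): boolean {
  return outcome.httpStatus !== null && outcome.httpStatus > 0;
}

const samples: ProbeOutcome[] = [
  { httpStatus: 200 },
  { httpStatus: 401 },
  { httpStatus: 403 },
  { httpStatus: null },
];
for (const sample of samples) {
  console.log(sample.httpStatus, isReachable(sample));
}
```

Treating an auth failure as "reachable" is what lets the probe run without sending API keys: it answers "is the endpoint up?" rather than "are my credentials valid?".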

Fixes #995

Test plan

- 23 new unit tests in `inference-health.test.ts` covering endpoint mapping, reachability semantics, timeouts, and unified dispatch
- All 1832 existing tests continue to pass
- Manual: `nemoclaw <sandbox> status` with a remote provider shows the new Inference line
- Manual: `nemoclaw <sandbox> status` output with a local provider is unchanged

Summary by CodeRabbit

New Features
- Added unified health checking for inference providers, supporting both local and remote provider monitoring.
- Enhanced status reporting with more granular health states and detailed diagnostics.
Tests
- Added comprehensive test coverage for inference provider health probing and endpoint configuration.

sandboxStatus() already probed local providers (vllm-local, ollama-local) but showed no Inference line for remote providers. Add a unified probeProviderHealth() dispatcher that performs lightweight reachability checks for remote cloud endpoints (nvidia-prod, openai-api, anthropic-prod, gemini-api) and a "not probed" fallback for compatible-* providers whose URLs are unknown.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-04-17T04:11:02Z

Walkthrough

Introduces a unified inference-provider health-probing layer with functions to map providers to endpoints, probe remote providers via curl with timeout control, and delegate between local and remote probing strategies. Includes comprehensive test coverage and integrates the new health-probing API into nemoclaw status reporting.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Health Probing Infrastructure: `src/lib/inference-health.ts`, `src/lib/inference-health.test.ts` | New unified health-probing module exporting `probeProviderHealth`, `probeRemoteProviderHealth`, and `getRemoteProviderHealthEndpoint`. Implements provider-to-endpoint mapping, curl-based reachability checking with 3s connect and 5s max timeouts, and delegation logic. Treats HTTP 401/403 as reachable. Special-cases compatible endpoints as "not probed". Comprehensive test suite validates provider mapping, curl integration, probe outcomes for reachable/unreachable cases, and timeout/error handling. |
| Integration & Usage: `src/nemoclaw.ts` | Updated `sandboxStatus` to replace local-only health probing with a unified `probeProviderHealth` call. Enhanced `Inference:` status reporting to distinguish three states: "not probed" (when `probed: false`), "healthy" (when `ok: true`), and "unreachable" (when `ok: false`), with detail output on probe failure. |
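
As a rough illustration of the mapping and timeout behavior summarized above, the sketch below maps a provider name to a probe URL and builds curl arguments with a 3s connect timeout and a 5s overall limit. The URLs are placeholders on the reserved `.invalid` domain, and `exampleEndpointFor` and `buildCurlArgs` are hypothetical names; the module's real `getRemoteProviderHealthEndpoint` and its exact curl flags may differ.

```typescript
// Placeholder mapping for illustration: the real endpoint URLs are not stated
// in this PR page, so reserved ".invalid" hostnames are used instead.
function exampleEndpointFor(provider: string): string | null {
  switch (provider) {
    case "nvidia-prod":
      return "https://nvidia-prod.example.invalid/v1/models";
    case "openai-api":
      return "https://openai-api.example.invalid/v1/models";
    case "anthropic-prod":
      return "https://anthropic-prod.example.invalid/v1/models";
    case "gemini-api":
      return "https://gemini-api.example.invalid/v1/models";
    default:
      // compatible-* and unrecognized providers have no known URL.
      return null;
  }
}

// Builds curl arguments for a quiet reachability check that prints only the
// HTTP status code. No Authorization header is added, so no API key is sent.
function buildCurlArgs(endpoint: string): string[] {
  return [
    "-s",                     // silent: no progress meter
    "-o", "/dev/null",        // discard the response body
    "-w", "%{http_code}",     // print just the status code
    "--connect-timeout", "3", // give up connecting after 3 seconds
    "--max-time", "5",        // cap the whole request at 5 seconds
    endpoint,
  ];
}

const endpoint = exampleEndpointFor("openai-api");
if (endpoint !== null) {
  console.log(["curl"].concat(buildCurlArgs(endpoint)).join(" "));
}
```

Keeping the argument builder separate from the process invocation makes the timeout values easy to assert in unit tests without touching the network.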

Sequence Diagram

sequenceDiagram
    participant Caller
    participant probeProviderHealth
    participant probeLocalProviderHealth
    participant probeRemoteProviderHealth
    participant getRemoteProviderHealthEndpoint
    participant curl as runCurlProbeImpl<br/>(curl probe)

    Caller->>probeProviderHealth: provider, options
    
    alt Local Provider
        probeProviderHealth->>probeLocalProviderHealth: Attempt local probe
        probeLocalProviderHealth-->>probeProviderHealth: ProviderHealthStatus | null
    else Remote Provider
        probeProviderHealth->>probeRemoteProviderHealth: Delegate to remote
        probeRemoteProviderHealth->>getRemoteProviderHealthEndpoint: Map provider to endpoint
        getRemoteProviderHealthEndpoint-->>probeRemoteProviderHealth: endpoint URL | null
        
        alt Compatible Endpoint
            probeRemoteProviderHealth-->>probeProviderHealth: {probed: false, ok: true}
        else Remote Endpoint Found
            probeRemoteProviderHealth->>curl: curl with timeouts + endpoint
            curl-->>probeRemoteProviderHealth: CurlProbeResult
            probeRemoteProviderHealth-->>probeProviderHealth: {probed: true, ok: boolean}
        end
    else Unknown Provider
        probeProviderHealth-->>Caller: null
    end
    
    probeProviderHealth-->>Caller: ProviderHealthStatus | null
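
The delegation in the diagram above can be sketched in TypeScript roughly as follows. This is a simplified, synchronous sketch under stated assumptions, not the module's code: the `HealthStatus` shape, the `probeAny` and `probeRemote` names, and the injected `lookup` and `reach` callbacks are all hypothetical, and the sketch assumes the local probe returns `null` for providers it does not handle.

```typescript
// Hypothetical status shape, loosely modeled on the fields named in this review.
interface HealthStatus {
  ok: boolean;
  probed: boolean;
  endpoint: string;
  detail: string;
}

type LocalProbe = (provider: string) => HealthStatus | null;
type EndpointLookup = (provider: string) => string | null;
type Reach = (endpoint: string) => boolean;

// Remote branch: compatible-* providers are reported as "not probed";
// providers with no known endpoint yield null; everything else is probed.
function probeRemote(provider: string, lookup: EndpointLookup, reach: Reach): HealthStatus | null {
  if (provider.indexOf("compatible-") === 0) {
    return { ok: true, probed: false, endpoint: "", detail: "endpoint URL is not known" };
  }
  const endpoint = lookup(provider);
  if (endpoint === null) {
    return null;
  }
  const ok = reach(endpoint);
  return { ok, probed: true, endpoint, detail: ok ? "reachable" : "no HTTP response" };
}

// Dispatcher: try the local probe first, then fall back to the remote branch.
function probeAny(
  provider: string,
  probeLocal: LocalProbe,
  lookup: EndpointLookup,
  reach: Reach,
): HealthStatus | null {
  const local = probeLocal(provider);
  return local !== null ? local : probeRemote(provider, lookup, reach);
}

// Usage with stubbed dependencies (no network access involved).
const noLocal: LocalProbe = () => null;
const lookup: EndpointLookup = (p) => (p === "openai-api" ? "https://openai-api.example.invalid/" : null);
console.log(probeAny("openai-api", noLocal, lookup, () => true));
console.log(probeAny("compatible-endpoint", noLocal, lookup, () => true));
console.log(probeAny("something-else", noLocal, lookup, () => true));
```

Injecting the lookup and reachability functions keeps the dispatch logic testable in isolation, which is in the spirit of the `runCurlProbeImpl` participant shown in the diagram.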

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 A probe hops through the provider maze,
Checking endpoints in curious ways,
Local or remote, it finds the right path,
Curl whispers secrets, health in its grasp,
Now nemoclaw knows when all's okay! 🌟

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely describes the main change: adding inference health visibility to the sandbox status command output. |
| Linked Issues check | ✅ Passed | The PR addresses issue `#995` by implementing unified provider health probing for both local and remote providers, improving visibility of inference backend health through the status command. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to implementing inference health probing: new test suite, health probing module, and integration into status command. No unrelated modifications detected. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%. |


coderabbitai

🧹 Nitpick comments (2)

src/nemoclaw.ts (1)

1214-1226: Extract inference rendering to keep sandboxStatus complexity in check.

Line 1200’s function is already complexity-suppressed, and this new branch block adds more decision paths. Consider moving this rendering logic to a small helper.

♻️ Proposed refactor

+function printInferenceHealthStatus(inferenceHealth) {
+  if (!inferenceHealth) return;
+  if (!inferenceHealth.probed) {
+    console.log(`    Inference: ${D}not probed${R} (${inferenceHealth.detail})`);
+    return;
+  }
+  if (inferenceHealth.ok) {
+    console.log(`    Inference: ${G}healthy${R} (${inferenceHealth.endpoint})`);
+    return;
+  }
+  console.log(`    Inference: ${_RD}unreachable${R} (${inferenceHealth.endpoint})`);
+  console.log(`      ${inferenceHealth.detail}`);
+}
...
-    if (inferenceHealth) {
-      if (!inferenceHealth.probed) {
-        console.log(`    Inference: ${D}not probed${R} (${inferenceHealth.detail})`);
-      } else if (inferenceHealth.ok) {
-        console.log(
-          `    Inference: ${G}healthy${R} (${inferenceHealth.endpoint})`,
-        );
-      } else {
-        console.log(
-          `    Inference: ${_RD}unreachable${R} (${inferenceHealth.endpoint})`,
-        );
-        console.log(`      ${inferenceHealth.detail}`);
-      }
-    }
+    printInferenceHealthStatus(inferenceHealth);

As per coding guidelines, **/*.{js,ts,tsx,jsx}: Limit cyclomatic complexity to 20 in JavaScript/TypeScript files, with target of 15.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/nemoclaw.ts` around lines 1214 - 1226, The inference rendering block
inside the sandboxStatus function is increasing cyclomatic complexity; extract
it into a small helper named something like
renderInferenceHealth(inferenceHealth) that takes the existing inferenceHealth
object and the color constants (D, R, G, _RD) and returns or prints the exact
same lines (handle !probed, ok, and unreachable cases including detail and
endpoint) and replace the inline branch in sandboxStatus with a single call to
that helper to preserve behavior and reduce complexity.

src/lib/inference-health.ts (1)

92-95: Prefer not probed over null for recognized-but-unmapped providers.

If a provider is recognized by config but missing endpoint mapping, returning null drops the Inference line entirely. Returning a probed: false status is safer and keeps output stable as providers evolve.

♻️ Proposed refactor

   const endpoint = getRemoteProviderHealthEndpoint(provider);
   if (!endpoint) {
-    return null;
+    if (config) {
+      return {
+        ok: true,
+        probed: false,
+        providerLabel,
+        endpoint: "",
+        detail: "Health probe endpoint is not defined for this provider.",
+      };
+    }
+    return null;
   }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@src/lib/inference-health.ts` around lines 92 - 95, The current code in
inference-health.ts calls getRemoteProviderHealthEndpoint(provider) and returns
null when endpoint is missing, which removes the provider from output; change
the behavior so that when endpoint is falsy you return an object indicating the
provider is recognized but not probed (e.g., { provider, probed: false, status:
'not probed' } or matching the existing Inference/Health shape) instead of null.
Update the branch that checks `if (!endpoint)` (the code referencing endpoint
from getRemoteProviderHealthEndpoint) to construct and return the non-probed
status object so downstream consumers still see the provider entry.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: b11baa0d-7e08-4b7b-a922-78b0e2db8c65

📥 Commits

Reviewing files that changed from the base of the PR and between 56ee83f and efd5a8f.

📒 Files selected for processing (3)

src/lib/inference-health.test.ts
src/lib/inference-health.ts
src/nemoclaw.ts

coderabbitai bot reviewed Apr 17, 2026


Merge branch 'main' into fix/995-status-inference-health

e723a1a

ericksoa wants to merge 2 commits into main from fix/995-status-inference-health
