
fix(onboard): NVIDIA Endpoints model selection has 3 broken probe behaviors causing misleading errors and a hung wizard #1601

@latenighthackathon

Description


Summary

Three distinct issues in the NVIDIA Endpoints (nvidia-prod) onboarding probe path make custom model selection from the public NVIDIA Build catalog look broken even when the model would otherwise work.

The pre-listed Nemotron models (the default NemoClaw onboarding path) work fine today. These bugs only bite users who pick a custom model ID from https://integrate.api.nvidia.com/v1/models. Bugs 2 and 3 are the actual user-impact issues; Bug 1 is cosmetic noise that only becomes visible on already-failing models.

Reported by @kaviin27 in the NVIDIA Developer Discord.

Reporter's symptoms

Setting the model to mistralai/mistral-medium-3-instruct (or other IDs from https://integrate.api.nvidia.com/v1/models):

  • Responses API with tool calling: HTTP 404: 404 page not found
  • Chat Completions API: HTTP 404: Function 'ZZZZ': Not found for account 'XXXX'

Pre-listed Nemotron 120B models work fine.

Live reproduction (against https://integrate.api.nvidia.com/v1 with a free NVIDIA Build key)

| Probe | Result | Notes |
| --- | --- | --- |
| GET /v1/models/mistralai/mistral-medium-3-instruct | 200 | model exists in catalog |
| POST /v1/chat/completions (mistral-medium-3-instruct) | timeout, >60s with no bytes | NVIDIA backend hangs |
| POST /v1/chat/completions (mistral-large) | 404 Function 'XXX': Not found for account 'YYY' | NVCF deployment missing for account |
| POST /v1/chat/completions (mistral-7b-instruct-v0.3) | 200 in 0.64s | works |
| POST /v1/chat/completions (mistral-small-3.1-24b-instruct-2503) | 200 in 0.55s | works |
| POST /v1/chat/completions (nvidia/llama-3.3-nemotron-super-49b-v1) | 200 in 0.67s | control |
| POST /v1/responses (any model, including bogus IDs) | 404 page not found | endpoint does not exist on NVIDIA Build at all |
| POST /v1/chat/completions (bogus/does-not-exist) | 404 page not found | distinct from the NVCF "Function not found" string |

Status

All three bugs are fixed in PR #1602.

Bug 1 (cosmetic): Responses API probe fires on every NVIDIA Build model and always returns 404

  • Code: bin/lib/onboard.js:997-1001 (shouldRequireResponsesToolCalling() returns true for nvidia-prod)
  • NVIDIA Build does not expose /v1/responses for any model. Confirmed by sending the probe to bogus model IDs and even to nvidia/llama-3.3-nemotron-super-49b-v1 — same 404 page not found every time.
  • The probe loop in bin/lib/onboard.js:1101-1113 silently falls through to the Chat Completions probe on failure, so working models like Nemotron still onboard successfully — the 404 is only user-visible when chat completions also fails. That's why @kaviin27's error message had both halves.
  • Severity: cosmetic noise. Skipping a known-404 probe removes the misleading first half of failure messages and makes the real error easier to read.
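A minimal sketch of the suggested skip, assuming shouldRequireResponsesToolCalling() takes a provider id as the line reference above suggests (the real signature and the set name are illustrative, not taken from the codebase):

```javascript
// Hypothetical sketch: skip the Responses API probe for providers that do not
// expose /v1/responses at all. NVIDIA Build 404s on that endpoint for every
// model, so probing it only adds misleading noise to failure messages.
// PROVIDERS_WITHOUT_RESPONSES_API is an illustrative name, not the real one.
const PROVIDERS_WITHOUT_RESPONSES_API = new Set(["nvidia-prod"]);

function shouldRequireResponsesToolCalling(providerId) {
  // Previously this returned true for nvidia-prod, firing a guaranteed-404 probe.
  return !PROVIDERS_WITHOUT_RESPONSES_API.has(providerId);
}
```

With this in place the probe loop goes straight to the Chat Completions probe for nvidia-prod, so any failure message contains only the real error.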

Bug 2: NVCF "Function not found for account" is not detected, so the user gets no actionable message

  • Code: bin/lib/onboard.js:1115-1119 (probeOpenAiLikeEndpoint() joins all probe failures into one raw string)
  • Many catalog models (e.g. mistral-large) return 404 Function '<uuid>': Not found for account '<account-id>'. This is NVCF's way of saying "the model is in the catalog but not deployed for your account/org".
  • The raw NVCF error gets surfaced with no explanation, so users assume NemoClaw is broken.
  • Suggested fix: detect the Function .*Not found for account pattern in chat-completions probe failures and reframe with a clear message like:

    Model <id> is in the NVIDIA Build catalog but is not deployed for your account. Pick a different model, or check the model card on https://build.nvidia.com to see if it requires org-level access.
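One way the detection could look, as a sketch only (the function name, regex, and exact wording are suggestions, not the shipped fix):

```javascript
// Hypothetical sketch: recognize NVCF's "Function '<id>': Not found for
// account '<account>'" 404 body and reframe it into an actionable message.
const NVCF_NOT_DEPLOYED = /Function '[^']+': Not found for account/;

function reframeProbeError(modelId, rawError) {
  if (NVCF_NOT_DEPLOYED.test(rawError)) {
    return `Model ${modelId} is in the NVIDIA Build catalog but is not ` +
      `deployed for your account. Pick a different model, or check the ` +
      `model card on https://build.nvidia.com to see if it requires ` +
      `org-level access.`;
  }
  return rawError; // leave other failures (e.g. plain "404 page not found") untouched
}
```

The pattern deliberately does not anchor on the function UUID or account id, since both vary per model and per key.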

Bug 3: Probes have no per-request timeout, so a hung NVIDIA backend hangs onboard forever

  • Confirmed: chat completions for mistralai/mistral-medium-3-instruct consistently hangs 30+ seconds with zero bytes returned (3 attempts, all timed out).
  • runCurlProbe's curl invocation needs a --max-time so a misbehaving model doesn't lock up the wizard. Recommend ~15s for the validation probe.
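A sketch of what the argument construction could look like, assuming runCurlProbe builds a curl argv array (buildCurlProbeArgs and PROBE_TIMEOUT_SECONDS are illustrative names, not identifiers from the codebase):

```javascript
// Hypothetical sketch: give every validation probe a hard per-request deadline
// via curl's --max-time, so a hung backend fails the probe after ~15s instead
// of wedging the onboarding wizard indefinitely.
const PROBE_TIMEOUT_SECONDS = 15;

function buildCurlProbeArgs(url, body, apiKey) {
  return [
    "-sS",
    "--max-time", String(PROBE_TIMEOUT_SECONDS), // total time cap, not just connect
    "-H", `Authorization: Bearer ${apiKey}`,
    "-H", "Content-Type: application/json",
    "-d", JSON.stringify(body),
    url,
  ];
}
```

Note that --max-time caps the whole transfer, unlike --connect-timeout, which would not help here because the NVIDIA backend accepts the connection and then returns zero bytes.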

Repro command (anyone can run with their own key)

KEY="<your nvapi-... key>"
echo "=== broken model that hangs ==="
curl -sS -m 30 -w "\nHTTP %{http_code}\n" \
  https://integrate.api.nvidia.com/v1/chat/completions \
  -H "Authorization: Bearer $KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"mistralai/mistral-medium-3-instruct","messages":[{"role":"user","content":"hi"}],"max_tokens":1}'

echo "=== broken model that NVCF-404s ==="
curl -sS -m 30 -w "\nHTTP %{http_code}\n" \
  https://integrate.api.nvidia.com/v1/chat/completions \
  -H "Authorization: Bearer $KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"mistralai/mistral-large","messages":[{"role":"user","content":"hi"}],"max_tokens":1}'

echo "=== responses API 404 (all models) ==="
curl -sS -m 15 -w "\nHTTP %{http_code}\n" \
  https://integrate.api.nvidia.com/v1/responses \
  -H "Authorization: Bearer $KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"nvidia/llama-3.3-nemotron-super-49b-v1","input":"hi"}'

Labels

  • Getting Started
  • NemoClaw CLI
  • Provider: NVIDIA
  • bug
  • priority: high
