fix(openai-compat): surface availability state for OpenAI-compatible backends by petersimmons1972 · Pull Request #149 · thushan/olla

petersimmons1972 · 2026-05-20T18:35:27Z

Summary

OpenAI-compatible backends (vLLM, llama.cpp, Infinity, sglang, lmdeploy, etc.) always show state: "unknown" in /olla/models, even when they are healthy, discovered, and actively serving traffic. This makes /olla/models unusable as a model-availability signal for clients that filter on availability[].state == "available".

Reproduction

Configure Olla with at least one OpenAI-compatible endpoint (e.g., a vLLM server at http://host:8000).
Wait for model discovery to complete (Olla logs confirm models discovered, requests route successfully).
curl http://olla/olla/models.
Observe availability[].state: "unknown" for every endpoint, indefinitely.

Root cause

There are two cooperating bugs:

Bug 1 — openAIParser never populates state-inferring fields. The standard OpenAI /v1/models response contains only id, object, created, owned_by — no size, no state. openAIParser.Parse in internal/adapter/registry/profile/parsers.go therefore leaves modelInfo.Size = 0 and never writes metadata["state"]. ModelExtractor.MapModelState (in internal/adapter/unifier/model_builder.go) checks metadata["state"], then metadata["loaded"], then modelSize > 0, and falls through to return "unknown". For every OpenAI-compatible backend, the fall-through is the only branch ever taken.
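
The fall-through described in Bug 1 can be sketched as follows. This is a minimal illustration, not the code in model_builder.go: the check order (state, then loaded, then size) comes from the description above, while the function signature, the metadata map type, and the "loaded"/"not-loaded" return values are assumptions made for the example.

```go
package main

import "fmt"

// mapModelState is a simplified, hypothetical stand-in for
// ModelExtractor.MapModelState. Only the order of the checks is taken
// from the PR description; everything else is assumed for illustration.
func mapModelState(metadata map[string]any, modelSize int64) string {
	// 1. An explicit state reported by the backend wins.
	if s, ok := metadata["state"].(string); ok && s != "" {
		return s
	}
	// 2. Otherwise a boolean "loaded" flag decides.
	if loaded, ok := metadata["loaded"].(bool); ok {
		if loaded {
			return "loaded"
		}
		return "not-loaded"
	}
	// 3. Otherwise a known size implies the model is available.
	if modelSize > 0 {
		return "available"
	}
	// 4. Nothing to infer from.
	return "unknown"
}

func main() {
	// An OpenAI-style /v1/models entry carries no state, loaded, or size data.
	openAIMeta := map[string]any{"owned_by": "vllm"}

	fmt.Println(mapModelState(openAIMeta, 0)) // Size left at 0 by the parser
	fmt.Println(mapModelState(openAIMeta, 1)) // Size set to the sentinel 1
}
```

With Size left at 0, none of the first three branches can match for an OpenAI-style entry, which is why the sentinel in the parser is enough to reach the size branch.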

Bug 2 — converter reads stale string field, not effective state. UnifiedConverter.convertModel in internal/adapter/converter/unified_converter.go reads ep.State directly. SourceEndpoint has two parallel state fields: State (legacy string set once at discovery) and ModelState (typed enum updated by the lifecycle unifier). The lifecycle unifier writes only to ModelState, so any health-driven transitions never surface in the API response. The domain already provides SourceEndpoint.GetEffectiveState() which consults both fields and normalises legacy strings — the converter just wasn't using it.
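
To make the two-field problem in Bug 2 concrete, here is a rough sketch of how an effective-state accessor might combine the fields. The field names State and ModelState, the method name GetEffectiveState, and ModelStateUnknown come from the description above; the other enum constants, their string values, and the "typed field first, legacy string second" precedence are assumptions, so the real method in the domain package may differ.

```go
package main

import "fmt"

// ModelState is a hypothetical stand-in for the typed enum in the domain package.
type ModelState string

const (
	ModelStateUnknown   ModelState = "unknown"
	ModelStateLoaded    ModelState = "loaded"     // assumed constant
	ModelStateNotLoaded ModelState = "not-loaded" // assumed constant
	ModelStateAvailable ModelState = "available"  // assumed constant
)

// SourceEndpoint is reduced to the fields relevant to this example.
type SourceEndpoint struct {
	EndpointName string
	State        string     // legacy string, set once at discovery
	ModelState   ModelState // typed state, updated by the lifecycle unifier
}

// GetEffectiveState prefers the typed field and falls back to normalising
// the legacy string. The precedence shown here is an assumption.
func (e SourceEndpoint) GetEffectiveState() ModelState {
	if e.ModelState != "" {
		return e.ModelState
	}
	switch e.State {
	case "loaded":
		return ModelStateLoaded
	case "not-loaded":
		return ModelStateNotLoaded
	case "available":
		return ModelStateAvailable
	default:
		return ModelStateUnknown
	}
}

func main() {
	// Discovery wrote "unknown" once; the lifecycle unifier later marked
	// the endpoint available through the typed field only.
	ep := SourceEndpoint{EndpointName: "vllm-1", State: "unknown", ModelState: ModelStateAvailable}

	fmt.Println(ep.State)                       // value the converter read before the change
	fmt.Println(string(ep.GetEffectiveState())) // value it reads after the change
}
```

Reading ep.State directly never sees the second write, which matches the symptom of a state that stays "unknown" while requests route successfully.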

Fix

Two small changes:

internal/adapter/registry/profile/parsers.go — In openAIParser.Parse, set modelInfo.Size = 1 as a sentinel for any successfully-discovered model. For OpenAI-compatible backends, presence in the /v1/models response IS the availability signal — these servers only list models that are loaded and ready to serve. MapModelState then returns "available" via the existing modelSize > 0 branch.
internal/adapter/converter/unified_converter.go — Replace State: ep.State with State: string(ep.GetEffectiveState()) so the converter consults both the typed state machine and the legacy string field, with the existing fallback semantics defined in SourceEndpoint.GetEffectiveState().

Why these changes are safe

The Size sentinel is only consumed by MapModelState (for the unknown/available branch) and by parameter-count estimation; neither makes behavioural assumptions about absolute byte values that a sentinel of 1 would violate.
GetEffectiveState() already exists, is used in tests, and falls through to ModelStateUnknown when neither field has data — so the previous behaviour is preserved for genuinely unknown endpoints.
No schema change, no migration, no config flag.

Test plan

go test ./... — all packages pass (28 test packages, zero failures)
Built local Docker image, ran against production-like config with vLLM, llama-cpp, and Infinity endpoints — all three reported state: "available" after the patch (state: "unknown" before)
Reviewer to confirm Ollama backends (which DO populate size/state natively) are unaffected — these flow through ollamaParser, not openAIParser, and the converter change reads GetEffectiveState() which preserves Ollama's "loaded"/"not-loaded" semantics via the existing switch in SourceEndpoint.GetEffectiveState().

Summary by CodeRabbit

Bug Fixes
- Improved endpoint availability state detection for more accurate status reporting across integrations.
- Fixed model availability recognition for OpenAI-compatible backends during model discovery so that models are no longer incorrectly reported as "unknown".

…backends

OpenAI-compatible /v1/models endpoints (vllm, llama.cpp, Infinity, etc.) return only id/object/created/owned_by: no size, no state field. As a result, openAIParser left model.Size = 0 and metadata["state"] unset, so MapModelState() always fell through to "unknown" for these backends.

Meanwhile, the unified converter read ep.State (the legacy string set once at discovery and never transitioned), not ep.GetEffectiveState(), which consults the typed ModelState field populated by the lifecycle unifier. Combined, endpoints that successfully routed traffic still reported state: "unknown" forever in /olla/models.

Two-part fix:
- openAIParser: set Size = 1 sentinel. For OpenAI-compat backends, presence in the discovery response IS the availability signal; these servers only list models that are loaded and ready to serve.
- unified_converter: read string(ep.GetEffectiveState()) instead of ep.State, so health-driven transitions and the typed state machine both surface in the API response.

All tests pass: go test ./...

Co-Authored-By: Claude Opus 4.7 <[email protected]>

coderabbitai · 2026-05-20T18:35:39Z

Walkthrough

The PR modifies model discovery and availability handling in two coordinated places. The OpenAI parser now assigns a sentinel Size: 1 value so that discovered models map to "available" instead of "unknown", and the unified converter now derives endpoint state from the typed lifecycle method GetEffectiveState() rather than from the legacy string field.

Changes

Model availability state normalisation

Layer / File(s)	Summary
OpenAI parser sentinel size `internal/adapter/registry/profile/parsers.go`	`openAIParser.Parse` sets `Size: 1` when building `domain.ModelInfo` for OpenAI-compatible models. Comments note that these backends lack size/state reporting and rely on presence as the availability signal.
Converter state derivation `internal/adapter/converter/unified_converter.go`	`convertModel` now populates endpoint `availability.state` from `ep.GetEffectiveState()` (stringified) instead of the legacy `ep.State` field, with added documentation describing the typed lifecycle transition and normalisation of legacy values.

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title clearly and concisely describes the main fix: surfacing availability state for OpenAI-compatible backends, which directly aligns with the primary changes addressing the state reporting bug.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/adapter/converter/unified_converter.go`:
- Around line 77-85: convertModel builds availability using
ep.GetEffectiveState() but matchesAvailabilityFilter still reads the legacy
ep.State, causing inconsistent filtering vs the returned availability.State;
update matchesAvailabilityFilter to use ep.GetEffectiveState() (or the
normalized typed value it returns) instead of ep.State so the
/olla/models?available=... filter aligns with the availability entries produced
by convertModel; locate matchesAvailabilityFilter and change its state-check
logic to call ep.GetEffectiveState() (or compare against the same enum/string
used when creating EndpointStatus) so both filtering and payload use the same
effective state source.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d6f7ebf1-17f6-42bf-9e29-c7b6898716bf

📥 Commits

Reviewing files that changed from the base of the PR and between 6d6ac4d and 1017200.

📒 Files selected for processing (2)

internal/adapter/converter/unified_converter.go
internal/adapter/registry/profile/parsers.go

coderabbitai · 2026-05-20T18:37:20Z

+		// Use GetEffectiveState() rather than ep.State directly: the lifecycle
+		// unifier updates the typed ModelState field, while ep.State (the legacy
+		// string) is only set at discovery time and never transitions. Reading
+		// the effective state ensures health-driven transitions surface in the
+		// API response. GetEffectiveState() also normalises legacy string values
+		// ("loaded", "not-loaded", "available") to the typed enum.
 		availability = append(availability, EndpointStatus{
 			Endpoint: ep.EndpointName, // Use endpoint name instead of URL
-			State:    ep.State,
+			State:    string(ep.GetEffectiveState()),


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep availability filtering aligned with effective state

convertModel now uses effective state, but matchesAvailabilityFilter still checks legacy state (ep.State) at Line 172. This can produce inconsistent /olla/models?available=... results versus the availability.state returned in the payload.

Suggested fix

 func matchesAvailabilityFilter(model *domain.UnifiedModel, available *bool) bool {
 	if available == nil {
 		return true
 	}
 	isAvailable := false
 	for _, ep := range model.SourceEndpoints {
-		if ep.State == "loaded" {
+		if string(ep.GetEffectiveState()) == "loaded" {
 			isAvailable = true
 			break
 		}
 	}
 	return *available == isAvailable
 }
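
The inconsistency the review comment describes can be reproduced in isolation. The sketch below is hypothetical: the types are reduced stand-ins for the domain structs, GetEffectiveState is simplified to "typed field first, legacy string second" (an assumption), and the state reader is passed in as a function purely so both variants of the filter can be compared side by side.

```go
package main

import "fmt"

// Reduced, hypothetical stand-ins for the domain types.
type ModelState string

type SourceEndpoint struct {
	State      string     // legacy string
	ModelState ModelState // typed state
}

// GetEffectiveState is simplified here; the precedence is an assumption.
func (e SourceEndpoint) GetEffectiveState() ModelState {
	if e.ModelState != "" {
		return e.ModelState
	}
	if e.State != "" {
		return ModelState(e.State)
	}
	return "unknown"
}

type UnifiedModel struct {
	SourceEndpoints []SourceEndpoint
}

// matches mirrors the shape of matchesAvailabilityFilter, but takes the
// state reader as a parameter so the two sources can be compared.
func matches(model *UnifiedModel, available *bool, stateOf func(SourceEndpoint) string) bool {
	if available == nil {
		return true
	}
	isAvailable := false
	for _, ep := range model.SourceEndpoints {
		if stateOf(ep) == "loaded" {
			isAvailable = true
			break
		}
	}
	return *available == isAvailable
}

func legacyState(ep SourceEndpoint) string    { return ep.State }
func effectiveState(ep SourceEndpoint) string { return string(ep.GetEffectiveState()) }

func main() {
	// The legacy string is stale; the typed field says the model is loaded.
	model := &UnifiedModel{SourceEndpoints: []SourceEndpoint{{State: "unknown", ModelState: "loaded"}}}
	wantAvailable := true

	fmt.Println(matches(model, &wantAvailable, legacyState))    // filter on ep.State
	fmt.Println(matches(model, &wantAvailable, effectiveState)) // filter on effective state
}
```

When the two readers disagree, a model whose payload shows a loaded endpoint can still be dropped by the available=true filter, which is the mismatch the suggested change removes.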

petersimmons1972 requested a review from thushan as a code owner May 20, 2026 18:35

coderabbitai Bot reviewed May 20, 2026

petersimmons1972 wants to merge 1 commit into thushan:main from petersimmons1972:fix/openai-compat-state-unknown
