Description
Bug Description
Llama Stack was failing to route inference requests when using the provider_id/model_id format for models that weren't explicitly registered in the routing table, even though the provider was properly configured.
Expected Behavior
When a model ID follows the provider_id/model_id format (e.g., anthropic/claude-sonnet-3-5) and the provider is configured, the request should route to that provider regardless of whether the specific model is pre-registered. This is essential for remote providers where users provide their own API keys via provider_data.
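To make the convention concrete, here is a minimal sketch of how a provider_id/model_id string can be split; the helper name is hypothetical and not part of the actual Llama Stack code.

    # Hypothetical helper (illustrative only): split "anthropic/claude-sonnet-3-5"
    # into ("anthropic", "claude-sonnet-3-5"); return (None, model_id) if there is
    # no provider prefix.
    def split_model_id(model_id: str) -> tuple[str | None, str]:
        if "/" in model_id:
            provider_id, provider_model_id = model_id.split("/", maxsplit=1)
            return provider_id, provider_model_id
        return None, model_id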
Actual Behavior
The router would immediately raise ModelNotFoundError if the model wasn't in the routing table, without attempting to extract and route to the provider.
Impact
This bug prevented users from:
- Using remote providers with user-supplied API keys
- Accessing newly released models without updating distribution configs
- Dynamically routing to provider-supported models
Example That Now Works
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:5000")

# This would previously fail with ModelNotFoundError.
# Now it routes correctly to the anthropic provider.
response = client.chat.completions.create(
    model="anthropic/claude-sonnet-3-5",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_body={
        "provider_data": {
            "anthropic": {
                "api_key": "your-anthropic-api-key"
            }
        }
    },
)

Fix
The inference router now:
- Attempts to look up the model in the routing table
- If not found, parses the model ID as provider_id/model_id
- Routes directly to the provider if it exists, passing the provider-specific model ID
This restores the expected behavior where the provider_id/model_id convention works consistently throughout the system.
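A hedged sketch of that fallback order, assuming a simple dict-based routing table and provider registry; the class and attribute names here are illustrative, not the actual Llama Stack internals.

    # Illustrative sketch of the lookup-then-parse fallback (not Llama Stack's code).
    class ModelNotFoundError(KeyError):
        pass

    class InferenceRouter:
        def __init__(self, routing_table: dict, providers: dict):
            self.routing_table = routing_table  # registered model_id -> provider impl
            self.providers = providers          # configured provider_id -> provider impl

        def resolve(self, model_id: str):
            # 1. Prefer an explicitly registered model.
            provider = self.routing_table.get(model_id)
            if provider is not None:
                return provider, model_id
            # 2. Fall back to the provider_id/model_id convention.
            if "/" in model_id:
                provider_id, provider_model_id = model_id.split("/", maxsplit=1)
                provider = self.providers.get(provider_id)
                if provider is not None:
                    # Hand the provider its own model ID, without the prefix.
                    return provider, provider_model_id
            # 3. Neither a registered model nor a known provider prefix.
            raise ModelNotFoundError(model_id)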
Fixed In
PR #3928