Description
Add runtime provider management to llama-stack, enabling users to register, update, and remove providers via API without server restarts. Providers support multi-instance deployments and full ABAC integration for secure multi-tenant operations.
Current State: Providers must be configured in run.yaml and require server restart for any changes.
Problems:
- Cannot add/update API keys without downtime
- Multi-tenant deployments require complex pre-configuration
- Testing different provider configurations is cumbersome
- No runtime provider health monitoring
Solution: Dynamic provider management with persistence, access control, and hot-reload.
Core Features
1. CRUD Operations via API
# Register inference provider (OpenAI)
POST /providers
{
"provider_id": "openai-team-a",
"api": "inference",
"provider_type": "remote::openai",
"config": {"api_key": "sk-..."},
"attributes": {"teams": ["team-a"]}
}
# Register vector store provider (Qdrant)
POST /providers
{
"provider_id": "qdrant-prod",
"api": "vector_io",
"provider_type": "remote::qdrant",
"config": {"url": "http://qdrant:6333", "api_key": "..."},
"attributes": {"environments": ["production"]}
}
# Register safety provider (Llama Guard)
POST /providers
{
"provider_id": "llama-guard-v3",
"api": "safety",
"provider_type": "inline::llama-guard",
"config": {"model": "meta-llama/Llama-Guard-3-8B"},
"attributes": {"teams": ["safety-team"]}
}
# Update config (hot-reload)
PUT /providers/openai-team-a
{"config": {"api_key": "sk-new-key"}}
# Test connection
POST /providers/openai-team-a/test
# Remove provider
DELETE /providers/openai-team-a
2. Multi-Instance Support
- Register multiple instances of the same provider_type (already supported in run.yaml)
- Each instance has a unique provider_id, an independent config, and a separate lifecycle
- Use case: team-specific API keys, environment-specific endpoints
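The multi-instance semantics above can be sketched with a toy in-memory registry. All class and method names here are hypothetical illustrations, not llama-stack's actual internals: many entries may share a provider_type, but each provider_id is unique and updated independently.

```python
from dataclasses import dataclass, field


@dataclass
class ProviderEntry:
    # Mirrors the fields in the POST /providers payload above.
    provider_id: str
    api: str
    provider_type: str
    config: dict = field(default_factory=dict)
    attributes: dict = field(default_factory=dict)


class ProviderRegistry:
    """Toy registry: unique provider_id, shared provider_type allowed."""

    def __init__(self) -> None:
        self._entries: dict[str, ProviderEntry] = {}

    def register(self, entry: ProviderEntry) -> None:
        if entry.provider_id in self._entries:
            raise ValueError(f"provider_id {entry.provider_id!r} already registered")
        self._entries[entry.provider_id] = entry

    def update_config(self, provider_id: str, config: dict) -> None:
        # Hot-reload: merge new config into one instance only.
        self._entries[provider_id].config.update(config)

    def remove(self, provider_id: str) -> None:
        del self._entries[provider_id]


registry = ProviderRegistry()
# Two instances of the same provider_type, each with its own API key.
registry.register(ProviderEntry("openai-team-a", "inference", "remote::openai",
                                {"api_key": "sk-a"}, {"teams": ["team-a"]}))
registry.register(ProviderEntry("openai-team-b", "inference", "remote::openai",
                                {"api_key": "sk-b"}, {"teams": ["team-b"]}))
```

Rotating team A's key via update_config leaves team B's instance untouched, which is the point of per-instance lifecycles.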
3. ABAC Integration
- Providers have owner: User and attributes: dict[str, list[str]]
- Same access control model as resources (models, shields, vector_dbs)
- Operations (CREATE/READ/UPDATE/DELETE) are subject to access_policy rules
- Example: {"teams": ["ml-team"], "environments": ["production"]}
4. Persistence & Security
- Stored in the existing kvstore under provider_connections:v1::{provider_id}
- API keys stored unredacted so providers can operate
- Automatic redaction in HTTP responses ("***REDACTED***")
- Providers persist across server restarts
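The key format and response redaction above can be sketched as follows. The key prefix comes from this proposal; which config fields count as secrets (here just api_key) is an assumption for illustration.

```python
KVSTORE_PREFIX = "provider_connections:v1::"  # key format from this proposal


def kvstore_key(provider_id: str) -> str:
    # Key under which the full (unredacted) provider record is persisted.
    return f"{KVSTORE_PREFIX}{provider_id}"


SENSITIVE_KEYS = {"api_key"}  # assumption: fields treated as secrets


def redact_config(config: dict) -> dict:
    """Stored records keep the real secret; HTTP responses get this view."""
    return {k: ("***REDACTED***" if k in SENSITIVE_KEYS else v)
            for k, v in config.items()}
```

Keeping the unredacted value only in the kvstore while every API response passes through redact_config is what lets providers keep working while GET /providers never leaks a key.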
Example Use Cases
1. Multi-Tenant SaaS
# Each customer gets isolated inference and vector store
POST /providers -d '{"provider_id": "openai-customer-123", "api": "inference",
"provider_type": "remote::openai", "attributes": {"customer_id": ["123"]}}'
POST /providers -d '{"provider_id": "pgvector-customer-123", "api": "vector_io",
"provider_type": "remote::pgvector", "attributes": {"customer_id": ["123"]}}'
2. API Key Rotation
# Update OpenAI or Qdrant keys without restart
PUT /providers/my-openai -d '{"config": {"api_key": "sk-new-key"}}'
PUT /providers/qdrant-prod -d '{"config": {"api_key": "qdrant-new-key"}}'
3. Environment Separation
# Different vector stores per environment
POST /providers -d '{"provider_id": "faiss-dev", "api": "vector_io",
"provider_type": "inline::faiss", "attributes": {"env": ["dev"]}}'
POST /providers -d '{"provider_id": "qdrant-prod", "api": "vector_io",
"provider_type": "remote::qdrant", "attributes": {"env": ["prod"]}}'
4. Team-Specific Safety Policies
# Different teams with different Llama Guard configurations
POST /providers -d '{"provider_id": "guard-team-a", "api": "safety",
"provider_type": "inline::llama-guard", "config": {"model": "Llama-Guard-3-8B"},
"attributes": {"teams": ["team-a"], "risk_tolerance": ["strict"]}}'
POST /providers -d '{"provider_id": "guard-team-b", "api": "safety",
"provider_type": "inline::llama-guard", "config": {"model": "Llama-Guard-3-1B"},
"attributes": {"teams": ["team-b"], "risk_tolerance": ["moderate"]}}'
💡 Why is this needed? What if we don't build it?
See the Description section above.
Other thoughts
No response