Allow provider connections to be dynamically managed #3809

@raghotham

Description

Add runtime provider management to llama-stack, enabling users to register, update, and remove providers via API without server restarts. Providers support multi-instance deployments and full ABAC integration for secure multi-tenant operations.

Current State: Providers must be configured in run.yaml, and any change requires a server restart.

Problems:

  • Cannot add/update API keys without downtime
  • Multi-tenant deployments require complex pre-configuration
  • Testing different provider configurations is cumbersome
  • No runtime provider health monitoring

Solution: Dynamic provider management with persistence, access control, and hot-reload.

Core Features

1. CRUD Operations via API

# Register inference provider (OpenAI)
POST /providers
{
  "provider_id": "openai-team-a",
  "api": "inference",
  "provider_type": "remote::openai",
  "config": {"api_key": "sk-..."},
  "attributes": {"teams": ["team-a"]}
}

# Register vector store provider (Qdrant)
POST /providers
{
  "provider_id": "qdrant-prod",
  "api": "vector_io",
  "provider_type": "remote::qdrant",
  "config": {"url": "http://qdrant:6333", "api_key": "..."},
  "attributes": {"environments": ["production"]}
}

# Register safety provider (Llama Guard)
POST /providers
{
  "provider_id": "llama-guard-v3",
  "api": "safety",
  "provider_type": "inline::llama-guard",
  "config": {"model": "meta-llama/Llama-Guard-3-8B"},
  "attributes": {"teams": ["safety-team"]}
}

# Update config (hot-reload)
PUT /providers/openai-team-a
{"config": {"api_key": "sk-new-key"}}

# Test connection
POST /providers/openai-team-a/test

# Remove provider
DELETE /providers/openai-team-a
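
For illustration, the same lifecycle as a minimal Python sketch using httpx. The /providers paths and payload shapes are taken from the proposal above; the client code, base URL, and port are assumptions, not an existing llama-stack client API.

# Sketch only: exercises the proposed /providers endpoints with httpx.
# Base URL and port are assumed; adjust to your deployment.
import httpx

with httpx.Client(base_url="http://localhost:8321") as client:
    # Register an inference provider scoped to team-a
    client.post("/providers", json={
        "provider_id": "openai-team-a",
        "api": "inference",
        "provider_type": "remote::openai",
        "config": {"api_key": "sk-..."},
        "attributes": {"teams": ["team-a"]},
    })

    # Rotate the API key in place (hot-reload, no restart)
    client.put("/providers/openai-team-a", json={"config": {"api_key": "sk-new-key"}})

    # Verify the connection, then remove the provider
    client.post("/providers/openai-team-a/test")
    client.delete("/providers/openai-team-a")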

2. Multi-Instance Support

  • Register multiple instances of the same provider_type (already supported in run.yaml)
  • Each instance has a unique provider_id, an independent config, and a separate lifecycle
  • Use case: team-specific API keys, environment-specific endpoints (see the sketch below)
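
The sketch below shows what registering two instances of the same provider_type could look like through the proposed endpoint; the provider IDs, keys, and team attributes are illustrative only.

# Sketch: two instances of remote::openai with independent configs and
# lifecycles, registered via the proposed /providers endpoint.
import httpx

team_instances = {
    "openai-team-a": {"api_key": "sk-team-a-...", "teams": ["team-a"]},
    "openai-team-b": {"api_key": "sk-team-b-...", "teams": ["team-b"]},
}

with httpx.Client(base_url="http://localhost:8321") as client:
    for provider_id, info in team_instances.items():
        client.post("/providers", json={
            "provider_id": provider_id,          # unique per instance
            "api": "inference",
            "provider_type": "remote::openai",   # same type, separate config
            "config": {"api_key": info["api_key"]},
            "attributes": {"teams": info["teams"]},
        })
    # Each instance is updated or removed independently, e.g. deleting
    # openai-team-b leaves openai-team-a untouched.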

3. ABAC Integration

  • Providers have owner: User and attributes: dict[str, list[str]]
  • Same access control model as resources (models, shields, vector_dbs)
  • Operations (CREATE/READ/UPDATE/DELETE) subject to access_policy rules
  • Example: {"teams": ["ml-team"], "environments": ["production"]} (see the sketch below)
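
A rough sketch of the attribute check this implies. ProviderConnection and user_can_access are hypothetical names, and the matching rule (the user must share at least one value for every attribute key the provider declares) is an assumption modeled on how scoped resources behave, not the actual access_policy engine.

# Rough sketch of an attribute-based check for provider operations.
from dataclasses import dataclass, field

@dataclass
class ProviderConnection:
    provider_id: str
    owner: str
    attributes: dict[str, list[str]] = field(default_factory=dict)

def user_can_access(user_attributes: dict[str, list[str]], provider: ProviderConnection) -> bool:
    """Permit access only if the user shares at least one value for every
    attribute key the provider declares (e.g. teams, environments)."""
    for key, allowed_values in provider.attributes.items():
        user_values = user_attributes.get(key, [])
        if not set(user_values) & set(allowed_values):
            return False
    return True

prod_qdrant = ProviderConnection(
    provider_id="qdrant-prod",
    owner="raghotham",
    attributes={"teams": ["ml-team"], "environments": ["production"]},
)
assert user_can_access({"teams": ["ml-team"], "environments": ["production"]}, prod_qdrant)
assert not user_can_access({"teams": ["ml-team"], "environments": ["staging"]}, prod_qdrant)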

4. Persistence & Security

  • Stored in existing kvstore: provider_connections:v1::{provider_id}
  • API keys are stored unredacted in the kvstore so providers can actually use them
  • Automatic redaction in HTTP responses ("***REDACTED***")
  • Providers persist across server restarts (see the sketch below)
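
A minimal sketch of the storage key and response redaction described above; the key prefix and redaction marker come from this proposal, while the helper names and sensitive-key list are assumptions.

# Minimal sketch of persistence and response redaction.
# The key prefix "provider_connections:v1::" is from the proposal;
# the kvstore helper and redact function are hypothetical.
import json

SENSITIVE_KEYS = {"api_key", "token", "password"}

def kv_key(provider_id: str) -> str:
    # Providers persist across restarts under this namespaced key.
    return f"provider_connections:v1::{provider_id}"

def redact_config(config: dict) -> dict:
    # Secrets stay unredacted in the kvstore (providers need them),
    # but are masked in any HTTP response body.
    return {k: ("***REDACTED***" if k in SENSITIVE_KEYS else v) for k, v in config.items()}

stored = {"provider_id": "openai-team-a", "config": {"api_key": "sk-...", "timeout": 30}}
print(kv_key(stored["provider_id"]))   # provider_connections:v1::openai-team-a
print(json.dumps({**stored, "config": redact_config(stored["config"])}))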

Example Use Cases

1. Multi-Tenant SaaS

# Each customer gets isolated inference and vector store
POST /providers -d '{"provider_id": "openai-customer-123", "api": "inference",
  "provider_type": "remote::openai", "attributes": {"customer_id": ["123"]}}'
POST /providers -d '{"provider_id": "pgvector-customer-123", "api": "vector_io",
  "provider_type": "remote::pgvector", "attributes": {"customer_id": ["123"]}}'

2. API Key Rotation

# Update OpenAI or Qdrant keys without restart
PUT /providers/my-openai -d '{"config": {"api_key": "sk-new-key"}}'
PUT /providers/qdrant-prod -d '{"config": {"api_key": "qdrant-new-key"}}'

3. Environment Separation

# Different vector stores per environment
POST /providers -d '{"provider_id": "faiss-dev", "api": "vector_io",
  "provider_type": "inline::faiss", "attributes": {"env": ["dev"]}}'
POST /providers -d '{"provider_id": "qdrant-prod", "api": "vector_io",
  "provider_type": "remote::qdrant", "attributes": {"env": ["prod"]}}'

4. Team-Specific Safety Policies

# Different teams with different Llama Guard configurations
POST /providers -d '{"provider_id": "guard-team-a", "api": "safety",
  "provider_type": "inline::llama-guard", "config": {"model": "Llama-Guard-3-8B"},
  "attributes": {"teams": ["team-a"], "risk_tolerance": ["strict"]}}'
POST /providers -d '{"provider_id": "guard-team-b", "api": "safety",
  "provider_type": "inline::llama-guard", "config": {"model": "Llama-Guard-3-1B"},
  "attributes": {"teams": ["team-b"], "risk_tolerance": ["moderate"]}}'

💡 Why is this needed? What if we don't build it?

See the description above.

