
Commit acd1008

feat: add oci genai service as chat inference provider
1 parent 9191005 commit acd1008

File tree: 19 files changed, +1384 −28 lines
Lines changed: 154 additions & 0 deletions
---
orphan: true
---

<!-- This file was auto-generated by distro_codegen.py, please edit source -->

# OCI Distribution

The `llamastack/distribution-oci` distribution consists of the following provider configurations.

| API | Provider(s) |
|-----|-------------|
| agents | `inline::meta-reference` |
| datasetio | `remote::huggingface`, `inline::localfs` |
| eval | `inline::meta-reference` |
| files | `inline::localfs` |
| inference | `remote::oci` |
| safety | `inline::llama-guard` |
| scoring | `inline::basic`, `inline::llm-as-judge`, `inline::braintrust` |
| tool_runtime | `remote::brave-search`, `remote::tavily-search`, `remote::model-context-protocol` |
| vector_io | `inline::faiss`, `remote::chromadb`, `remote::pgvector` |
### Environment Variables

The following environment variables can be configured:

- `OCI_AUTH_TYPE`: OCI authentication type (`instance_principal` or `config_file`) (default: `instance_principal`)
- `OCI_USER_OCID`: OCI user OCID for authentication (default: ``)
- `OCI_TENANCY_OCID`: OCI tenancy OCID for authentication (default: ``)
- `OCI_FINGERPRINT`: OCI API key fingerprint for authentication (default: ``)
- `OCI_PRIVATE_KEY`: OCI private key for authentication (default: ``)
- `OCI_REGION`: OCI region (e.g., us-ashburn-1, us-chicago-1, us-phoenix-1, eu-frankfurt-1) (default: ``)
- `OCI_COMPARTMENT_OCID`: OCI compartment OCID for the Generative AI service (default: ``)
- `OCI_CONFIG_FILE_PATH`: OCI config file path (required if `OCI_AUTH_TYPE` is `config_file`) (default: `~/.oci/config`)
- `OCI_CLI_PROFILE`: OCI CLI profile name to use from the config file (default: `DEFAULT`)

## Prerequisites

### Oracle Cloud Infrastructure Setup

Before using the OCI Generative AI distribution, ensure you have:

1. **Oracle Cloud Infrastructure Account**: Sign up at [Oracle Cloud Infrastructure](https://cloud.oracle.com/)
2. **Generative AI Service Access**: Enable the Generative AI service in your OCI tenancy
3. **Compartment**: Create or identify a compartment where you'll deploy Generative AI models
4. **Authentication**: Configure authentication using either:
   - **Instance Principal** (recommended for cloud-hosted deployments)
   - **API Key** (for on-premises or development environments)

### Authentication Methods

#### Instance Principal Authentication (Recommended)

Instance Principal authentication lets OCI resources authenticate using the identity of the compute instance they run on. This is the most secure method for production deployments.

Requirements:
- The instance must be running in an Oracle Cloud Infrastructure compartment
- The instance must have appropriate IAM policies to access Generative AI services

#### API Key Authentication

For development or on-premises deployments, you can use API key authentication, which requires the following information:
- User OCID
- Tenancy OCID
- API key fingerprint
- Private key
- Region
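
With `OCI_AUTH_TYPE=config_file`, these values come from an OCI CLI-style config file. A typical `~/.oci/config` profile follows the standard INI layout shown below (all values are placeholders, not real credentials):

```ini
[DEFAULT]
user=ocid1.user.oc1..<your-user-ocid>
tenancy=ocid1.tenancy.oc1..<your-tenancy-ocid>
fingerprint=aa:bb:cc:dd:ee:ff:00:11:22:33:44:55:66:77:88:99
key_file=~/.oci/oci_api_key.pem
region=us-chicago-1
```

The profile name (`DEFAULT` here) is what `OCI_CLI_PROFILE` selects.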

### Required IAM Policies

Ensure your OCI user or instance has the following policy statements:

```
Allow group <group_name> to use generative-ai-inference-endpoints in compartment <compartment_name>
Allow group <group_name> to manage generative-ai-inference-endpoints in compartment <compartment_name>
```

## Supported Services

### Inference: OCI Generative AI

Oracle Cloud Infrastructure Generative AI provides access to high-performance AI models through OCI's Platform-as-a-Service offering. The service supports:

- **Chat Completions**: Conversational AI with context awareness
- **Text Generation**: Complete prompts and generate text content
- **Embeddings**: Convert text to vector embeddings for search and retrieval
- **Multiple Model Support**: Access to various foundation models, including Cohere, Meta, and custom models

#### Available Models

OCI Generative AI commonly provides access to models from Meta, Cohere, OpenAI, and xAI (Grok).
### Safety: Llama Guard

For content safety and moderation, this distribution uses Meta's Llama Guard model through the OCI Generative AI service to provide:
- Content filtering and moderation
- Policy compliance checking
- Harmful content detection

### Vector Storage: Multiple Options

The distribution supports several vector storage providers:
- **FAISS**: Local in-memory vector search
- **ChromaDB**: Distributed vector database
- **PGVector**: PostgreSQL with vector extensions

### Additional Services

- **Dataset I/O**: Local filesystem and Hugging Face integration
- **Tool Runtime**: Web search (Brave, Tavily) and RAG capabilities
- **Evaluation**: Meta reference evaluation framework

## Running Llama Stack with OCI

You can run the OCI distribution via Docker or a local virtual environment.

### Via venv

If you've set up your local development environment, you can run the distribution from your local virtual environment:

```bash
OCI_AUTH_TYPE=$OCI_AUTH_TYPE OCI_REGION=$OCI_REGION OCI_COMPARTMENT_OCID=$OCI_COMPARTMENT_OCID llama stack run --port 8321 oci
```

### Configuration Examples

#### Using Instance Principal (Recommended for Production)

```bash
export OCI_AUTH_TYPE=instance_principal
export OCI_REGION=us-chicago-1
export OCI_COMPARTMENT_OCID=ocid1.compartment.oc1..<your-compartment-id>
```

#### Using API Key Authentication (Development)

```bash
export OCI_AUTH_TYPE=config_file
export OCI_CONFIG_FILE_PATH=~/.oci/config
export OCI_CLI_PROFILE=DEFAULT
export OCI_REGION=us-chicago-1
export OCI_COMPARTMENT_OCID=ocid1.compartment.oc1..<your-compartment-id>
```

## Regional Endpoints

OCI Generative AI is available in multiple regions. The service automatically routes requests to the appropriate regional endpoint based on your configuration. For the full list of regions and model availability, see:

https://docs.oracle.com/en-us/iaas/Content/generative-ai/overview.htm#regions
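
As an illustration of how the region setting determines the endpoint: OCI's public Generative AI inference endpoints follow a region-scoped host pattern. The sketch below assumes the commercial (OC1) realm pattern; verify the host for your realm before relying on it:

```python
def genai_inference_endpoint(region: str) -> str:
    """Build the OCI Generative AI inference endpoint for a region.

    Assumes the commercial (OC1) realm host pattern; other realms
    (e.g. government regions) use different domains.
    """
    if not region:
        raise ValueError("region must be set, e.g. via OCI_REGION")
    return f"https://inference.generativeai.{region}.oci.oraclecloud.com"


# Example: the region used in the configuration examples above
print(genai_inference_endpoint("us-chicago-1"))
# → https://inference.generativeai.us-chicago-1.oci.oraclecloud.com
```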

## Troubleshooting

### Common Issues

1. **Authentication Errors**: Verify your OCI credentials and IAM policies
2. **Model Not Found**: Ensure the model OCID is correct and the model is available in your region
3. **Permission Denied**: Check compartment permissions and Generative AI service access
4. **Region Unavailable**: Verify that the specified region supports Generative AI services

### Getting Help

For additional support:
- [OCI Generative AI Documentation](https://docs.oracle.com/en-us/iaas/Content/generative-ai/home.htm)
- [Llama Stack Issues](https://github.com/meta-llama/llama-stack/issues)

docs/docs/providers/agents/index.mdx

Lines changed: 2 additions & 2 deletions

```diff
@@ -1,7 +1,7 @@
 ---
 description: "Agents
 
-APIs for creating and interacting with agentic systems."
+APIs for creating and interacting with agentic systems."
 sidebar_label: Agents
 title: Agents
 ---
@@ -12,6 +12,6 @@ title: Agents
 
 Agents
 
-APIs for creating and interacting with agentic systems.
+APIs for creating and interacting with agentic systems.
 
 This section contains documentation for all available providers for the **agents** API.
```
Lines changed: 12 additions & 12 deletions

```diff
@@ -1,14 +1,14 @@
 ---
 description: "The Batches API enables efficient processing of multiple requests in a single operation,
-particularly useful for processing large datasets, batch evaluation workflows, and
-cost-effective inference at scale.
+particularly useful for processing large datasets, batch evaluation workflows, and
+cost-effective inference at scale.
 
-The API is designed to allow use of openai client libraries for seamless integration.
+The API is designed to allow use of openai client libraries for seamless integration.
 
-This API provides the following extensions:
-- idempotent batch creation
+This API provides the following extensions:
+- idempotent batch creation
 
-Note: This API is currently under active development and may undergo changes."
+Note: This API is currently under active development and may undergo changes."
 sidebar_label: Batches
 title: Batches
 ---
@@ -18,14 +18,14 @@ title: Batches
 ## Overview
 
 The Batches API enables efficient processing of multiple requests in a single operation,
-particularly useful for processing large datasets, batch evaluation workflows, and
-cost-effective inference at scale.
+particularly useful for processing large datasets, batch evaluation workflows, and
+cost-effective inference at scale.
 
-The API is designed to allow use of openai client libraries for seamless integration.
+The API is designed to allow use of openai client libraries for seamless integration.
 
-This API provides the following extensions:
-- idempotent batch creation
+This API provides the following extensions:
+- idempotent batch creation
 
-Note: This API is currently under active development and may undergo changes.
+Note: This API is currently under active development and may undergo changes.
 
 This section contains documentation for all available providers for the **batches** API.
```

docs/docs/providers/eval/index.mdx

Lines changed: 2 additions & 2 deletions

```diff
@@ -1,7 +1,7 @@
 ---
 description: "Evaluations
 
-Llama Stack Evaluation API for running evaluations on model and agent candidates."
+Llama Stack Evaluation API for running evaluations on model and agent candidates."
 sidebar_label: Eval
 title: Eval
 ---
@@ -12,6 +12,6 @@ title: Eval
 
 Evaluations
 
-Llama Stack Evaluation API for running evaluations on model and agent candidates.
+Llama Stack Evaluation API for running evaluations on model and agent candidates.
 
 This section contains documentation for all available providers for the **eval** API.
```

docs/docs/providers/files/index.mdx

Lines changed: 2 additions & 2 deletions

```diff
@@ -1,7 +1,7 @@
 ---
 description: "Files
 
-This API is used to upload documents that can be used with other Llama Stack APIs."
+This API is used to upload documents that can be used with other Llama Stack APIs."
 sidebar_label: Files
 title: Files
 ---
@@ -12,6 +12,6 @@ title: Files
 
 Files
 
-This API is used to upload documents that can be used with other Llama Stack APIs.
+This API is used to upload documents that can be used with other Llama Stack APIs.
 
 This section contains documentation for all available providers for the **files** API.
```
Lines changed: 8 additions & 8 deletions

```diff
@@ -1,11 +1,11 @@
 ---
 description: "Inference
 
-Llama Stack Inference API for generating completions, chat completions, and embeddings.
+Llama Stack Inference API for generating completions, chat completions, and embeddings.
 
-This API provides the raw interface to the underlying models. Two kinds of models are supported:
-- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.
-- Embedding models: these models generate embeddings to be used for semantic search."
+This API provides the raw interface to the underlying models. Two kinds of models are supported:
+- LLM models: these models generate \"raw\" and \"chat\" (conversational) completions.
+- Embedding models: these models generate embeddings to be used for semantic search."
 sidebar_label: Inference
 title: Inference
 ---
@@ -16,10 +16,10 @@ title: Inference
 
 Inference
 
-Llama Stack Inference API for generating completions, chat completions, and embeddings.
+Llama Stack Inference API for generating completions, chat completions, and embeddings.
 
-This API provides the raw interface to the underlying models. Two kinds of models are supported:
-- LLM models: these models generate "raw" and "chat" (conversational) completions.
-- Embedding models: these models generate embeddings to be used for semantic search.
+This API provides the raw interface to the underlying models. Two kinds of models are supported:
+- LLM models: these models generate "raw" and "chat" (conversational) completions.
+- Embedding models: these models generate embeddings to be used for semantic search.
 
 This section contains documentation for all available providers for the **inference** API.
```
Lines changed: 48 additions & 0 deletions

---
description: |
  Oracle Cloud Infrastructure (OCI) Generative AI inference provider for accessing OCI's Generative AI Platform-as-a-Service models.
  Provider documentation
  https://docs.oracle.com/en-us/iaas/Content/generative-ai/home.htm
sidebar_label: Remote - Oci
title: remote::oci
---

# remote::oci

## Description

Oracle Cloud Infrastructure (OCI) Generative AI inference provider for accessing OCI's Generative AI Platform-as-a-Service models.
Provider documentation
https://docs.oracle.com/en-us/iaas/Content/generative-ai/home.htm

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `oci_auth_type` | `<class 'str'>` | No | instance_principal | OCI authentication type (must be one of: instance_principal, config_file) |
| `oci_config_file_path` | `<class 'str'>` | No | ~/.oci/config | OCI config file path (required if oci_auth_type is config_file) |
| `oci_config_profile` | `<class 'str'>` | No | DEFAULT | OCI config profile (required if oci_auth_type is config_file) |
| `oci_region` | `str \| None` | No | | OCI region (e.g., us-ashburn-1) |
| `oci_compartment_id` | `str \| None` | No | | OCI compartment ID for the Generative AI service |
| `oci_user_ocid` | `str \| None` | No | | OCI user OCID for authentication |
| `oci_tenancy_ocid` | `str \| None` | No | | OCI tenancy OCID for authentication |
| `oci_fingerprint` | `str \| None` | No | | OCI API key fingerprint for authentication |
| `oci_private_key` | `str \| None` | No | | OCI private key for authentication |
| `oci_serving_mode` | `<class 'str'>` | No | ON_DEMAND | OCI serving mode (must be one of: ON_DEMAND, DEDICATED) |

## Sample Configuration

```yaml
oci_auth_type: ${env.OCI_AUTH_TYPE:=instance_principal}
oci_config_file_path: ${env.OCI_CONFIG_FILE_PATH:=~/.oci/config}
oci_config_profile: ${env.OCI_CLI_PROFILE:=DEFAULT}
oci_region: ${env.OCI_REGION:=us-ashburn-1}
oci_compartment_id: ${env.OCI_COMPARTMENT_OCID:=}
oci_serving_mode: ${env.OCI_SERVING_MODE:=ON_DEMAND}
oci_user_ocid: ${env.OCI_USER_OCID:=}
oci_tenancy_ocid: ${env.OCI_TENANCY_OCID:=}
oci_fingerprint: ${env.OCI_FINGERPRINT:=}
oci_private_key: ${env.OCI_PRIVATE_KEY:=}
```

docs/docs/providers/safety/index.mdx

Lines changed: 2 additions & 2 deletions

```diff
@@ -1,7 +1,7 @@
 ---
 description: "Safety
 
-OpenAI-compatible Moderations API."
+OpenAI-compatible Moderations API."
 sidebar_label: Safety
 title: Safety
 ---
@@ -12,6 +12,6 @@ title: Safety
 
 Safety
 
-OpenAI-compatible Moderations API.
+OpenAI-compatible Moderations API.
 
 This section contains documentation for all available providers for the **safety** API.
```
Lines changed: 9 additions & 0 deletions

```python
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the terms described in the LICENSE file in
# the root directory of this source tree.

from .oci import get_distribution_template

__all__ = ["get_distribution_template"]
```
Lines changed: 34 additions & 0 deletions

```yaml
version: 2
distribution_spec:
  description: Use Oracle Cloud Infrastructure (OCI) Generative AI for running LLM
    inference with scalable cloud services
  providers:
    inference:
    - provider_type: remote::oci
    vector_io:
    - provider_type: inline::faiss
    - provider_type: remote::chromadb
    - provider_type: remote::pgvector
    safety:
    - provider_type: inline::llama-guard
    agents:
    - provider_type: inline::meta-reference
    eval:
    - provider_type: inline::meta-reference
    datasetio:
    - provider_type: remote::huggingface
    - provider_type: inline::localfs
    scoring:
    - provider_type: inline::basic
    - provider_type: inline::llm-as-judge
    - provider_type: inline::braintrust
    tool_runtime:
    - provider_type: remote::brave-search
    - provider_type: remote::tavily-search
    - provider_type: remote::model-context-protocol
    files:
    - provider_type: inline::localfs
image_type: venv
additional_pip_packages:
- aiosqlite
- sqlalchemy[asyncio]
```
