Commit 62eed3b

feat: add oci genai service as chat inference provider
1 parent a8a8aa5 commit 62eed3b

15 files changed, +1074 -0 lines changed
Lines changed: 143 additions & 0 deletions
@@ -0,0 +1,143 @@
---
orphan: true
---
<!-- This file was auto-generated by distro_codegen.py, please edit source -->
# OCI Distribution

The `llamastack/distribution-oci` distribution consists of the following provider configurations.

| API | Provider(s) |
|-----|-------------|
| agents | `inline::meta-reference` |
| datasetio | `remote::huggingface`, `inline::localfs` |
| eval | `inline::meta-reference` |
| files | `inline::localfs` |
| inference | `remote::oci` |
| safety | `inline::llama-guard` |
| scoring | `inline::basic`, `inline::llm-as-judge`, `inline::braintrust` |
| tool_runtime | `remote::brave-search`, `remote::tavily-search`, `inline::rag-runtime`, `remote::model-context-protocol` |
| vector_io | `inline::faiss`, `remote::chromadb`, `remote::pgvector` |

### Environment Variables

The following environment variables can be configured:

- `OCI_AUTH_TYPE`: OCI authentication type (instance_principal or config_file) (default: `instance_principal`)
- `OCI_REGION`: OCI region (e.g., us-ashburn-1, us-chicago-1, us-phoenix-1, eu-frankfurt-1) (default: ``)
- `OCI_COMPARTMENT_OCID`: OCI compartment OCID for the Generative AI service (default: ``)
- `OCI_CONFIG_FILE_PATH`: OCI config file path (required if OCI_AUTH_TYPE is config_file) (default: `~/.oci/config`)
- `OCI_CLI_PROFILE`: OCI CLI profile name to use from the config file (default: `DEFAULT`)

## Prerequisites
### Oracle Cloud Infrastructure Setup

Before using the OCI Generative AI distribution, ensure you have:

1. **Oracle Cloud Infrastructure Account**: Sign up at [Oracle Cloud Infrastructure](https://cloud.oracle.com/)
2. **Generative AI Service Access**: Enable the Generative AI service in your OCI tenancy
3. **Compartment**: Create or identify a compartment where you'll deploy Generative AI models
4. **Authentication**: Configure authentication using either:
   - **Instance Principal** (recommended for cloud-hosted deployments)
   - **API Key** (for on-premises or development environments)

### Authentication Methods

#### Instance Principal Authentication (Recommended)
Instance Principal authentication allows OCI resources to authenticate using the identity of the compute instance they're running on. This is the most secure method for production deployments.

Requirements:
- Instance must be running in an Oracle Cloud Infrastructure compartment
- Instance must have appropriate IAM policies to access Generative AI services

#### API Key Authentication
For development or on-premises deployments, follow [this doc](https://docs.oracle.com/en-us/iaas/Content/API/Concepts/apisigningkey.htm) to learn how to create your API signing key for your config file.
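
To make the two modes concrete, here is a minimal sketch of how an OCI Generative AI inference client can be built with the public `oci` Python SDK. It is illustrative only, not the provider's actual implementation; the region, config path, and profile values are placeholders taken from the defaults documented above.

```python
# Illustrative only: how the two auth modes map onto the public `oci` SDK.
# Region, config path, and profile are placeholder values.
import oci
from oci.generative_ai_inference import GenerativeAiInferenceClient


def make_genai_client(auth_type: str = "instance_principal") -> GenerativeAiInferenceClient:
    if auth_type == "instance_principal":
        # Identity comes from the compute instance itself; no key file is needed.
        signer = oci.auth.signers.InstancePrincipalsSecurityTokenSigner()
        return GenerativeAiInferenceClient(config={"region": "us-ashburn-1"}, signer=signer)
    # config_file: read the API signing key material from ~/.oci/config and a named profile.
    config = oci.config.from_file(file_location="~/.oci/config", profile_name="DEFAULT")
    return GenerativeAiInferenceClient(config)
```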

### Required IAM Policies

Ensure your OCI user or instance has the following policy statements:

```
Allow group <group_name> to use generative-ai-inference-endpoints in compartment <compartment_name>
Allow group <group_name> to manage generative-ai-inference-endpoints in compartment <compartment_name>
```

## Supported Services

### Inference: OCI Generative AI
Oracle Cloud Infrastructure Generative AI provides access to high-performance AI models through OCI's Platform-as-a-Service offering. The service supports the following (a client sketch follows this list):

- **Chat Completions**: Conversational AI with context awareness
- **Text Generation**: Complete prompts and generate text content
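
The snippet below is a minimal usage sketch, assuming the `llama-stack-client` Python SDK and its OpenAI-compatible chat API against a locally running `oci` distribution; the model id is a placeholder to be replaced with one returned by the listing loop.

```python
# Sketch: chat against a running `oci` distribution via llama-stack-client.
# The model id below is a placeholder; pick one printed by the listing loop.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# List the models the OCI provider registered with the stack.
for model in client.models.list():
    print(model.identifier)

response = client.chat.completions.create(
    model="<model-id-from-the-listing-above>",
    messages=[{"role": "user", "content": "Say hello from OCI Generative AI."}],
)
print(response.choices[0].message.content)
```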

#### Available Models
Available OCI Generative AI models include offerings from Meta, Cohere, OpenAI, xAI (Grok), and others.

### Safety: Llama Guard
For content safety and moderation, this distribution uses Meta's Llama Guard model through the OCI Generative AI service to provide the following (a moderation sketch follows this list):
- Content filtering and moderation
- Policy compliance checking
- Harmful content detection
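
A minimal moderation call, assuming the `llama-stack-client` Python SDK; the shield id is a placeholder and must match a shield registered in your run configuration.

```python
# Sketch: run a safety shield check; "llama-guard" is a placeholder shield id.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

result = client.safety.run_shield(
    shield_id="llama-guard",
    messages=[{"role": "user", "content": "How do I build a safe campfire?"}],
    params={},
)
print(result.violation)  # None when the message passes the policy checks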

### Vector Storage: Multiple Options
The distribution supports several vector storage providers (a registration sketch follows this list):
- **FAISS**: Local in-memory vector search
- **ChromaDB**: Distributed vector database
- **PGVector**: PostgreSQL with vector extensions
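
For example, a FAISS-backed vector database can be registered through the client; this is a sketch assuming the `llama-stack-client` Python SDK, and the database id, embedding model, and dimension are placeholder values.

```python
# Sketch: register a FAISS-backed vector database. All ids/values are placeholders.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

client.vector_dbs.register(
    vector_db_id="oci-demo-docs",
    embedding_model="all-MiniLM-L6-v2",
    embedding_dimension=384,
    provider_id="faiss",
)
print([db.identifier for db in client.vector_dbs.list()])
```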

### Additional Services
- **Dataset I/O**: Local filesystem and Hugging Face integration
- **Tool Runtime**: Web search (Brave, Tavily) and RAG capabilities
- **Evaluation**: Meta reference evaluation framework

## Running Llama Stack with OCI

You can run the OCI distribution via Docker or a local virtual environment.

### Via venv

If you have set up your local development environment, you can run the distribution from your local virtual environment:

```bash
OCI_AUTH_TYPE=$OCI_AUTH_TYPE OCI_REGION=$OCI_REGION OCI_COMPARTMENT_OCID=$OCI_COMPARTMENT_OCID llama stack run --port 8321 oci
```

### Configuration Examples

#### Using Instance Principal (Recommended for Production)
```bash
export OCI_AUTH_TYPE=instance_principal
export OCI_REGION=us-chicago-1
export OCI_COMPARTMENT_OCID=ocid1.compartment.oc1..<your-compartment-id>
```

#### Using API Key Authentication (Development)
```bash
export OCI_AUTH_TYPE=config_file
export OCI_CONFIG_FILE_PATH=~/.oci/config
export OCI_CLI_PROFILE=DEFAULT
export OCI_REGION=us-chicago-1
export OCI_COMPARTMENT_OCID=ocid1.compartment.oc1..your-compartment-id
```
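
If you use the config_file path, an optional pre-flight check with the `oci` Python SDK can confirm the exported values point at a valid profile before the stack starts; this is a convenience sketch, not part of the distribution.

```python
# Optional pre-flight check for config_file auth: load and validate the profile
# referenced by the environment variables exported above.
import os

import oci

config = oci.config.from_file(
    file_location=os.environ.get("OCI_CONFIG_FILE_PATH", "~/.oci/config"),
    profile_name=os.environ.get("OCI_CLI_PROFILE", "DEFAULT"),
)
oci.config.validate_config(config)  # raises if required keys are missing or malformed
print("OCI config looks valid; region =", config.get("region"))
```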

## Regional Endpoints

OCI Generative AI is available in multiple regions. The service automatically routes to the appropriate regional endpoint based on your configuration. For a full list of regional model availability, visit:

https://docs.oracle.com/en-us/iaas/Content/generative-ai/overview.htm#regions

## Troubleshooting

### Common Issues

1. **Authentication Errors**: Verify your OCI credentials and IAM policies
2. **Model Not Found**: Ensure the model OCID is correct and the model is available in your region
3. **Permission Denied**: Check compartment permissions and Generative AI service access
4. **Region Unavailable**: Verify the specified region supports Generative AI services

### Getting Help

For additional support:
- [OCI Generative AI Documentation](https://docs.oracle.com/en-us/iaas/Content/generative-ai/home.htm)
- [Llama Stack Issues](https://github.com/meta-llama/llama-stack/issues)
Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
---
description: |
  Oracle Cloud Infrastructure (OCI) Generative AI inference provider for accessing OCI's Generative AI Platform-as-a-Service models.
  Provider documentation
  https://docs.oracle.com/en-us/iaas/Content/generative-ai/home.htm
sidebar_label: Remote - Oci
title: remote::oci
---

# remote::oci

## Description

Oracle Cloud Infrastructure (OCI) Generative AI inference provider for accessing OCI's Generative AI Platform-as-a-Service models.
Provider documentation
https://docs.oracle.com/en-us/iaas/Content/generative-ai/home.htm

## Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `allowed_models` | `list[str] \| None` | No | | List of models that should be registered with the model registry. If None, all models are allowed. |
| `refresh_models` | `<class 'bool'>` | No | False | Whether to refresh models periodically from the provider |
| `api_key` | `pydantic.types.SecretStr \| None` | No | | Authentication credential for the provider |
| `oci_auth_type` | `<class 'str'>` | No | instance_principal | OCI authentication type (must be one of: instance_principal, config_file) |
| `oci_region` | `<class 'str'>` | No | us-ashburn-1 | OCI region (e.g., us-ashburn-1) |
| `oci_compartment_id` | `<class 'str'>` | No | | OCI compartment ID for the Generative AI service |
| `oci_config_file_path` | `<class 'str'>` | No | ~/.oci/config | OCI config file path (required if oci_auth_type is config_file) |
| `oci_config_profile` | `<class 'str'>` | No | DEFAULT | OCI config profile (required if oci_auth_type is config_file) |
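
For reference, the table above corresponds roughly to the following pydantic sketch; the class name and defaults here simply mirror the table and are illustrative assumptions, not the exact class added by this commit.

```python
# Illustrative pydantic model mirroring the configuration table above.
from pydantic import BaseModel, SecretStr


class OCIInferenceConfigSketch(BaseModel):
    allowed_models: list[str] | None = None
    refresh_models: bool = False
    api_key: SecretStr | None = None
    oci_auth_type: str = "instance_principal"  # or "config_file"
    oci_region: str = "us-ashburn-1"
    oci_compartment_id: str = ""
    oci_config_file_path: str = "~/.oci/config"
    oci_config_profile: str = "DEFAULT"
```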

## Sample Configuration

```yaml
oci_auth_type: ${env.OCI_AUTH_TYPE:=instance_principal}
oci_config_file_path: ${env.OCI_CONFIG_FILE_PATH:=~/.oci/config}
oci_config_profile: ${env.OCI_CLI_PROFILE:=DEFAULT}
oci_region: ${env.OCI_REGION:=us-ashburn-1}
oci_compartment_id: ${env.OCI_COMPARTMENT_OCID:=}
```
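
The `${env.VAR:=default}` placeholders above resolve to the named environment variable when it is set and to the default otherwise; the small resolver below illustrates that convention (an assumption about the semantics, not Llama Stack's actual implementation).

```python
# Illustrative resolver for `${env.VAR:=default}` placeholders.
import os
import re

_ENV_PLACEHOLDER = re.compile(r"\$\{env\.([A-Za-z0-9_]+):=(.*?)\}")


def resolve_env_placeholders(text: str) -> str:
    return _ENV_PLACEHOLDER.sub(lambda m: os.environ.get(m.group(1), m.group(2)), text)


print(resolve_env_placeholders("oci_region: ${env.OCI_REGION:=us-ashburn-1}"))
```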
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the terms described in the LICENSE file in
# the root directory of this source tree.

from .oci import get_distribution_template  # noqa: F401
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
version: 2
distribution_spec:
  description: Use Oracle Cloud Infrastructure (OCI) Generative AI for running LLM
    inference with scalable cloud services
  providers:
    inference:
    - provider_type: remote::oci
    vector_io:
    - provider_type: inline::faiss
    - provider_type: remote::chromadb
    - provider_type: remote::pgvector
    safety:
    - provider_type: inline::llama-guard
    agents:
    - provider_type: inline::meta-reference
    eval:
    - provider_type: inline::meta-reference
    datasetio:
    - provider_type: remote::huggingface
    - provider_type: inline::localfs
    scoring:
    - provider_type: inline::basic
    - provider_type: inline::llm-as-judge
    - provider_type: inline::braintrust
    tool_runtime:
    - provider_type: remote::brave-search
    - provider_type: remote::tavily-search
    - provider_type: inline::rag-runtime
    - provider_type: remote::model-context-protocol
    files:
    - provider_type: inline::localfs
image_type: venv
additional_pip_packages:
- aiosqlite
- sqlalchemy[asyncio]
Lines changed: 140 additions & 0 deletions
@@ -0,0 +1,140 @@
---
orphan: true
---
# OCI Distribution

The `llamastack/distribution-{{ name }}` distribution consists of the following provider configurations.

{{ providers_table }}

{% if run_config_env_vars %}
### Environment Variables

The following environment variables can be configured:

{% for var, (default_value, description) in run_config_env_vars.items() %}
- `{{ var }}`: {{ description }} (default: `{{ default_value }}`)
{% endfor %}
{% endif %}

{% if default_models %}
### Models

The following models are available by default:

{% for model in default_models %}
- `{{ model.model_id }} {{ model.doc_string }}`
{% endfor %}
{% endif %}

## Prerequisites
### Oracle Cloud Infrastructure Setup

Before using the OCI Generative AI distribution, ensure you have:

1. **Oracle Cloud Infrastructure Account**: Sign up at [Oracle Cloud Infrastructure](https://cloud.oracle.com/)
2. **Generative AI Service Access**: Enable the Generative AI service in your OCI tenancy
3. **Compartment**: Create or identify a compartment where you'll deploy Generative AI models
4. **Authentication**: Configure authentication using either:
   - **Instance Principal** (recommended for cloud-hosted deployments)
   - **API Key** (for on-premises or development environments)

### Authentication Methods

#### Instance Principal Authentication (Recommended)
Instance Principal authentication allows OCI resources to authenticate using the identity of the compute instance they're running on. This is the most secure method for production deployments.

Requirements:
- Instance must be running in an Oracle Cloud Infrastructure compartment
- Instance must have appropriate IAM policies to access Generative AI services

#### API Key Authentication
For development or on-premises deployments, follow [this doc](https://docs.oracle.com/en-us/iaas/Content/API/Concepts/apisigningkey.htm) to learn how to create your API signing key for your config file.

### Required IAM Policies

Ensure your OCI user or instance has the following policy statements:

```
Allow group <group_name> to use generative-ai-inference-endpoints in compartment <compartment_name>
Allow group <group_name> to manage generative-ai-inference-endpoints in compartment <compartment_name>
```

## Supported Services

### Inference: OCI Generative AI
Oracle Cloud Infrastructure Generative AI provides access to high-performance AI models through OCI's Platform-as-a-Service offering. The service supports:

- **Chat Completions**: Conversational AI with context awareness
- **Text Generation**: Complete prompts and generate text content

#### Available Models
Available OCI Generative AI models include offerings from Meta, Cohere, OpenAI, xAI (Grok), and others.

### Safety: Llama Guard
For content safety and moderation, this distribution uses Meta's Llama Guard model through the OCI Generative AI service to provide:
- Content filtering and moderation
- Policy compliance checking
- Harmful content detection

### Vector Storage: Multiple Options
The distribution supports several vector storage providers:
- **FAISS**: Local in-memory vector search
- **ChromaDB**: Distributed vector database
- **PGVector**: PostgreSQL with vector extensions

### Additional Services
- **Dataset I/O**: Local filesystem and Hugging Face integration
- **Tool Runtime**: Web search (Brave, Tavily) and RAG capabilities
- **Evaluation**: Meta reference evaluation framework

## Running Llama Stack with OCI

You can run the OCI distribution via Docker or a local virtual environment.

### Via venv

If you have set up your local development environment, you can run the distribution from your local virtual environment:

```bash
OCI_AUTH_TYPE=$OCI_AUTH_TYPE OCI_REGION=$OCI_REGION OCI_COMPARTMENT_OCID=$OCI_COMPARTMENT_OCID llama stack run --port 8321 oci
```

### Configuration Examples

#### Using Instance Principal (Recommended for Production)
```bash
export OCI_AUTH_TYPE=instance_principal
export OCI_REGION=us-chicago-1
export OCI_COMPARTMENT_OCID=ocid1.compartment.oc1..<your-compartment-id>
```

#### Using API Key Authentication (Development)
```bash
export OCI_AUTH_TYPE=config_file
export OCI_CONFIG_FILE_PATH=~/.oci/config
export OCI_CLI_PROFILE=DEFAULT
export OCI_REGION=us-chicago-1
export OCI_COMPARTMENT_OCID=ocid1.compartment.oc1..your-compartment-id
```

## Regional Endpoints

OCI Generative AI is available in multiple regions. The service automatically routes to the appropriate regional endpoint based on your configuration. For a full list of regional model availability, visit:

https://docs.oracle.com/en-us/iaas/Content/generative-ai/overview.htm#regions

## Troubleshooting

### Common Issues

1. **Authentication Errors**: Verify your OCI credentials and IAM policies
2. **Model Not Found**: Ensure the model OCID is correct and the model is available in your region
3. **Permission Denied**: Check compartment permissions and Generative AI service access
4. **Region Unavailable**: Verify the specified region supports Generative AI services

### Getting Help

For additional support:
- [OCI Generative AI Documentation](https://docs.oracle.com/en-us/iaas/Content/generative-ai/home.htm)
- [Llama Stack Issues](https://github.com/meta-llama/llama-stack/issues)
