Commit 6bd3742 — Merge pull request #632 from sargam-modak/main

Added Readme for gpt-oss MD

Parents: 60d9d79 + 649e2f5

File tree: 5 files changed (+26 −1 lines)


ai-quick-actions/ai-quick-actions-containers.md

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
  | Server | Version | Supported Formats | Supported Shapes | Supported Models/Architectures |
  |--------|---------|-------------------|------------------|--------------------------------|
- | [vLLM](https://github.com/vllm-project/vllm/releases/tag/v0.9.1) | 0.9.1 | safe-tensors | A10, A100, H100 | [v0.9.1 supported models](https://docs.vllm.ai/en/v0.9.1/models/supported_models.html) |
+ | [vLLM](https://github.com/vllm-project/vllm/releases/tag/v0.10.1) | 0.10.1 | safe-tensors | A10, A100, H100 | [v0.10.1 supported models](https://docs.vllm.ai/en/v0.10.1/models/supported_models.html) |
  | [Text Generation Inference (TGI)](https://github.com/huggingface/text-generation-inference/releases/tag/v3.2.1) | 3.2.1 | safe-tensors | A10, A100, H100 | [v3.2.1 supported models](https://github.com/huggingface/text-generation-inference/blob/v3.2.1/docs/source/supported_models.md) |
  | [Llama-cpp](https://github.com/abetlen/llama-cpp-python/releases/tag/v0.3.7) | 0.3.7 | gguf | Ampere ARM | [v0.3.7 supported models](https://github.com/abetlen/llama-cpp-python/tree/v0.3.7) |

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
# Model Deployment - GPT-OSS

OpenAI has announced the release of [two open-weight models](https://openai.com/index/introducing-gpt-oss/), gpt-oss-120b and gpt-oss-20b, its first open-weight language models since GPT-2. According to OpenAI, their performance is on par with, or exceeds, that of OpenAI's internal models, and both models perform strongly on tool use, few-shot function calling, CoT reasoning, and HealthBench.

Here are the new OpenAI open-weight models:

* gpt-oss-120b — designed for production, general-purpose, and high-reasoning use cases. The model has 117B parameters with 5.1B active parameters.
* gpt-oss-20b — designed for lower-latency, local, or specialized use cases. The model has 21B parameters with 3.6B active parameters.

Both models are now available in OCI Data Science AI Quick Actions. The models are cached in our service and are ready to be deployed and fine-tuned, without customers needing to bring in model artifacts from external sites. By using AI Quick Actions, customers can leverage our service-managed container with the latest vLLM version that supports both models, eliminating the need to build or bring your own container to work with them.

![Deploy Model](web_assets/openai_modelcard.png)

![GPT-OSS-20b Model card](web_assets/model-deploy-gptoss.png)
## Deploying an LLM

After picking a model from the model explorer, if "Deploy Model" is enabled you can use this form to quickly deploy the model:

![Deploy Model](web_assets/model-deploy-gptoss-2.png)

To learn more about model deployments, refer to the [Model Deployment Tips](model-deployment-tips.md) page.
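Since the service-managed container serves the model through vLLM, a deployed endpoint speaks the OpenAI-compatible chat completions API. Below is a minimal, hedged sketch of building such a request body in Python; the model identifier, endpoint URL, and auth handling shown in the comments are illustrative assumptions, not values taken from this page:

```python
import json

def build_chat_payload(model: str, prompt: str, max_tokens: int = 256) -> dict:
    # Build an OpenAI-compatible /v1/chat/completions request body,
    # the format vLLM-served deployments accept. The model name passed
    # in by the caller is illustrative, not prescribed by this page.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_payload("openai/gpt-oss-20b", "Say hello in one sentence.")
body = json.dumps(payload)
# POST `body` to the deployment's /v1/chat/completions endpoint using your
# OCI request signer; `endpoint` and `signer` below are placeholders:
# requests.post(f"{endpoint}/v1/chat/completions", data=body, auth=signer,
#               headers={"Content-Type": "application/json"})
```

The payload shape is the same for both gpt-oss models; only the `model` field changes between deployments.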