GPT OSS model on GCP C4 #3107

jiqing-feng · 2025-09-26T07:25:44Z

This blog post shows that Intel Granite Rapids (C4) delivers both higher performance and better cost efficiency than Sapphire Rapids (C3) for large MoE inference.

* running gpt-oss on Intel Xeon
* add TTFT image
* add _blog.yml
* minor fix
* fix blog
* fix content
* update thumbnail
* update expert parallelism diagram
* fix model name and model link
* fix result image links
* fix script
* update results

Signed-off-by: jiqing-feng <[email protected]>

yao-matrix · 2025-09-30T15:51:56Z

@kding1 @IlyasMoutawwakil, please help review. Thanks very much.

IlyasMoutawwakil · 2025-10-14T08:03:18Z

gpt-oss-on-intel-xeon.md

+## Results
+### Normalized Throughput per vCPU
+Across batch sizes up to 64, Intel Xeon 6 processor‑powered `C4` consistently outperforms `C3` with a 1.4x ~ 1.7× throughput per-vCPU. The formula is:
+
+$$normalized\\_throughput\\_per\\_vCPU = (throughput\\_C4 / vCPUs\\_C4) / (throughput\\_C3 / vCPUs\\_C3)$$
+
+<kbd>
+  <img src="assets/gpt-oss-on-intel-xeon/throughput-gpt-oss-per-vcpu.png">
+</kbd>


Would it make sense to also include the throughput numbers themselves? I understand that normalized throughput has increased, but I can't see from what to what, i.e. what throughput one should expect when deploying on this CPU instance.
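To make the request concrete, here is a minimal Python sketch of the formula in the diff above that returns the per-vCPU figures for each instance alongside the ratio. The function name is hypothetical and the input numbers are placeholders chosen for easy arithmetic, not measurements from the blog.

```python
def normalized_throughput_per_vcpu(throughput_c4, vcpus_c4, throughput_c3, vcpus_c3):
    """Return (C4 per-vCPU throughput, C3 per-vCPU throughput, C4/C3 ratio).

    Follows the formula quoted in the diff:
    (throughput_C4 / vCPUs_C4) / (throughput_C3 / vCPUs_C3)
    """
    if vcpus_c4 <= 0 or vcpus_c3 <= 0:
        raise ValueError("vCPU counts must be positive")
    if throughput_c3 <= 0:
        raise ValueError("C3 throughput must be positive")
    per_vcpu_c4 = throughput_c4 / vcpus_c4
    per_vcpu_c3 = throughput_c3 / vcpus_c3
    return per_vcpu_c4, per_vcpu_c3, per_vcpu_c4 / per_vcpu_c3


# Placeholder inputs only (NOT benchmark results): tokens/s and vCPU counts.
c4, c3, ratio = normalized_throughput_per_vcpu(300.0, 100, 200.0, 100)
print(f"C4: {c4:.2f} tok/s/vCPU, C3: {c3:.2f} tok/s/vCPU, ratio: {ratio:.2f}x")
```

Reporting the two per-vCPU values next to the ratio would show "from what to what", which the ratio alone hides.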

IlyasMoutawwakil

LGTM! I think it would be worth adding un-normalized metrics as well, such as TTFT, TPS, and TPOT.
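As a rough illustration of the un-normalized metrics mentioned here, the sketch below derives TTFT (time to first token), TPOT (time per output token), and TPS (tokens per second) from per-token arrival timestamps. It assumes one common set of definitions; the blog's benchmark script may define them differently. The function name and the timestamps are hypothetical.

```python
def latency_metrics(request_start, token_times):
    """Compute TTFT, TPOT and TPS for a single request.

    request_start: time the request was sent (seconds).
    token_times:   arrival time of each output token (seconds, ascending).
    """
    if not token_times:
        raise ValueError("at least one output token is required")
    n = len(token_times)
    ttft = token_times[0] - request_start       # time to first token
    total = token_times[-1] - request_start     # end-to-end latency
    # Assumed definition: average gap between tokens after the first one.
    tpot = (token_times[-1] - token_times[0]) / (n - 1) if n > 1 else 0.0
    # Assumed definition: output tokens divided by end-to-end latency.
    tps = n / total if total > 0 else float("inf")
    return {"ttft_s": ttft, "tpot_s": tpot, "tps": tps}


# Made-up timestamps for illustration only.
metrics = latency_metrics(0.0, [0.5, 0.75, 1.0])
print(metrics)
```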

jiqing-feng added 6 commits September 10, 2025 08:48

Merge branch 'main' into main

29c3655

update blog

78a6364

Signed-off-by: jiqing-feng <[email protected]>

fix format

e3c4d57

Signed-off-by: jiqing-feng <[email protected]>

Merge branch 'main' into intel-upstream

7f35951

rm GNR powered

09f93bd

Signed-off-by: jiqing-feng <[email protected]>

jiqing-feng marked this pull request as draft September 30, 2025 14:46

jiqing-feng marked this pull request as ready for review September 30, 2025 14:46

yao-matrix added 5 commits September 30, 2025 07:51

Merge branch 'main' into intel-upstream

f3cf5bb

update per Kartik's comments

2d92c12

Update gpt-oss-on-intel-xeon.md

0817882

Update gpt-oss-on-intel-xeon.md

1e4995c

Update gpt-oss-on-intel-xeon.md

3966d62

yao-matrix added 4 commits September 30, 2025 08:52

Update gpt-oss-on-intel-xeon.md

7f44c97

Merge branch 'main' into intel-upstream

258d123

Update gpt-oss-on-intel-xeon.md

0845026

Merge branch 'main' into intel-upstream

c4ac986

IlyasMoutawwakil reviewed Oct 14, 2025

IlyasMoutawwakil approved these changes Oct 14, 2025
