
Conversation

@wdhongtw
Collaborator

@wdhongtw wdhongtw commented Dec 4, 2025

Description

Avoid installing CUDA-related packages.

  • Use PyTorch CPU version so we avoid installing CUDA.

This modification preserves functionality and reduces the image size by about 7.7 GB.

wdhongtw/vllm-tpu   latest   d055fd2151a0   22 minutes ago      11.8GB
wdhongtw/vllm-tpu   base     07dbf76dbed8   About an hour ago   19.5GB

See the official documentation for the recommended way to install the CPU version of PyTorch.

Diffing the pip list output before and after the change gives:

--- ./cuda.txt  2025-12-10 14:22:24.124896422 +0000
+++ ./cpu.txt   2025-12-10 14:24:50.718893457 +0000
@@ -101,15 +100,0 @@
-nvidia-cublas-cu12                 12.8.4.1
-nvidia-cuda-cupti-cu12             12.8.90
-nvidia-cuda-nvrtc-cu12             12.8.93
-nvidia-cuda-runtime-cu12           12.8.90
-nvidia-cudnn-cu12                  9.10.2.21
-nvidia-cufft-cu12                  11.3.3.83
-nvidia-cufile-cu12                 1.13.1.3
-nvidia-curand-cu12                 10.3.9.90
-nvidia-cusolver-cu12               11.7.3.90
-nvidia-cusparse-cu12               12.5.8.93
-nvidia-cusparselt-cu12             0.7.1
-nvidia-nccl-cu12                   2.27.3
-nvidia-nvjitlink-cu12              12.8.93
-nvidia-nvshmem-cu12                3.3.20
-nvidia-nvtx-cu12                   12.8.90
@@ -197 +182 @@
-torch                              2.8.0
+torch                              2.8.0+cpu
@@ -200 +185 @@
-torchvision                        0.23.0
+torchvision                        0.23.0+cpu
@@ -206 +191 @@
-triton                             3.4.0
+triton                             3.5.1

With --extra-index-url, pip can now see both the CUDA version and the CPU version,
and the CPU version has higher priority.

  • PyTorch decided to make the "canonical" torch package on the Linux platform the CUDA-ready one,
    and publishes the CPU-ready build on https://download.pytorch.org/whl/cpu.
    (This is not the case on macOS and Windows.)
  • When using --index-url, pip will install the +cpu version, since the index URL is then
    the only search space for that pip invocation. The +... suffix is called the local version identifier.
  • We use --extra-index-url here so that we can install the +cpu builds of torch and torchvision
    while installing the other packages in the requirements.txt file in one pip invocation.
    torch and torchvision come from https://download.pytorch.org/whl/cpu, and all other packages
    come from PyPI directly.
  • Now pip can see PyTorch 2.8.0 and 2.8.0+cpu at the same time, but because the version with
    a local version tag has higher priority, 2.8.0+cpu is installed.

From PEP 440:

Additionally a local version with a great number of segments will always compare as greater than a local version with fewer segments
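The ordering described above can be sketched with a small version key (a simplified model of PEP 440 ordering for this specific case, not pip's actual resolver; real resolvers also handle pre/post/dev releases and numeric local segments):

```python
def version_key(version: str):
    """Simplified PEP 440-style sort key: release tuple, then local segments.

    Covers only plain releases like "2.8.0" with an optional local version
    identifier like "+cpu". An empty local-segment tuple sorts below any
    non-empty one, so "2.8.0+cpu" outranks "2.8.0", and a local version
    with more segments outranks one with fewer.
    """
    base, _, local = version.partition("+")
    release = tuple(int(part) for part in base.split("."))
    local_segments = tuple(local.split(".")) if local else ()
    return (release, local_segments)

# What pip effectively sees across both indexes for the same release:
candidates = ["2.8.0", "2.8.0+cpu"]
print(max(candidates, key=version_key))  # -> 2.8.0+cpu
```

This also reproduces the segment-count rule quoted from PEP 440: `version_key("1.0+foo.bar") > version_key("1.0+foo")`.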

Tests

Build the image and run benchmarking in the container.

Checklist

Before submitting this PR, please make sure:

  • I have performed a self-review of my code.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have made or will make corresponding changes to any relevant documentation.

Collaborator

@QiliangCui QiliangCui left a comment


avoid installing CUDA will be great!!

Can we do this change after #1245 so that we can have a cleaner base to diff?

@wdhongtw
Collaborator Author

wdhongtw commented Dec 8, 2025

avoid installing CUDA will be great!!

Can we do this change after #1245 so that we can have a cleaner base to diff?

If I remember correctly, GitHub updates the diff in related PRs automatically once some of their commits already exist in the target branch.

Let's just wait for the other PR to be merged first. :D

@wdhongtw
Collaborator Author

avoid installing CUDA will be great!!

Can we do this change after #1245 so that we can have a cleaner base to diff?

Seems my understanding was wrong: since the repo enables linear history, it results in a conflicted state instead.
I rebased again and the diff looks correct now.

@QiliangCui

@QiliangCui QiliangCui added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 12, 2025
@wdhongtw wdhongtw force-pushed the avoid-cuda branch 2 times, most recently from e1d7815 to 48f2560 on December 12, 2025 at 07:47
- Use PyTorch CPU version so we avoid installing CUDA.

Signed-off-by: Weida Hong <[email protected]>
@kyuyeunk
Collaborator

Fixes #921

@wdhongtw
Collaborator Author

wdhongtw commented Dec 13, 2025

No, this change does not fix #921, which requires the vllm-tpu package on PyPI to use the torch==...+cpu version.

It's by design that there is no way to propagate index URL information through the standard Python package metadata *1, so even if we specify a torch==...+cpu dependency in the metadata, users still need to pass --extra-index-url when installing vllm-tpu so that the pip toolchain can find torch==...+cpu.

And for the existing version (before this PR), users can install vllm-tpu with the CPU version of torch via:

uv pip install vllm-tpu --extra-index-url https://download.pytorch.org/whl/cpu --index-strategy unsafe-best-match

--index-strategy unsafe-best-match is required for some reason; I have not dug into why yet. (It is likely because uv's default first-index strategy only considers the first index that provides a given package.)

*1: I may be wrong about this claim; maybe we need another experienced engineer to validate this conclusion.


To solve #921 completely, we probably need to push PyTorch to release something like a torch-cpu package on PyPI directly.
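As a rough illustration of that metadata limitation (a deliberately simplified parser, assuming plain `name==version` requirement strings, not a full PEP 508 implementation):

```python
def parse_requirement(spec: str) -> dict:
    """Parse a plain "name==version" requirement string.

    Standard metadata (Requires-Dist) carries the package name and a
    version constraint, but has no field for an index URL. So even if
    vllm-tpu pins torch==2.8.0+cpu, "where to find that build" cannot
    be expressed here and must come from the installer command line.
    """
    name, _, version = spec.partition("==")
    return {"name": name.strip(), "version": version.strip()}

req = parse_requirement("torch==2.8.0+cpu")
print(req)  # -> {'name': 'torch', 'version': '2.8.0+cpu'}
# Note: no "index_url" key exists; the user supplies it separately,
# e.g. via --extra-index-url https://download.pytorch.org/whl/cpu
```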

@kyuyeunk
Collaborator

Ah got it. Thanks for clarifying it!
