Add workflow to build vLLM-TPU wheel using PyPI tpu-inference #1241

ylangtsou · 2025-12-04T03:22:45Z

Description

Adds a new workflow that builds and tests vllm-tpu using the nightly tpu-inference package from PyPI instead of building it from source. The workflow verifies that the wheel builds and installs successfully and that vllm serve starts up correctly in an E2E environment.

Key Changes:
(New) pipeline_pypi.yml:
Added a 20-minute delay step before the benchmark tests to allow time for the tpu-inference package to be published, since the two schedules currently overlap.

(New) build_vllm_tpu.sh:
Builds the vLLM-TPU wheel by automating the cloning and patching process.
Usage: ./build_vllm_tpu.sh [vllm-branch-or-tag]

(New) Dockerfile.pypi:
Builds the vllm-tpu wheel using the nightly tpu-inference version inside the container and installs it for benchmark testing.

(Updated) setup_docker_env.sh:
Added logic to switch DOCKERFILE_NAME to Dockerfile.pypi when the RUN_WITH_PYPI environment variable is set to true.

(New) run_with_pypi.sh:
New entry point script that sets RUN_WITH_PYPI="true" and calls run_in_docker.sh.
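
The RUN_WITH_PYPI switch described above could look roughly like the following. This is a minimal sketch, not the code from the PR: the helper name select_dockerfile and the default name Dockerfile are assumptions made for illustration. Only RUN_WITH_PYPI, DOCKERFILE_NAME, and Dockerfile.pypi come from the description.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the Dockerfile selection logic.
# Assumptions: the function name and the default "Dockerfile" are illustrative only.

# Print the Dockerfile name to use.
# $1 (optional): default Dockerfile name when RUN_WITH_PYPI is not "true".
select_dockerfile() {
  if [ "${RUN_WITH_PYPI:-false}" = "true" ]; then
    # PyPI mode: build the wheel against the nightly tpu-inference package.
    echo "Dockerfile.pypi"
  else
    # Any other value (or unset) keeps the default Dockerfile.
    echo "${1:-Dockerfile}"
  fi
}

DOCKERFILE_NAME="$(select_dockerfile)"
echo "DOCKERFILE_NAME=${DOCKERFILE_NAME}"
```

Under this assumption, an entry point such as run_with_pypi.sh only needs to export RUN_WITH_PYPI="true" before calling run_in_docker.sh. Every other caller keeps the default behavior.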

Tests

Tested on Buildkite.

Checklist

Before submitting this PR, please make sure:

I have performed a self-review of my code.
I have necessary comments in my code, particularly in hard-to-understand areas.

.buildkite/scripts/run_with_pypi.sh

docker/Dockerfile.pypi

.buildkite/pipeline_pypi.yml

Signed-off-by: Ylang Tsou <[email protected]>

dennisYehCienet · 2025-12-12T09:17:14Z

I think it's pretty good. Just a few things I want to mention.

It seems like you haven't included the main.yml procedure in your test.
Both steps depend on the record_verified_commit_hashes step, but your current Buildkite tests do not include it.
Please use main.yml to run a few test cases similar to our normal Buildkite procedure.

Also, please provide two test runs:
one that successfully pulls from PyPI and runs, and one that fails after 20 minutes due to the timeout.
Please add the links to both test runs to the PR description. You can then remove some of the testing code from the PR so that it is ready for formal review.
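
For illustration, the two cases requested above (success, and failure once the timeout expires) can be reproduced with a small polling helper. This is a sketch under stated assumptions: the function wait_for and its arguments are hypothetical and are not taken from the PR. The check command is a placeholder for whatever the pipeline uses to confirm that the nightly tpu-inference package is available.

```shell
#!/usr/bin/env bash
# Hypothetical polling helper; names and arguments are illustrative only.
# Usage: wait_for CHECK_CMD TIMEOUT_SECONDS INTERVAL_SECONDS
# Returns 0 as soon as CHECK_CMD succeeds, 1 once TIMEOUT_SECONDS have elapsed.
wait_for() {
  check_cmd="$1"
  timeout_s="$2"
  interval_s="$3"
  elapsed=0
  # Re-run the check until it succeeds or the time budget is used up.
  while ! sh -c "$check_cmd" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      echo "timed out after ${elapsed}s" >&2
      return 1
    fi
    sleep "$interval_s"
    elapsed=$((elapsed + interval_s))
  done
  echo "ready after ${elapsed}s"
}

# Success case: the check passes immediately (placeholder command "true").
wait_for "true" 1200 60

# Failure case: the check never passes; a 2-second budget stands in for 20 minutes.
wait_for "false" 2 1 || echo "timeout path taken"
```

In a real run the timeout would be 1200 seconds to match the 20-minute limit. The short budget here only keeps the failure case quick to demonstrate.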

Please update the PR description as well, as some information might be outdated. We don't need to mention every file's changes; a brief summary of the PR is sufficient.

ylangtsou requested review from QiliangCui, jcyang43 and jrplatin as code owners December 4, 2025 03:22

dennisYehCienet removed request for QiliangCui and jrplatin December 4, 2025 03:26

dennisYehCienet added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Dec 4, 2025

dennisYehCienet changed the title ~~Verify vllm-tpu python package~~ Verify vllm-tpu python package (draft) Dec 4, 2025

ylangtsou requested a review from vipannalla as a code owner December 4, 2025 08:57

dennisYehCienet marked this pull request as draft December 5, 2025 01:41

ylangtsou force-pushed the ylangt/run_with_pypi branch from 3c1e18f to 87a31c2 Compare December 8, 2025 01:44

CienetStingLin added and removed the "bug" label (Something isn't working) Dec 8, 2025