@nv-anants nv-anants commented Aug 19, 2025

Overview:

This PR introduces CI support for running vLLM sanity checks within the GitHub Actions pipeline. It integrates a .github/workflows/container-validation-backends.yml workflow, ensuring that, at a minimum, vLLM end-to-end functionality is verified automatically on every public PR.

This PR also includes build optimizations based on the container strategy DEP, which reduce the GitHub build time for the Dynamo vLLM container from 1 hour to 20 minutes. Build optimizations include:

  • Restructure Dockerfile.vllm to build on the Dynamo base container instead of building Dynamo from source in the vLLM container. This decouples the Dynamo and vLLM builds from each other.
  • Use an sccache S3 bucket to speed up the NIXL, UCX, Dynamo, and vLLM builds.
  • Document the definition and purpose of each stage in the framework Dockerfile.
  • Update build.sh to accept new parameters: --use_sccache, --sccache-bucket, and --sccache-region.
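
For reviewers who want to try the caching path locally, here is a minimal sketch of how the new build.sh parameters might be combined. Only the script path and the three flag names come from this PR. The helper `build_command`, the bucket name, and the region value are hypothetical. The sketch also assumes that --use_sccache is a plain switch and that the other two flags each take one value, which should be checked against build.sh itself.

```python
import shlex
from typing import List, Optional


def build_command(
    use_sccache: bool = False,
    bucket: Optional[str] = None,
    region: Optional[str] = None,
    script: str = "container/build.sh",
) -> List[str]:
    """Assemble an argv list for build.sh with the optional sccache flags."""
    cmd = [script]
    if use_sccache:
        # Assumption: remote caching is only meaningful when both the
        # bucket and the region are supplied, so fail early otherwise.
        if not bucket or not region:
            raise ValueError("sccache requires both a bucket and a region")
        cmd += [
            "--use_sccache",
            "--sccache-bucket", bucket,
            "--sccache-region", region,
        ]
    return cmd


if __name__ == "__main__":
    # Hypothetical bucket and region values, for illustration only.
    argv = build_command(
        use_sccache=True,
        bucket="example-sccache-bucket",
        region="us-west-2",
    )
    # Print a shell-quoted command line instead of executing it.
    print(shlex.join(argv))
```

The sketch prints the command rather than running it, so it is safe to experiment with before pointing it at a real bucket.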

Details:

Where should the reviewer start?

Check the GitHub Actions workflow: NVIDIA Github Validation / Build and Test - vllm (push)

  • .github/workflows/container-validation-backends.yml
  • container/Dockerfile.vllm
  • container/build.sh
  • container/Dockerfile

Related Issues:

  • closes: OPS-711, OPS-702

Summary by CodeRabbit

  • New Features

    • Optional build caching to speed container builds.
    • ARM64 build support.
    • Streamlined vLLM images with distinct runtime and local-dev targets for faster setup.
  • Refactor

    • Reworked container builds to multi-stage, configurable pipelines for improved performance and flexibility.
  • Chores

    • Added and updated CI workflows to build/test containers on GPU runners with concurrency controls and junit reporting.
  • Tests

    • Promoted an end-to-end vLLM test by removing the “slow” marker, increasing its execution frequency.
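
The junit reporting mentioned under Chores produces XML that is easy to inspect outside the CI UI. Below is a rough sketch, using only the Python standard library, of how such a report could be summarized. The function name `summarize_junit` and the sample XML are illustrative assumptions, and the actual report path and layout produced by the workflow are not specified in this PR.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def summarize_junit(xml_text: str) -> Dict[str, int]:
    """Sum the test counters across all <testsuite> elements in a report."""
    root = ET.fromstring(xml_text)
    # A report is either a single <testsuite> or a <testsuites> wrapper.
    suites = [root] if root.tag == "testsuite" else root.findall("testsuite")
    totals = {"tests": 0, "failures": 0, "errors": 0, "skipped": 0}
    for suite in suites:
        for key in totals:
            # Missing attributes are treated as zero.
            totals[key] += int(suite.get(key, 0))
    return totals


# Made-up sample report, for illustration only.
SAMPLE = """<testsuites>
  <testsuite name="pytest" tests="3" failures="1" errors="0" skipped="1">
    <testcase classname="tests.example" name="test_ok"/>
    <testcase classname="tests.example" name="test_bad">
      <failure message="assertion failed"/>
    </testcase>
    <testcase classname="tests.example" name="test_skip">
      <skipped message="no GPU available"/>
    </testcase>
  </testsuite>
</testsuites>"""

if __name__ == "__main__":
    print(summarize_junit(SAMPLE))
```

A summary like this can help when triaging a failed run, for example to tell a genuine failure apart from a job where every test was skipped.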

@github-actions bot added the "ci" label (Issues/PRs that reference CI build/test) on Aug 19, 2025
Signed-off-by: Anant Sharma <[email protected]>
@nv-tusharma nv-tusharma merged commit 82bae24 into main Aug 28, 2025
11 of 12 checks passed
@nv-tusharma nv-tusharma deleted the ci/vllm-github branch August 28, 2025 18:13
jasonqinzhou pushed a commit that referenced this pull request Aug 30, 2025
Signed-off-by: Anant Sharma <[email protected]>
Co-authored-by: Tushar Sharma <[email protected]>
Signed-off-by: Jason Zhou <[email protected]>
Labels
ci Issues/PRs that reference CI build/test size/XXL