Add standalone onnxruntime-ep-webgpu Python package that bundles the WebGPU plugin EP native binary (+ DXC deps on Windows). The package provides get_library_path() and get_ep_name() helpers for registering the EP with ONNX Runtime. New files in plugin-ep-webgpu/: VERSION_NUMBER, pyproject.toml, setup.py, __init__.py, build_wheel.py (handles binary copying, version stamping, auditwheel repair on Linux, and wheel verification), requirements-build-wheel.txt, and a smoke test that validates import, EP registration, and inference. Pipeline changes: added Python_Package (CPU) and Python_Test (GPU) jobs to each platform stage (Windows, Linux, macOS). Added PluginPythonPackageVersion (PEP 440) output to set-plugin-build-variables-step.yml, sourced from plugin-ep-webgpu/VERSION_NUMBER.
Move user-facing README (installation, usage) into onnxruntime_ep_webgpu/ so it is bundled in the wheel and shown on PyPI. Add developer-facing README in plugin-ep-webgpu/python/ with build and test instructions.
The package has no CPython extension modules, only pre-built native libraries, so a single wheel works across all Python versions. Override bdist_wheel.get_tag() to produce py3-none-{platform} instead of cp3XX-cp3XX-{platform}.
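The tag override described above can be sketched roughly as follows. This is a minimal illustration written as a mixin so it runs without setuptools installed; `Py3NoneTagMixin` and `FakeBdistWheel` are hypothetical names, and the actual setup.py may be structured differently.

```python
class Py3NoneTagMixin:
    """Force a Python-version-independent wheel tag, keeping the platform tag."""

    def get_tag(self):
        # The base command returns (python_tag, abi_tag, platform_tag).
        _python_tag, _abi_tag, platform_tag = super().get_tag()
        return "py3", "none", platform_tag


class FakeBdistWheel:
    """Stand-in for the real bdist_wheel command (illustration only)."""

    def get_tag(self):
        return "cp312", "cp312", "win_amd64"


class PlatformBdistWheel(Py3NoneTagMixin, FakeBdistWheel):
    """Hypothetical command class combining the override with the base command."""


print("-".join(PlatformBdistWheel().get_tag()))  # py3-none-win_amd64
```

In a real setup.py the stand-in would be replaced by the bdist_wheel command class, and the subclass would be registered through `cmdclass={"bdist_wheel": PlatformBdistWheel}` in the `setup()` call.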
…version stamp
- Build wheel in a temporary directory instead of mutating the source tree
- Copy only the files needed (pyproject.toml, setup.py, onnxruntime_ep_webgpu/) instead of using an exclude list
- Change version placeholder to VERSION_PLACEHOLDER and fail hard if not found
- Disable CPU EP fallback in test to ensure WebGPU EP runs the model
- Simplify docstring and README descriptions
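A rough sketch of the staging and version-stamping approach this commit describes, assuming the placeholder lives in the package's `__init__.py`. The helper names `stage_sources` and `stamp_version` and the demo file layout are hypothetical, not taken from build_wheel.py.

```python
import shutil
import tempfile
from pathlib import Path

PLACEHOLDER = "VERSION_PLACEHOLDER"


def stage_sources(source_dir: Path, staging_dir: Path, names: list[str]) -> None:
    """Copy only the listed files/directories (an allow list, not an exclude list)."""
    for name in names:
        src, dst = source_dir / name, staging_dir / name
        if src.is_dir():
            shutil.copytree(src, dst)
        else:
            shutil.copy2(src, dst)


def stamp_version(target: Path, version: str) -> None:
    """Replace the placeholder in the staged copy; fail hard if it is missing."""
    text = target.read_text(encoding="utf-8")
    if PLACEHOLDER not in text:
        raise RuntimeError(f"{PLACEHOLDER} not found in {target}")
    target.write_text(text.replace(PLACEHOLDER, version), encoding="utf-8")


with tempfile.TemporaryDirectory() as src_tmp, tempfile.TemporaryDirectory() as stage_tmp:
    source, staging = Path(src_tmp), Path(stage_tmp)
    # Fake source tree for the demo.
    (source / "pkg").mkdir()
    (source / "pkg" / "__init__.py").write_text(
        '__version__ = "VERSION_PLACEHOLDER"\n', encoding="utf-8"
    )
    (source / "setup.py").write_text("# setup\n", encoding="utf-8")
    (source / "notes.txt").write_text("not needed in the wheel\n", encoding="utf-8")

    stage_sources(source, staging, ["setup.py", "pkg"])
    stamp_version(staging / "pkg" / "__init__.py", "0.1.0")

    print(sorted(p.name for p in staging.iterdir()))  # ['pkg', 'setup.py']
    # The source tree itself is left untouched.
    print(PLACEHOLDER in (source / "pkg" / "__init__.py").read_text(encoding="utf-8"))  # True
```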
- macOS/Windows: add setup-build-tools.yml to Python Package and Test jobs
- Linux: run Python packaging and testing inside Docker for manylinux compatibility and auditwheel support
- Windows: skip Python package/test jobs for arm64 (cross-compiled, can't run on x64 agents)
- Linux: add gpu_machine_pool parameter for test job pool
Print environment info (Python version, platform, ORT version, relevant env vars), package directory contents, library file size, device enumeration details, session providers, and full tracebacks on failure.
Apple Silicon requires all executable code to be signed. Without this, dlopen triggers a SIGBUS (bus error) when loading the unsigned dylib.
ESRP requires .zip or .dmg input. Zip the dylib before signing, then unzip the signed result and verify.
The Docker image does not have pip pre-installed. Use ensurepip to bootstrap it before installing wheel build dependencies.
Use python -u in all three platform pipelines so prints are flushed immediately, even if the process crashes during native DLL load. Add ort.set_default_logger_severity(0) in the test script for verbose ORT logging.
Remove reinterpret_cast of OrtKernelInfo* to internal OpKernelInfo* that breaks ABI across DLL boundaries (vtable mismatch between plugin EP and ORT core).
- KernelInfoCache: use Ort::ConstKernelInfo::GetEp() instead of casting to OpKernelInfo* and calling GetExecutionProvider()->GetOrtEp()
- GetAllocator: use C API KernelInfoGetAllocator + IAllocatorImplWrappingOrtAllocator instead of casting to OpKernelInfo*
- Remove #include core/framework/op_kernel_info.h (no longer needed)
- Add #include core/session/allocator_adapters.h for IAllocatorImplWrappingOrtAllocator
…p_adapter_cast_issue
…ast_issue' into edgchen1/webgpu_packaging_python_fix
Materialize the glob generator to a list so the emptiness check works, and delete each raw wheel after auditwheel repair so only the manylinux wheel remains.
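To illustrate why the emptiness check needed the fix: the object returned by `Path.glob()` is lazy and always truthy, so `if not wheels:` never fires until it is converted to a list. The sketch below is an assumption about the shape of `collect_wheels()`, not the actual code in build_wheel.py.

```python
import tempfile
from pathlib import Path


def collect_wheels(dist_dir: Path) -> list[Path]:
    """Return all wheels in dist_dir, raising if there are none."""
    wheels = list(dist_dir.glob("*.whl"))  # materialize so emptiness is observable
    if not wheels:
        raise FileNotFoundError(f"No wheels found in {dist_dir}")
    return wheels


with tempfile.TemporaryDirectory() as tmp:
    empty_dir = Path(tmp)
    # The lazy object is truthy even though it yields nothing.
    print(bool(empty_dir.glob("*.whl")))        # True
    print(bool(list(empty_dir.glob("*.whl"))))  # False
```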
Add the missing set-nightly-build-option-variable-step.yml template to all three platform Python_Package jobs for consistency with the Build jobs.
…/microsoft/onnxruntime into edgchen1/webgpu_packaging_python
- build_wheel.py: materialize glob() generators with list() so empty-wheel checks in collect_wheels() and auditwheel_repair() actually trigger.
- plugin-linux-webgpu-stage.yml: remove duplicate setup-feeds-and-python-steps.yml include.
- test_webgpu_plugin_ep.py: have create_mul_model() write into a caller-provided directory and use TemporaryDirectory() for cleanup.
- Move the minimum onnxruntime version into plugin-ep-webgpu/MIN_ONNXRUNTIME_VERSION so future plugin EP packages can reuse it.
- Convert pyproject.toml into pyproject.toml.in with @var@ template markers.
- Add a small gen_file_from_template() helper in build_wheel.py (modeled on tools/ci_build/github/apple/package_assembly_utils.py) and use it to render pyproject.toml at build time.
- Replace ad-hoc print+sys.exit(1) error paths with raised exceptions for consistency.
- Make auditwheel_repair fail loudly if there is no wheel to repair or the repair produces no output.
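A minimal sketch of @var@ template rendering with a strict check, in the spirit of the helper described above. It works on strings rather than files, and `render_template`, the variable names, and the version values are hypothetical placeholders rather than the real gen_file_from_template() signature or the real contents of pyproject.toml.in.

```python
import re

_VAR_PATTERN = re.compile(r"@(\w+)@")


def render_template(template: str, variables: dict[str, str], strict: bool = True) -> str:
    """Substitute @name@ markers; in strict mode the two name sets must match exactly."""
    used = set(_VAR_PATTERN.findall(template))
    provided = set(variables)
    if strict and used != provided:
        missing = sorted(used - provided)
        unused = sorted(provided - used)
        raise ValueError(f"Template/variable mismatch: missing={missing}, unused={unused}")
    # Values are inserted verbatim; quoting/escaping is the caller's responsibility.
    return _VAR_PATTERN.sub(lambda m: variables.get(m.group(1), m.group(0)), template)


template = 'version = "@version@"\ndependencies = ["onnxruntime>=@min_ort_version@"]\n'
print(render_template(template, {"version": "0.1.0", "min_ort_version": "1.0.0"}))
```

Using a function as the replacement avoids `re.sub` interpreting backslashes in the substituted values.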
- gen_file_from_template: docstring caveat that values are inserted verbatim and the caller is responsible for any required quoting/escaping.
- plugin-ep-webgpu/python/README.md: note that pip install / pip wheel against this directory is unsupported (source has pyproject.toml.in, not pyproject.toml).
- WebGPU Dockerfile: SwiftShader pin comment now points at Dawn's DEPS (third_party/swiftshader) and notes the SHA tracks the Dawn commit pinned in cmake/deps.txt, with a refresh URL template.
Example packaging pipeline runs:
Pull request overview
Adds CI packaging and smoke-testing support for a new WebGPU plugin EP Python wheel (onnxruntime-ep-webgpu) alongside the existing plugin packaging pipeline.
Changes:
- Introduces a WebGPU-specific Linux Docker image (with SwiftShader) and scripts to build/test the wheel in CI.
- Extends plugin pipeline versioning to support a Python (PEP 440) version string and configurable version file path.
- Adds platform test stages/pipeline (Win/Linux/macOS) that install the produced wheel and run a smoke test.
Reviewed changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tools/ci_build/github/linux/docker/inference/x86_64/python/webgpu/scripts/install_centos.sh | Adds CentOS/AlmaLinux dependencies for the WebGPU wheel build/test container. |
| tools/ci_build/github/linux/docker/inference/x86_64/python/webgpu/Dockerfile | Builds SwiftShader (software Vulkan) and produces a reusable runtime image for Linux testing. |
| tools/ci_build/github/linux/build_webgpu_plugin_python_package.sh | Adds a Docker-based wheel build wrapper for Linux CI. |
| tools/ci_build/github/azure-pipelines/templates/set-plugin-build-variables-step.yml | Adds version_file input and emits a PEP 440 Python version variable. |
| tools/ci_build/github/azure-pipelines/stages/plugin-win-webgpu-test-stage.yml | Adds Windows wheel install + smoke test stage. |
| tools/ci_build/github/azure-pipelines/stages/plugin-win-webgpu-stage.yml | Adds Windows x64 Python wheel build job and wires version file through. |
| tools/ci_build/github/azure-pipelines/stages/plugin-webgpu-packaging-stage.yml | Wires version_file through the WebGPU packaging stage. |
| tools/ci_build/github/azure-pipelines/stages/plugin-mac-webgpu-test-stage.yml | Adds macOS wheel install + smoke test stage. |
| tools/ci_build/github/azure-pipelines/stages/plugin-mac-webgpu-stage.yml | Adds macOS wheel build job and signs the dylib artifact. |
| tools/ci_build/github/azure-pipelines/stages/plugin-linux-webgpu-test-stage.yml | Adds Linux wheel install + smoke test stage inside the WebGPU Docker image. |
| tools/ci_build/github/azure-pipelines/stages/plugin-linux-webgpu-stage.yml | Switches Linux WebGPU build to the WebGPU Docker image and adds a wheel build job. |
| tools/ci_build/github/azure-pipelines/plugin-webgpu-test-pipeline.yml | New pipeline to run smoke tests against packaging artifacts. |
| tools/ci_build/github/azure-pipelines/plugin-webgpu-pipeline.yml | Wires WebGPU plugin version file into packaging pipeline parameters. |
| plugin-ep-webgpu/python/test/test_webgpu_plugin_ep.py | Adds a smoke test for import, EP registration, device enumeration, and basic inference. |
| plugin-ep-webgpu/python/setup.py | Minimal setup to build a platform wheel with py3-none-<platform> tags. |
| plugin-ep-webgpu/python/requirements-build-wheel.txt | Adds build deps (setuptools/wheel + auditwheel/patchelf on Linux). |
| plugin-ep-webgpu/python/pyproject.toml.in | Adds Python package metadata template and package data settings. |
| plugin-ep-webgpu/python/onnxruntime_ep_webgpu/__init__.py | Adds helper API to locate the bundled shared library and EP names. |
| plugin-ep-webgpu/python/onnxruntime_ep_webgpu/README.md | Adds user-facing package README bundled into the wheel. |
| plugin-ep-webgpu/python/build_wheel.py | Adds the wheel build script that stages binaries, generates pyproject, and runs auditwheel on Linux. |
| plugin-ep-webgpu/python/README.md | Adds developer documentation for building/testing the wheel. |
| plugin-ep-webgpu/VERSION_NUMBER | Adds base version for pipeline-derived versioning. |
| plugin-ep-webgpu/README.md | Adds top-level documentation for the plugin packaging layout. |
| plugin-ep-webgpu/MIN_ONNXRUNTIME_VERSION | Defines minimum onnxruntime dependency for the wheel. |
…ed gpu_machine_pool
- Pass explicit version_file: VERSION_NUMBER in plugin-linux-cuda-stage and plugin-win-cuda-stage so the now-required parameter is satisfied.
- Remove unused gpu_machine_pool parameter from plugin-linux-webgpu-stage.
Add an epVersionFile pipeline variable in plugin-cuda-pipeline.yml (set to 'VERSION_NUMBER') and pass it as version_file through plugin-cuda-packaging-stage.yml down to plugin-linux-cuda-stage.yml and plugin-win-cuda-stage.yml, mirroring the pattern used by the WebGPU plugin pipeline.
Pull request overview
Copilot reviewed 28 out of 28 changed files in this pull request and generated 2 comments.
tianleiwu left a comment
Pull request overview
Nice extension of the existing plugin packaging flow into an onnxruntime-ep-webgpu Python wheel for Windows x64, Linux x64, and macOS arm64, plus a companion test pipeline. Factoring set-plugin-build-variables-step.yml to take a version_file parameter and splitting build vs. test pipelines is a clean reuse pattern.
Highlights
- PlatformBdistWheel.get_tag + BinaryDistribution.has_ext_modules() = True is a clean way to produce a py3-none-{plat} wheel that ships pre-built native libraries and works across Python versions.
- gen_file_from_template validates template variables vs. provided substitutions in strict mode, which guards against drift on future renames.
- auditwheel_repair correctly excludes libvulkan.so.1 so the user's driver stack supplies the loader.
- The two-stage SwiftShader Dockerfile (with VK_ICD_FILENAMES pinned at docker run time, not in the image) keeps the runtime image small and reusable for a future real-GPU job, and the comment pointing to Dawn's DEPS for the SHA pin is a nice touch.
- Smoke test disables CPU EP fallback so a passing inference actually exercises WebGPU, and skips inference gracefully on CPU-only agents.
Suggestions (non-blocking)
A couple of small robustness/maintenance items inline. Also note the two existing open threads on set-plugin-build-variables-step.yml (header comment drift to release/RC/dev; non-empty validation of version_file_rel) are still applicable on the current head.
- set-plugin-build-variables-step.yml: refresh stale 'nightly/official/dev' header comment; validate version_file parameter is non-empty before path join.
- plugin-ep-webgpu/python/setup.py: switch to setuptools.command.bdist_wheel (wheel.bdist_wheel is deprecated and being removed); bump setuptools pin to >=70.1.
- build_webgpu_plugin_python_package.sh: pass VERSION via PLUGIN_VERSION env var to docker run and quote it inside a single-quoted bash -c body so the inner shell sees a properly quoted argument.
Description
Motivation and Context
Put together WebGPU plugin EP Python package.