Fix XPU backend availability by importing IPEX #149
Problem
When running kmeans_aces_xpu.slurm with XPU devices, torch.xpu.is_available() returns False even though XPU devices are properly allocated by SLURM.
Root Cause
Intel Extension for PyTorch (IPEX) must be imported before any XPU operations to register the XPU backend with PyTorch's device system. Without this import:
- The torch.xpu module exists but is not fully initialized
- torch.xpu.is_available() returns False
This is a requirement of Intel's XPU backend that's not obvious from the documentation.
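The import-order requirement described above can be checked with a small diagnostic. This is a minimal sketch, not the code in this PR: the function name xpu_available_after_ipex_import is hypothetical, and it assumes IPEX is installed under its usual package name intel_extension_for_pytorch. It returns False rather than raising when torch or IPEX is missing.

```python
import importlib
import importlib.util


def xpu_available_after_ipex_import() -> bool:
    """Import IPEX if possible, then report whether torch sees an XPU.

    Hypothetical helper for illustration only; not part of core/device.py.
    """
    # Without torch installed there is nothing to query.
    if importlib.util.find_spec("torch") is None:
        return False

    # Import IPEX *before* asking torch about XPU devices, as the
    # description above requires. The package may be absent (or fail to
    # load its native libraries) on machines without Intel GPU support.
    try:
        importlib.import_module("intel_extension_for_pytorch")
    except (ImportError, OSError):
        return False

    torch = importlib.import_module("torch")
    # Guard with hasattr: some torch builds may not expose torch.xpu.
    return hasattr(torch, "xpu") and bool(torch.xpu.is_available())


if __name__ == "__main__":
    print("XPU available:", xpu_available_after_ipex_import())
```

Running this inside the SLURM allocation and comparing it with a bare torch.xpu.is_available() call (without the IPEX import) is one way to confirm that the import is what makes the difference.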
Solution
Added lazy IPEX import to core/device.py:
- _ensure_ipex_imported(): New helper function that imports IPEX on first use
- Modified get_backend(): Calls _ensure_ipex_imported() before returning th.xpu
- Modified get_device(): Calls _ensure_ipex_imported() for XPU devices
This fix ensures the XPU backend is properly initialized whenever XPU functionality is accessed, making torch.xpu.is_available() return True when XPU devices are present.
Testing
All checks passed:
- uv run ruff check . - No linting errors
- uv run ruff format . - Code formatted correctly
References