
Add AMD GPU (HIP) backend support #50

Merged
robtaylor merged 13 commits into main from amd-support on Feb 27, 2026

Conversation

@robtaylor
Contributor

Summary

  • Add HIP (Heterogeneous-compute Interface for Portability) backend as a third GPU target alongside CUDA and Metal
  • Share kernel_v1_impl.cuh between CUDA and HIP with minimal #ifdef guards — RDNA wave32 matches CUDA's warp size so kernel logic ports directly
  • Add cl_hip() to ucc build system with NVIDIA backend support (HIP_PLATFORM=nvidia)
  • Add minimal raw FFI bindings for HIP runtime (hipMalloc, hipFree, hipMemcpy, etc.) — no external Rust crate needed
  • Add Device::HIP(u8) variant to ulib with full UVec<T> allocation/copy/sync support
  • Add sim_hip() dispatch in loom binary with automatic device selection
  • Add CI job that validates the HIP code path on the NVIDIA runner via HIP_PLATFORM=nvidia

New files

  • vendor/eda-infra-rs/ulib/src/hip_ffi.rs: Raw HIP runtime FFI bindings
  • vendor/eda-infra-rs/ulib/csrc/memfill.hip.cpp: HIP GPU memfill kernels
  • csrc/kernel_v1.hip.cpp: HIP kernel launch wrapper
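As a rough illustration of what "minimal raw FFI bindings, no external Rust crate needed" can look like, here is a hedged sketch: the extern signatures mirror the HIP runtime C API, while the `check` helper and constant names are ours, not taken from hip_ffi.rs.

```rust
use std::os::raw::{c_int, c_void};

pub type HipError = c_int; // hipError_t; 0 is hipSuccess
pub const HIP_SUCCESS: HipError = 0;

extern "C" {
    // Resolved at link time against libamdhip64 on AMD, or against
    // hipcc-compiled wrapper objects on the NVIDIA backend.
    pub fn hipMalloc(ptr: *mut *mut c_void, size: usize) -> HipError;
    pub fn hipFree(ptr: *mut c_void) -> HipError;
    pub fn hipMemcpy(
        dst: *mut c_void,
        src: *const c_void,
        size_bytes: usize,
        kind: c_int, // hipMemcpyKind value
    ) -> HipError;
}

/// Illustrative helper: map a raw hipError_t to a Result.
pub fn check(err: HipError) -> Result<(), HipError> {
    if err == HIP_SUCCESS { Ok(()) } else { Err(err) }
}
```

Because the declarations are never called unless the HIP path is active, CPU-only builds can carry them without pulling in any HIP library.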

Build & test

# Build (requires hipcc / ROCm SDK)
cargo build -r --features hip --bin loom

# Run with CPU cross-check
cargo run -r --features hip --bin loom -- sim \
    design.gv design.gemparts input.vcd output.vcd NUM_BLOCKS \
    --check-with-cpu

CPU-only, Metal, and benchmark builds are verified unaffected.

Test plan

  • CI hip-on-nvidia job passes (HIP compiled with NVIDIA backend on existing GPU runner)
  • CPU-only build regression: cargo build -r --bin loom
  • Metal build regression: cargo build -r --features metal --bin loom
  • Benchmarks unaffected: cargo bench --bench event_buffer
  • Native AMD GPU test (when hardware available): run with --check-with-cpu for correctness validation

Add a third GPU backend targeting AMD GPUs via HIP (ROCm). RDNA GPUs
use wave32 matching CUDA's warp size, so the existing kernel logic in
kernel_v1_impl.cuh is shared between CUDA and HIP with minimal #ifdef
guards. The cooperative kernel launch pattern is also identical.

Changes:
- kernel_v1_impl.cuh: conditional include for HIP cooperative_groups
- event_buffer.h: recognize __HIP_PLATFORM_AMD__ alongside __CUDACC__
- kernel_v1.hip.cpp: HIP kernel launch wrapper with warp size assertion
- Cargo.toml: add hip = ["ulib/hip"] feature
- build.rs: compile kernel_v1.hip.cpp with hipcc, link amdhip64
- loom.rs: add sim_hip() function and #[cfg(feature = "hip")] dispatch
- vendor/eda-infra-rs: HIP support in ucc (cl_hip, bindgen) and ulib
  (Device::HIP, HipBuffer, FFI bindings, memfill.hip.cpp)
- CLAUDE.md: document HIP build commands and workflow

The hip feature is fully optional. Existing CUDA/Metal/CPU-only builds
are unaffected.

Co-developed-by: Claude Code v2.1.44 (claude-opus-4-6)
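The "fully optional" claim above comes down to feature-gated compilation. A minimal sketch of the pattern, with function and string names that are ours rather than taken from loom.rs:

```rust
// The HIP path only exists when the crate is built with `--features hip`,
// so CPU/CUDA/Metal builds never reference any HIP symbol.
#[cfg(feature = "hip")]
fn sim_hip() -> &'static str {
    "hip" // placeholder for launching the HIP simulation path
}

fn select_backend() -> &'static str {
    #[cfg(feature = "hip")]
    {
        return sim_hip();
    }
    "cpu" // fallback when the hip feature is disabled
}
```

With the feature disabled, `sim_hip` is not even compiled, which is what keeps the existing builds unaffected.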

Add a hip-on-nvidia CI job that validates the HIP code path on
the existing nvidia-runner-1 using HIP's NVIDIA backend. This
installs ROCm/HIP packages alongside the CUDA toolkit and runs
the same timing test and X-propagation tests as the CUDA job.

Also update cl_hip() in ucc to detect HIP_PLATFORM=nvidia and
skip AMD-specific --offload-arch flags (which would fail with
nvcc). Supports UCC_HIP_TARGETS=none for explicit opt-out.

Co-developed-by: Claude Code v2.1.44 (claude-opus-4-6)
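The cl_hip() behavior described above amounts to conditional flag selection. A hedged sketch of that logic (the function name and the gfx target string are illustrative; the real code lives in ucc):

```rust
// When HIP_PLATFORM=nvidia, hipcc wraps nvcc, which rejects AMD-only
// --offload-arch flags; UCC_HIP_TARGETS=none is an explicit opt-out.
fn offload_arch_flags(hip_platform: Option<&str>, targets: &[&str]) -> Vec<String> {
    if hip_platform == Some("nvidia") || matches!(targets, ["none"]) {
        return Vec::new(); // skip AMD-specific arch flags entirely
    }
    targets
        .iter()
        .map(|t| format!("--offload-arch={}", t))
        .collect()
}
```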

Use apt-key add (legacy but reliable) plus keyserver fallback
for the specific signing key. The signed-by approach with
dearmored keys failed on Ubuntu 24.04 (noble).

Co-developed-by: Claude Code v2.1.44 (claude-opus-4-6)

hip-runtime-nvidia pulls in nvidia-kernel-common which conflicts
with the GPU runner's existing nvidia-kernel-common-570-server.
Hold all existing nvidia-* packages before installing ROCm/HIP
packages to prevent apt from trying to resolve driver conflicts.

Co-developed-by: Claude Code v2.1.44 (claude-opus-4-6)

The hip-runtime-nvidia package depends on nvidia driver packages
that conflict with the -570-server variants on the GPU runner.
Use apt download + dpkg --force-depends to install just the HIP
packages we need without resolving the full driver dependency tree.

Co-developed-by: Claude Code v2.1.44 (claude-opus-4-6)

apt-get download without sudo cannot read root-owned apt config
files (Permission denied on rocm.list). Use sudo and a temp dir.

Co-developed-by: Claude Code v2.1.44 (claude-opus-4-6)

apt-get download fails completely if any listed package is missing.
Download each HIP package individually with fallback for packages
that don't exist in the repo (hip-runtime-nvidia-dev, hip-base).

Co-developed-by: Claude Code v2.1.44 (claude-opus-4-6)
Three fixes for HIP backend:

1. cl_hip() NVIDIA backend: generate a wrapper script around hipcc that
   strips -ffunction-sections/-fdata-sections before forwarding to hipcc.
   The cc crate adds these for clang-family compilers, but when hipcc
   wraps nvcc (HIP_PLATFORM=nvidia) they get passed through and nvcc
   rejects them.

2. Fix swapped hipMemcpy direction constants in copy() — (HIP,CPU) is
   host-to-device, (CPU,HIP) is device-to-host.

3. Refactor kernel_v1.hip.cpp warp size validation to a one-time check
   instead of checking on every kernel launch.

Co-developed-by: Claude Code v2.1.44 (claude-opus-4-6)
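The three fixes above can be modeled in a few lines of Rust. All names here are illustrative: fix 1 is really a generated shell wrapper around hipcc, fix 2 assumes the (destination, source) tuple order implied by the commit message, and `query` in fix 3 stands in for the real device-attribute call.

```rust
use std::sync::OnceLock;

// Fix 1: strip clang-only flags that nvcc rejects before forwarding
// the remaining arguments to hipcc.
fn filter_for_nvcc(args: &[&str]) -> Vec<String> {
    const REJECTED: [&str; 2] = ["-ffunction-sections", "-fdata-sections"];
    args.iter()
        .copied()
        .filter(|a| !REJECTED.contains(a))
        .map(|a| a.to_string())
        .collect()
}

// Fix 2: the memcpy direction follows from the (dst, src) device pair;
// copying to HIP from CPU is host-to-device. Values mirror hipMemcpyKind.
enum Device {
    Cpu,
    Hip,
}

fn memcpy_kind(dst: &Device, src: &Device) -> u32 {
    match (dst, src) {
        (Device::Hip, Device::Cpu) => 1, // hipMemcpyHostToDevice
        (Device::Cpu, Device::Hip) => 2, // hipMemcpyDeviceToHost
        (Device::Hip, Device::Hip) => 3, // hipMemcpyDeviceToDevice
        (Device::Cpu, Device::Cpu) => 0, // hipMemcpyHostToHost
    }
}

// Fix 3: validate the warp size once, not on every kernel launch.
static WARP_OK: OnceLock<bool> = OnceLock::new();

fn warp_size_ok(query: fn() -> u32) -> bool {
    *WARP_OK.get_or_init(|| query() == 32)
}
```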

The cc crate adds -Wall/-Wextra for clang-family compilers. hipcc
passes them directly to nvcc which rejects them. Disable cc-crate
warnings and add -Xcompiler -Wall manually in the wrapper.

Co-developed-by: Claude Code v2.1.44 (claude-opus-4-6)

The linker couldn't find libamdhip64 because /opt/rocm/lib wasn't in
the search path. Add cargo:rustc-link-search in both build.rs files
using ROCM_PATH env var (defaulting to /opt/rocm). Also add
LD_LIBRARY_PATH and LIBRARY_PATH to CI environment.

Co-developed-by: Claude Code v2.1.44 (claude-opus-4-6)
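A hedged sketch of the search-path fix above (the helper name is ours): derive the ROCm library directory from ROCM_PATH, defaulting to /opt/rocm, then emit it as a cargo directive from build.rs.

```rust
// Compute the directory containing libamdhip64 from an optional
// ROCM_PATH value, falling back to the standard install prefix.
fn rocm_lib_dir(rocm_path: Option<String>) -> String {
    format!("{}/lib", rocm_path.unwrap_or_else(|| "/opt/rocm".to_string()))
}

// In build.rs the directive would then be emitted as:
//   println!("cargo:rustc-link-search=native={}",
//            rocm_lib_dir(std::env::var("ROCM_PATH").ok()));
```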

On NVIDIA backend, libamdhip64.so doesn't exist — HIP functions are
header-only CUDA wrappers. Compile thin hip_ffi_* wrapper functions
with hipcc so they resolve to the correct runtime regardless of
platform. Link amdhip64 on AMD, cudart on NVIDIA.

Co-developed-by: Claude Code v2.1.44 (claude-opus-4-6)

When HIP_PLATFORM=nvidia, we link cudart instead of amdhip64. Add
CUDA_PATH/lib64 to the linker search path so the linker can find
libcudart on the CI runner.

Co-developed-by: Claude Code v2.1.44 (claude-opus-4-6)

hipDeviceGetAttribute wraps cuDeviceGetAttribute from libcuda.so.
Add cuda driver library to link list and stubs/ to search path.

Co-developed-by: Claude Code v2.1.44 (claude-opus-4-6)
robtaylor merged commit 01bb4eb into main on Feb 27, 2026
9 checks passed
robtaylor deleted the amd-support branch on February 27, 2026 at 22:48