chore(examples): restructure CUDA examples and add a GEMM example #200

Merged: 1 commit, Apr 17, 2025

Conversation

adamcavendish
Contributor

  • Refactored the CUDA examples directory for improved clarity and modularity so each example is more self-contained.
  • Added a new GEMM (General Matrix Multiply) example, including naive and tiled kernel implementations, build scripts, and benchmarks.
  • The tiled-gemm kernel demonstrates shared memory usage.
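
For readers unfamiliar with the naive/tiled distinction mentioned above, here is a minimal CPU-side sketch in plain Rust. It is not the kernel code from this PR: the function names `gemm_naive` and `gemm_tiled`, the `TILE` constant, and the row-major layout are assumptions made for illustration, and the small local arrays only stand in for what a GPU kernel would keep in shared memory.

```rust
/// Tile edge length (illustrative). On a GPU this would typically match
/// the thread-block dimensions.
const TILE: usize = 4;

/// Naive GEMM: C = A * B with row-major A (m x k), B (k x n), C (m x n).
/// Every output element re-reads a full row of A and a full column of B.
fn gemm_naive(a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
    let mut c = vec![0.0f32; m * n];
    for row in 0..m {
        for col in 0..n {
            let mut acc = 0.0f32;
            for i in 0..k {
                acc += a[row * k + i] * b[i * n + col];
            }
            c[row * n + col] = acc;
        }
    }
    c
}

/// Tiled GEMM: each TILE x TILE block of C is computed from staged tiles
/// of A and B. The local arrays `ta` and `tb` play the role that shared
/// memory plays in a GPU kernel.
fn gemm_tiled(a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
    let mut c = vec![0.0f32; m * n];
    for row0 in (0..m).step_by(TILE) {
        for col0 in (0..n).step_by(TILE) {
            // Accumulators for one output tile (one "thread block").
            let mut acc = [[0.0f32; TILE]; TILE];
            for k0 in (0..k).step_by(TILE) {
                // Stage one tile of A and one of B, zero-padding at the edges.
                let mut ta = [[0.0f32; TILE]; TILE];
                let mut tb = [[0.0f32; TILE]; TILE];
                for i in 0..TILE {
                    for j in 0..TILE {
                        if row0 + i < m && k0 + j < k {
                            ta[i][j] = a[(row0 + i) * k + k0 + j];
                        }
                        if k0 + i < k && col0 + j < n {
                            tb[i][j] = b[(k0 + i) * n + col0 + j];
                        }
                    }
                }
                // A GPU kernel would need a block-wide barrier here before
                // any thread reads the staged tiles.
                for i in 0..TILE {
                    for j in 0..TILE {
                        for t in 0..TILE {
                            acc[i][j] += ta[i][t] * tb[t][j];
                        }
                    }
                }
            }
            // Write back only the in-range part of the output tile.
            for i in 0..TILE {
                for j in 0..TILE {
                    if row0 + i < m && col0 + j < n {
                        c[(row0 + i) * n + col0 + j] = acc[i][j];
                    }
                }
            }
        }
    }
    c
}

fn main() {
    // Dimensions deliberately not multiples of TILE to exercise the padding.
    let (m, k, n) = (5, 7, 6);
    let a: Vec<f32> = (0..m * k).map(|x| (x % 5) as f32).collect();
    let b: Vec<f32> = (0..k * n).map(|x| (x % 3) as f32).collect();
    assert_eq!(gemm_naive(&a, &b, m, k, n), gemm_tiled(&a, &b, m, k, n));
    println!("naive and tiled results match");
}
```

The point of the tiled version is data reuse: each staged element of A and B is read `TILE` times from the fast local buffer instead of being fetched from main memory each time, which is the effect shared memory provides on the GPU.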

@adamcavendish adamcavendish force-pushed the example/matmul branch 3 times, most recently from 20bd07e to 5628a8a Compare April 15, 2025 07:50
@adamcavendish adamcavendish force-pushed the example/matmul branch 3 times, most recently from fb2e300 to 1141830 Compare April 16, 2025 09:36
@adamcavendish
Contributor Author

adamcavendish commented Apr 16, 2025

Hmm, it seems that the Windows workflow now generates rustc_codegen_nvvm as D:\a\Rust-CUDA\Rust-CUDA\target\debug\deps\rustc_codegen_nvvm.dll.lib, but the build requires the dynamically linked library rustc_codegen_nvvm.dll.

I found that the original Windows CI pipeline already has the "add" example disabled, so the issue may have been around for a while.
I have disabled the new examples for now as well, so they won't block the merge.

Can someone reproduce it locally on Windows? @jorge-ortega

@jorge-ortega
Collaborator

jorge-ortega commented Apr 16, 2025

On Windows, an import library (.lib) is generated in addition to the DLL. All the import library does is cause the actual DLL to be loaded when the process starts.

I tried `cargo clean` and then `cargo build --workspace --exclude "optix*" --exclude "path-tracer" --exclude "denoiser" --exclude "cudnn*"`. I get several LoadLibraryEx errors and a linker error:

 = note: LINK : fatal error LNK1104: cannot open file 'C:\Users\jorge\Workspace\rust-cuda\target\debug\deps\rustc_codegen_nvvm.dll'

I think the issue might have to do with these warnings:

warning: output filename collision.
The lib target `rustc_codegen_nvvm` in package `rustc_codegen_nvvm v0.3.0 (D:\a\Rust-CUDA\Rust-CUDA\crates\rustc_codegen_nvvm)` has the same output filename as the lib target `rustc_codegen_nvvm` in package `rustc_codegen_nvvm v0.3.0 (D:\a\Rust-CUDA\Rust-CUDA\crates\rustc_codegen_nvvm)`.
Colliding filename is: D:\a\Rust-CUDA\Rust-CUDA\target\debug\rustc_codegen_nvvm.dll
The targets should have unique names.
Consider changing their names to be unique or compiling them separately.
This may become a hard error in the future; see <https://github.com/rust-lang/cargo/issues/6313>.
warning: output filename collision.
The lib target `rustc_codegen_nvvm` in package `rustc_codegen_nvvm v0.3.0 (D:\a\Rust-CUDA\Rust-CUDA\crates\rustc_codegen_nvvm)` has the same output filename as the lib target `rustc_codegen_nvvm` in package `rustc_codegen_nvvm v0.3.0 (D:\a\Rust-CUDA\Rust-CUDA\crates\rustc_codegen_nvvm)`.
Colliding filename is: D:\a\Rust-CUDA\Rust-CUDA\target\debug\deps\rustc_codegen_nvvm.dll.lib
The targets should have unique names.
Consider changing their names to be unique or compiling them separately.
This may become a hard error in the future; see <https://github.com/rust-lang/cargo/issues/6313>.

We get them on both Windows and Linux (minus the .lib). It seems cargo is building the codegen separately for each binary. As soon as it finishes building it for one build script, another build of the codegen comes along and overwrites it before the compiler can load it. This would explain the LoadLibrary and linker errors.

We might need to switch to workspace dependencies so the codegen and other crates only build once for everything in the workspace.

@jorge-ortega
Collaborator

jorge-ortega commented Apr 16, 2025

I tried switching to workspace deps, but that doesn't seem to work for build-dependencies. We might instead have to build each example with a separate build command, so the builds don't interfere with each other.
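
As a rough sketch of what "separate build commands" could look like, the following Rust program issues one `cargo build -p <package>` invocation per example, one after another. The package names `add` and `gemm` are placeholders rather than the confirmed package names in this workspace, the helper `build_args` is hypothetical, and whether sequential per-package builds actually avoid the collision has not been verified here.

```rust
use std::process::Command;

/// Arguments for building a single workspace package with cargo.
/// `-p <package>` is cargo's package-selection flag.
fn build_args(package: &str) -> Vec<String> {
    vec!["build".to_string(), "-p".to_string(), package.to_string()]
}

fn main() {
    // Placeholder package names (assumption); substitute the real examples.
    let packages = ["add", "gemm"];
    // Only spawn cargo when explicitly asked to, so a dry run just prints.
    let run = std::env::args().any(|arg| arg == "--run");

    for pkg in packages {
        let args = build_args(pkg);
        println!("cargo {}", args.join(" "));
        if run {
            // Each build finishes before the next starts, so two builds of
            // the codegen backend never write the same file concurrently.
            let status = Command::new("cargo")
                .args(&args)
                .status()
                .expect("failed to spawn cargo");
            if !status.success() {
                eprintln!("build of `{pkg}` failed");
                std::process::exit(1);
            }
        }
    }
}
```

Run without arguments, this only prints the commands it would execute; passing `--run` executes them in sequence and stops at the first failure.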

@adamcavendish
Contributor Author

> I tried switching to workspace deps, but it doesn't seem to work for build-dependencies. We might have to instead build each example in separate build commands, so they don't interfere with each other.

Got it. What about merging this change first, since the issue already existed before this PR?

@jorge-ortega jorge-ortega merged commit 290d711 into Rust-GPU:main Apr 17, 2025
5 checks passed
@jorge-ortega
Collaborator

@LegNeato FYI, since you've been working on the modal CI. Hope this won't interfere much.

@LegNeato
Contributor

Thanks! Don't worry about me, I'll keep rebasing.
