@@ -83,7 +83,7 @@ To handle communication between our code on the CPU and GPU, we'll use
implements the WebGPU API. On the web, it works directly with the browser's WebGPU
implementation. On native platforms, it translates API calls to the platform's GPU API
(Vulkan, DirectX, or Metal). This lets us run the same code on a wide range of
- platforms, including Windows, Linux, macOS, iOS[^1], Android, and the web[^2].
+ platforms, including Windows, Linux, macOS[^1], iOS[^2], Android, and the web[^3].

By using Rust GPU and `wgpu`, we have a clean, portable setup with everything written in
Rust.
@@ -147,9 +147,9 @@ There are a couple of things to note about the Rust implementation:
4. The inner loop (`for i in 0..dimensions.k`) uses Rust's `for` syntax with a range.
This is a higher-level abstraction compared to manually iterating with an index in
other shader languages like WGSL, GLSL, or HLSL.
- 5. Read-only inputs are immutable references (`&Dimensions` / `&[f32]`) and writeable outputs are
- mutable references (`&mut [f32]`). This feels very familiar to anyone used to writing
- Rust.
+ 5. Read-only inputs are immutable references (`&Dimensions` / `&[f32]`) and writable
+ outputs are mutable references (`&mut [f32]`). This feels very familiar to anyone
+ used to writing Rust.
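
The two points above can be sketched as plain Rust that runs on the CPU. This is a minimal illustration, not the post's kernel: the `Dimensions` field names, the row-major layout, the flat `index` parameter, and the function name `matmul_entry` are all assumptions made for the example.

```rust
/// Matrix sizes. The field names `m`, `k`, `n` are assumed for this sketch.
pub struct Dimensions {
    pub m: usize,
    pub k: usize,
    pub n: usize,
}

/// Computes the single output entry that belongs to invocation `index`.
/// `a` is m x k and `b` is k x n, both stored row-major (an assumed layout).
pub fn matmul_entry(
    index: usize,
    dimensions: &Dimensions, // read-only input: immutable reference
    a: &[f32],               // read-only input: immutable slice
    b: &[f32],               // read-only input: immutable slice
    result: &mut [f32],      // writable output: mutable slice
) {
    // Ignore invocations that fall outside the m x n output.
    if index >= dimensions.m * dimensions.n {
        return;
    }
    let row = index / dimensions.n;
    let col = index % dimensions.n;

    let mut sum = 0.0f32;
    // The range-based inner loop over the shared dimension.
    for i in 0..dimensions.k {
        sum += a[row * dimensions.k + i] * b[i * dimensions.n + col];
    }
    result[row * dimensions.n + col] = sum;
}
```

Calling `matmul_entry` once for every index in `0..m * n` fills the whole output, which mirrors launching one invocation per matrix entry.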

#### What's with all the `usize`?

@@ -181,7 +181,7 @@ Each workgroup, since it's only one thread (`#[spirv(compute(threads(1)))]`), pr
one `result[i, j]`.

To calculate the full matrix, we need to launch as many workgroups as there are entries in the
- matrix. Here we specify that (`UVec3::new(m * n, 1, 1)`) on the CPU:
+ `m * n` matrix. Here we specify that (`UVec3::new(m * n, 1, 1)`) on the CPU:

import { RustNaiveWorkgroupCount } from './snippets/naive.tsx';

@@ -308,6 +308,14 @@ complete runnable code can be [found on
GitHub](https://github.com/Rust-GPU/rust-gpu.github.io/tree/main/blog/2024-11-21-optimizing-matrix-mul/code)
and you can run the benchmarks yourself with `cargo bench`.

+ :::tip
+
+ You can also check out real-world projects using Rust GPU such as
+ [`autograph`](https://github.com/charles-r-earp/autograph) and
+ [`renderling`](https://renderling.xyz/).
+
+ :::
+
## Reflections on porting to Rust GPU

Porting to Rust GPU went quickly, as the kernels Zach used were fairly simple. Most of
@@ -320,9 +328,11 @@ is not _great_ as it is still blog post code!

My background is not in GPU programming, but I do have Rust experience. I joined the
Rust GPU project because I tried to use standard GPU languages and knew there must be a
- better way. Writing these GPU kernels felt like writing any other Rust code (other than
- debugging, more on that later) which is a huge win to me. Not just the language itself,
- but the entire development experience.
+ better way.
+
+ Writing these GPU kernels felt like writing any other Rust code (other than debugging,
+ more on that later) which is a huge win to me. Not just the language itself, but the
+ entire development experience.

## Rust-specific party tricks

@@ -372,10 +382,10 @@ bug I couldn't figure out. GPU debugging tools are limited and `printf`-style de
often isn't available. But what if we could run the GPU kernel _on the CPU_, where we
have access to tools like standard debuggers and good ol' `printf`/`println`?

- With Rust GPU, this was straightforward. By using `cfg()` directives I made the
- GPU-specific annotations (`#[spirv(...)]`) disappear when compiling for the CPU. The
- result? The kernel became a regular Rust function. On the GPU, it behaves like a shader.
- On the CPU, it's just a function you can call directly.
+ With Rust GPU, this was straightforward. By using standard Rust `cfg()` directives I
+ made the GPU-specific annotations (`#[spirv(...)]`) disappear when compiling for the
+ CPU. The result? The kernel became a regular Rust function. On the GPU, it behaves like
+ a shader. On the CPU, it's just a function you can call directly.
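
One way this can look is sketched below. It assumes the annotations are gated with `cfg_attr` on `target_arch = "spirv"`; the post may use a different mechanism. The kernel `double_kernel`, its buffer binding, and the workgroup size are made up for the example. Only the CPU path is exercised here.

```rust
// Only import the attribute macro when compiling for the GPU target.
#[cfg(target_arch = "spirv")]
use spirv_std::spirv;

// When the target is not SPIR-V, every `cfg_attr` below expands to nothing,
// leaving an ordinary Rust function.
#[cfg_attr(target_arch = "spirv", spirv(compute(threads(64))))]
pub fn double_kernel(
    #[cfg_attr(target_arch = "spirv", spirv(local_invocation_index))] index: u32,
    #[cfg_attr(
        target_arch = "spirv",
        spirv(storage_buffer, descriptor_set = 0, binding = 0)
    )]
    data: &mut [f32],
) {
    let i = index as usize;
    // Guard against invocations past the end of the buffer.
    if i < data.len() {
        data[i] *= 2.0;
    }
}
```

On the CPU, the function can be called directly, for example `double_kernel(1, &mut buffer)`, and stepped through with a regular debugger.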

Here's what it looks like in practice using the 2D tiling kernel from before:

@@ -404,7 +414,7 @@ Testing the kernel in isolation is useful, but it does not reflect how the GPU e
it with multiple invocations across workgroups and dispatches. To test the kernel
end-to-end, I needed a test harness that simulated this behavior on the CPU.

- Building the harness was straightforward due to the borrow checker. By enforcing the
+ Building the harness was straightforward due to Rust. By enforcing the
same invariants as the GPU I could validate the kernel under the same conditions the GPU
would run it:

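
A minimal sketch of what such a harness could look like follows. It is not the post's harness: `dispatch_on_cpu` and `double_at` are hypothetical names, and the invocations run sequentially rather than in parallel. The global id is computed as workgroup id times workgroup size plus local id, as on the GPU.

```rust
/// Calls `kernel` once per invocation, passing the global invocation id.
/// `workgroups` is the dispatch size and `threads` is the workgroup size.
pub fn dispatch_on_cpu<F>(workgroups: [u32; 3], threads: [u32; 3], mut kernel: F)
where
    F: FnMut([u32; 3]),
{
    for wz in 0..workgroups[2] {
        for wy in 0..workgroups[1] {
            for wx in 0..workgroups[0] {
                for tz in 0..threads[2] {
                    for ty in 0..threads[1] {
                        for tx in 0..threads[0] {
                            kernel([
                                wx * threads[0] + tx,
                                wy * threads[1] + ty,
                                wz * threads[2] + tz,
                            ]);
                        }
                    }
                }
            }
        }
    }
}

/// A toy kernel used only to exercise the harness.
pub fn double_at(global_id: [u32; 3], input: &[f32], output: &mut [f32]) {
    let i = global_id[0] as usize;
    if i < input.len() && i < output.len() {
        output[i] = input[i] * 2.0;
    }
}
```

Because the closure passed to `dispatch_on_cpu` holds the only mutable borrow of the output buffer, the compiler rejects any attempt to read or write that buffer elsewhere while the dispatch is running.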
@@ -450,7 +460,7 @@ other Rust project.

This required no new tools or workflows. The tools I already knew worked seamlessly.
More importantly, this approach benefits anyone working on the project. Any Rust
- engineer can run these benchmarks with no additional setup--`cargo bench` is a standard
+ engineer can run these benchmarks with no additional setup: `cargo bench` is a standard
part of the Rust ecosystem.

### Lint
@@ -517,9 +527,9 @@ and `f64` without duplicating code, all while maintaining type safety and perfor
### Error handling with `Result`

Rust GPU also supports error handling using `Result`. Encoding errors in the type system
- makes it clear where things can go wrong and forces developers to handle those cases.
- This is particularly useful for validating kernel inputs or handling the many edge cases
- in GPU logic.
+ makes it clear where things can go wrong and forces you to handle those cases. This is
+ particularly useful for validating kernel inputs or handling the many edge cases in GPU
+ logic.
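
As an illustration of input validation with `Result`, here is a small sketch that checks buffer lengths against matrix dimensions. The error type `KernelInputError` and the function `validate_inputs` are hypothetical. The sketch runs on the CPU, and whether this exact shape compiles for the GPU target is not something the post confirms.

```rust
/// Ways the kernel inputs can be invalid (names assumed for this sketch).
#[derive(Debug, PartialEq)]
pub enum KernelInputError {
    ZeroDimension,
    LengthMismatch { expected: usize, actual: usize },
}

/// Checks that `a` (m x k) and `b` (k x n) have the lengths the dimensions imply.
pub fn validate_inputs(
    m: usize,
    k: usize,
    n: usize,
    a_len: usize,
    b_len: usize,
) -> Result<(), KernelInputError> {
    if m == 0 || k == 0 || n == 0 {
        return Err(KernelInputError::ZeroDimension);
    }
    if a_len != m * k {
        return Err(KernelInputError::LengthMismatch { expected: m * k, actual: a_len });
    }
    if b_len != k * n {
        return Err(KernelInputError::LengthMismatch { expected: k * n, actual: b_len });
    }
    Ok(())
}
```

A caller has to match on the returned value or propagate it with `?`, so a mismatched buffer cannot be silently ignored.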

### Iterators

@@ -535,12 +545,13 @@ future.

### Conditional compilation

- This kernel doesn't use conditional compilation, but it's a key feature of Rust that
- works with Rust GPU. With `#[cfg(...)]`, you can adapt kernels to different hardware or
- configurations without duplicating code. GPU languages like WGSL or GLSL offer
- preprocessor directives, but these tools lack standardization across projects. Rust GPU
- leverages the existing Cargo ecosystem, so conditional compilation follows the same
- standards all Rust developers already know.
+ While I briefly touched on it a couple of times, this kernel doesn't really show the
+ full power of conditional compilation. With `#[cfg(...)]` and [cargo
+ "features"](https://doc.rust-lang.org/cargo/reference/features.html), you can adapt
+ kernels to different hardware or configurations without duplicating code. GPU languages
+ like WGSL or GLSL offer preprocessor directives, but these tools lack standardization
+ across projects. Rust GPU leverages the existing Cargo ecosystem, so conditional
+ compilation follows the same standards all Rust developers already know.
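
A small sketch of the idea, assuming a hypothetical Cargo feature named `large_tiles` (declared as `large_tiles = []` under `[features]` in `Cargo.toml`). The constant `TILE_SIZE` and the helper `tiles_needed` are invented for the example.

```rust
// Pick the tile size at compile time based on a Cargo feature.
#[cfg(feature = "large_tiles")]
pub const TILE_SIZE: usize = 16;

#[cfg(not(feature = "large_tiles"))]
pub const TILE_SIZE: usize = 8;

/// Number of tiles needed to cover `len` elements, rounding up.
pub fn tiles_needed(len: usize) -> usize {
    (len + TILE_SIZE - 1) / TILE_SIZE
}
```

Building with `cargo build --features large_tiles` selects the larger tile size without touching the code that uses `TILE_SIZE`.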

## Come join us!

@@ -551,7 +562,8 @@ or get involved, check out the [`rust-gpu` repo on
GitHub](https://github.com/rust-gpu/rust-gpu).
<br />

- [^1]: Via [MoltenVK](https://github.com/KhronosGroup/MoltenVK)
- [^2]:
- Technically `wgpu` translates SPIR-V to GLSL or WGSL via
- [naga](https://github.com/gfx-rs/wgpu/tree/trunk/naga)
+ [^1]: Technically `wgpu` uses [MoltenVK](https://github.com/KhronosGroup/MoltenVK) or translates to Metal on macOS
+ [^2]: Technically `wgpu` uses [MoltenVK](https://github.com/KhronosGroup/MoltenVK) or translates to Metal on iOS
+ [^3]:
+ Technically `wgpu` translates SPIR-V to GLSL (WebGL) or WGSL (WebGPU) via
+ [naga](https://github.com/gfx-rs/wgpu/tree/trunk/naga) on the web