single pass scan group virtualization #2041

sortraev · 2023-11-03T15:16:26Z

This PR adds group virtualization to SegScan.SinglePass. The generated single-pass scan code now respects the suggested/requested num_groups/block_size. This in turn allows scanomaps with array construction in the map KernelBody, since memory expansion relies on this information.
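For readers unfamiliar with the term, here is a minimal sequential sketch of what group virtualization means in general: a fixed number of physical groups each loop over several virtual groups until the whole input is covered. This is only an illustration under assumptions. The function name `virtualized_scan`, the strided assignment of virtual groups to physical groups, and the use of addition as the scan operator are all hypothetical and are not taken from the generated code in this PR.

```python
def virtualized_scan(xs, num_groups, block_size):
    """Inclusive prefix sum in which num_groups 'physical' groups cover
    however many block_size-sized chunks the input requires.

    Hypothetical illustration only; not the code generated by Futhark.
    """
    n = len(xs)
    # Number of virtual groups needed to cover the input (ceiling division).
    num_virt_groups = (n + block_size - 1) // block_size
    out = [0] * n
    # prefixes[v] is the inclusive prefix at the end of virtual group v.
    prefixes = [None] * num_virt_groups

    # Each physical group runs this many sequential rounds.
    num_rounds = (num_virt_groups + num_groups - 1) // num_groups
    for r in range(num_rounds):
        # On a GPU the physical groups of one round would run concurrently;
        # here they are simulated one after the other.
        for phys in range(num_groups):
            virt = r * num_groups + phys  # assumed strided assignment
            if virt >= num_virt_groups:
                continue
            # Carry-in from the preceding virtual group (0 for the first).
            acc = prefixes[virt - 1] if virt > 0 else 0
            start = virt * block_size
            for i in range(start, min(n, start + block_size)):
                acc += xs[i]
                out[i] = acc
            prefixes[virt] = acc
    return out
```

With this shape, the launch configuration (`num_groups`, `block_size`) is independent of the input size, which is the property the description above refers to.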

Also:

- status flags (for the lookback step) are now initialized inside the kernel
- some light code refactoring
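As background on the status flags used by the lookback step, the following is a sequential sketch of the general decoupled look-back idea. It assumes the commonly described three-state flag scheme (not ready, aggregate available, inclusive prefix available). The names `FLAG_X`, `FLAG_A`, `FLAG_P` and `lookback_scan` are hypothetical, and the actual flag layout and initialization in SinglePass.hs may differ.

```python
FLAG_X, FLAG_A, FLAG_P = 0, 1, 2  # not ready / aggregate ready / prefix ready


def lookback_scan(xs, block_size):
    """Inclusive prefix sum using per-group status flags and a lookback.

    Hypothetical sequential simulation; not the code generated by Futhark.
    """
    n = len(xs)
    num_groups = (n + block_size - 1) // block_size
    # Flags are initialized as the first step of the scan itself, standing
    # in for "initialized inside the kernel" (an assumption of this sketch).
    flags = [FLAG_X] * num_groups
    aggregates = [0] * num_groups
    prefixes = [0] * num_groups

    def chunk(g):
        return xs[g * block_size:(g + 1) * block_size]

    # Phase 1: every group publishes the aggregate of its own chunk.
    # Group 0 has no predecessor, so its aggregate is already its prefix.
    for g in range(num_groups):
        aggregates[g] = sum(chunk(g))
        if g == 0:
            prefixes[0] = aggregates[0]
            flags[0] = FLAG_P
        else:
            flags[g] = FLAG_A

    # Phase 2: each group looks back over its predecessors. Groups are
    # visited in reverse order to exercise the aggregate-only path.
    out = [0] * n
    for g in reversed(range(num_groups)):
        carry = 0
        pred = g - 1
        while pred >= 0:
            # A real kernel would spin while the flag is FLAG_X.
            assert flags[pred] != FLAG_X
            if flags[pred] == FLAG_P:
                carry += prefixes[pred]  # full prefix found, stop looking
                break
            carry += aggregates[pred]    # only an aggregate, keep looking
            pred -= 1
        prefixes[g] = carry + aggregates[g]
        flags[g] = FLAG_P
        acc = carry
        for i, x in enumerate(chunk(g)):
            acc += x
            out[g * block_size + i] = acc
    return out
```

The sketch shows why the initial flag value matters: a group that reads a stale flag other than "not ready" would consume an aggregate or prefix that was never written.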

athas

Please also add a comment somewhere in the module (it can be in the header) outlining the virtualisation strategy.

src/Futhark/CodeGen/ImpGen/GPU/SegScan.hs

src/Futhark/CodeGen/ImpGen/GPU/SegScan/SinglePass.hs

athas · 2023-11-06T21:58:42Z

There is a style issue, but worse, it also appears some of the scan benchmarks fail with the CUDA backend. (There's also a bunch of other things that fail because we are moving servers - you'll have to pick out the interesting failures from the wreckage.)

sortraev · 2023-11-08T18:03:29Z

> There is a style issue, but worse, it also appears some of the scan benchmarks fail with the CUDA backend. (There's also a bunch of other things that fail because we are moving servers - you'll have to pick out the interesting failures from the wreckage.)

Yes, I see now that the benchmarks also fail when I run manually on the A100. I think the problem had to do with our change of the status flags initialization (we simply changed it to align with Cosmin's and our own prototype), but I'm not exactly sure why -- it was not a problem on our own 4090.

Anyway, I have pushed a commit which passes the benchmarks when I run them manually on the A100, so hopefully the bug has been squished (assuming this was the bug I was looking for; admittedly I don't see why it would be a bug). I will look into the other change requests and CI failures at a later time -- thanks for the comments.

single pass scan group virtualization

6c280fd

athas added the run-benchmarks label Nov 3, 2023

athas requested changes Nov 3, 2023

restore statusFlags initialization

5efb3a0

athas reviewed Nov 17, 2023

sortraev and others added 2 commits November 17, 2023 14:19

Address athas' change requests.

4d5979a

Added description of virtualisation strategy, as well as two variable renamings and a number of style fixes.

More style fixes.

5928623

Apparently I was using an older version of Ormolu.

athas merged commit f7a36ee into master Nov 17, 2023
24 checks passed

athas deleted the single-pass-scan-group-virt branch November 17, 2023 23:11

athas added a commit that referenced this pull request Nov 17, 2023

Note #2041.

aaf9939
