
[AUTOGENERATED] [release/2.6] [release/2.5] [ROCm] Indexing backward kernel improvements from mainline (multiple commits) #1942

Draft · wants to merge 1 commit into base: release/2.6
Conversation

@rocm-mici rocm-mici commented Mar 4, 2025

Cherry-pick of #1937
The first two commits are already in the 2.6 branch.


This patch makes several changes to the stride-1 backward indexing kernel:
- the scan across the `sorted_indices` array now happens in parallel across all lanes in the warp, so accesses to `sorted_indices` are fully coalesced.
- duplicate counting now happens in parallel: each lane in the warp counts the duplicates of a different `idx`.
- skipping is enabled during the duplicate count: when the number of duplicates is large, the count can skip 32 values at a time instead of stepping one element at a time.
- for a low number of duplicates (fewer than `warp-size`), only the tail reduction is performed. This avoids a wasteful parallel reduction across the warp, which would only add zero values in this case.
- for a high number of duplicates (more than `warp-size`), the full warp of lanes still computes the reduced value with as much parallelism as possible. All lanes stick around and cooperatively execute the reduction when a single `idx` has a very large number of duplicates (a duplicate spike). To make this possible, shared memory passes the duplicate count computed in parallel in the first part of the kernel to the cooperative reduction part of the kernel.

On examples extracted from workloads, these changes show a 3.6x to 10x speed-up.

co-author: Hashem Hashemi <[email protected]>

Pull Request resolved: pytorch#146420
Approved by: https://github.com/pruthvistony, https://github.com/jeffdaily

rocm-repo-management-api bot commented Mar 4, 2025

Jenkins build for 8448168b8c9cd6c2b8f1bc2db23b2b533875748b commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

@jerrymannil jerrymannil requested a review from pruthvistony March 4, 2025 22:46

rocm-repo-management-api bot commented Mar 5, 2025

Jenkins build for 8448168b8c9cd6c2b8f1bc2db23b2b533875748b commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

@pruthvistony
Collaborator

@jerrymannil, what is the reason to keep this in draft?
