Skip to content

Commit

Permalink
[AMDGPU][LTO] Disable adding AMDGPULowerModuleLDSPass to LTO
Browse files Browse the repository at this point in the history
This is a workaround to resolve https://ontrack-internal.amd.com/browse/SWDEV-502923

The AMDGPULowerModuleLDS pass is run 2 times. Both times are run on the
combined/linked module for the entire input during LTO.

The first run happens when lto::backend executes the the optimization
pipeline (through calling opt)
(https://github.com/llvm/llvm-project/blob/main/llvm/lib/LTO/LTOBackend.cpp#L551)
and the second run happens when lto::backend later calls
codegen (https://github.com/llvm/llvm-project/blob/main/llvm/lib/LTO/LTOBackend.cpp#L558).

A crash happens in the second run because between the first and second
runs, a new GV is added to the combined module:

```
@llvm.amdgcn.kernel.__omp_offloading_fc00_600030__QQmain_l5.lds = internal addrspace(3) global %llvm.amdgcn.kernel.__omp_offloading_fc00_600030__QQmain_l5.lds.t poison, align 16, !absolute_symbol !0
```

This is the only absolute variable and it is not there yet during the
first run of AMDGPULowerModuleLDS.

In addition to the above absolute variable, we have other GVs that
remain in the combined module (during the second run of the pass):
```
@_ZZ35__kmpc_nvptx_teams_reduce_nowait_v2E14ChunkTeamCount = internal unnamed_addr addrspace(3) global i32 undef, align 4
@_ZZ35__kmpc_nvptx_teams_reduce_nowait_v2E5Bound = internal unnamed_addr addrspace(3) global i32 undef, align 4
```
which are not absolute, causing the issue (https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AMDGPU/AMDGPUMemoryUtils.cpp#L250).

Disussing this with Jon Chesterfield, running the pass twice is actually
incorrect (it should be run only once during code-gen, the first run of
the pass
creates
"@llvm.amdgcn.kernel.__omp_offloading_fc00_600030__QQmain_l5.lds" which
makes the second run fail). Commenting out these 2 lines "fixes" the
issue:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp#L863-L864.
We raised an issue upstream to further discuss proper fixes:
llvm#122891.
  • Loading branch information
ergawy committed Jan 14, 2025
1 parent 31f64e5 commit 8cf46a5
Show file tree
Hide file tree
Showing 2 changed files with 9 additions and 2 deletions.
10 changes: 8 additions & 2 deletions llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -856,8 +856,14 @@ void AMDGPUTargetMachine::registerPassBuilderCallbacks(PassBuilder &PB) {
// module is partitioned for codegen.
if (EnableSwLowerLDS)
PM.addPass(AMDGPUSwLowerLDSPass(*this));
if (EnableLowerModuleLDS)
PM.addPass(AMDGPULowerModuleLDSPass(*this));

// Most likely, adding this pass here is incorrect. Commenting out on
// ATD for now until we resolve the issue upstream. See:
// https://github.com/llvm/llvm-project/issues/122891 for the issue and
// https://ontrack-internal.amd.com/browse/SWDEV-502923?focusedId=17904500&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17904500
// for an explanation why this is likely wrong.
//if (EnableLowerModuleLDS)
// PM.addPass(AMDGPULowerModuleLDSPass(*this));
if (Level != OptimizationLevel::O0) {
// Do we really need internalization in LTO?
if (InternalizeSymbols) {
Expand Down
1 change: 1 addition & 0 deletions llvm/test/CodeGen/AMDGPU/lto-lower-module-lds.ll
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
; XFAIL: *

; Default O0
; RUN: opt -mtriple=amdgcn-- -mcpu=gfx1030 %s -o %t.bc
Expand Down

0 comments on commit 8cf46a5

Please sign in to comment.