Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AMDGPU] Fix unreachable reg bit width #122107

2 changes: 2 additions & 0 deletions llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2487,6 +2487,7 @@ unsigned getRegBitWidth(unsigned RCID) {
case AMDGPU::AReg_128_Align2RegClassID:
case AMDGPU::AV_128RegClassID:
case AMDGPU::AV_128_Align2RegClassID:
case AMDGPU::SReg_128_XNULLRegClassID:
return 128;
case AMDGPU::SGPR_160RegClassID:
case AMDGPU::SReg_160RegClassID:
Expand Down Expand Up @@ -2523,6 +2524,7 @@ unsigned getRegBitWidth(unsigned RCID) {
case AMDGPU::AReg_256_Align2RegClassID:
case AMDGPU::AV_256RegClassID:
case AMDGPU::AV_256_Align2RegClassID:
case AMDGPU::SReg_256_XNULLRegClassID:
return 256;
case AMDGPU::SGPR_288RegClassID:
case AMDGPU::SReg_288RegClassID:
Expand Down
15 changes: 15 additions & 0 deletions llvm/test/CodeGen/AMDGPU/add-xnull-regclass-bitwidth.mir
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rename test file

# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a -run-pass=postmisched -o - %s | FileCheck %s
---
name: test_xnull_256
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also the 128 case.

Copy link
Contributor Author

@Shoreshen Shoreshen Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @arsenm , to trigger the unreachable during postmisched pass it has to be MIMG instruction, but I cannot find MIMG instruction uses SReg_128_XNULL

I also tried to find other places that may use the bit width function:

  1. used in selectCOPY
  2. used in foldOperand, but for immediate only
  3. used in canInsertSelect test if it is ok to insert a select instruction
  4. used in buildSpillLoadStore but it seems like the target registers are all caller/callee saved regs
  5. used in printRegularOperand but only for Op.isDFPImm()
  6. used in getRegOperandSize, which is used in validateMIMGAddrSize and validateMIMGDataSize

The only possibility is getRegSplitParts, but it is used by many function, I need help on this since I do not familiar with the fucntion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect the selectCOPY case would be most straightforward

Copy link
Contributor Author

@Shoreshen Shoreshen Jan 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @arsenm , to successfully trigger bit width function in selectCOPY, the dst operand must:

  1. pass isVCC function for the dst register of COPY
  2. fails isVCC function for src register of the COPY

I think the bit width of src and dst must be the same, otherwise the copy mismatch type verification error will trigger

Thus to use SReg_128_XNULL as src reg class, the dst must also be 128 bit width

To pass the isVCC function for dst register, it must be:

  1. Non physical register (which I cannot simply use $vcc as dst)
  2. If it is assigned register class, the bit width must be 1
  3. If it is register bank, the bank id must equals to AMDGPU::VCCRegBankID

For COPY not being assigned for reg class, I think the dst of COPY must not be used for any instruction.

Because each instruction's each input should have a reg class bind with it, and ti will try to assign reg class accordingly.

But if COPY is not used by any instruction, then it will not be selected, since it most probably will not pass !isTriviallyDead(MI, MRI) function check.

So it seems like I ran out of my ways to produce a 128 case, would there be any other possibilities?

Thanks a lot!


BTW, I also tried to remove the AMDGPU::getRegBitWidth(SrcRC->getID()) == 16 in selectCOPY and tried to find any test failed after then. but it seems even without the judgement, all tests passed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original problem was triggered by SReg_256_XNULL. So maybe we just fix it for that reg class?

body: |
bb.0:
; CHECK-LABEL: name: test_xnull_256
; CHECK: IMAGE_STORE_V4_V2_gfx90a $vgpr0_vgpr1_vgpr2_vgpr3, killed $vgpr8_vgpr9, killed $sgpr24_sgpr25_sgpr26_sgpr27_sgpr28_sgpr29_sgpr30_sgpr31, 15, -1, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store (s128), addrspace 8)
; CHECK-NEXT: $vgpr2 = V_LSHRREV_B32_e32 4, killed $vgpr2, implicit $exec
IMAGE_STORE_V4_V2_gfx90a $vgpr0_vgpr1_vgpr2_vgpr3, $vgpr8_vgpr9, $sgpr24_sgpr25_sgpr26_sgpr27_sgpr28_sgpr29_sgpr30_sgpr31, 15, -1, 0, 0, 0, 0, 0, implicit $exec :: (dereferenceable store (s128), addrspace 8)
$vgpr2 = V_LSHRREV_B32_e32 4, $vgpr2, implicit $exec
...


# FIXME: We need xnull_128 test case (which reach unreachable in function AMDGPU::getRegBitWidth). Currently cannot find one
Loading