Support for allowing direct VEXTRACT to 20-bit registers #233

abhinay-anubola · 2024-11-08T09:47:21Z

This update introduces a new generic combiner that simplifies the sequence sext(trunc x) directly to x when applicable.
It also adds a VExtract combiner that enables the generic combiner above, which gives us a direct 20-bit vextract.
The MachineVerifier has been updated to allow G_AIE_SEXT_EXTRACT_VECTOR_ELT and G_AIE_ZEXT_EXTRACT_VECTOR_ELT to accept 20-bit outputs.
Additionally, tests have been added and updated to reflect these functional changes.
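
For readers who want the arithmetic behind the sext(trunc x) -> x fold, here is a minimal standalone sketch. It is not the combiner code from this PR. It assumes that "when applicable" means x is already known to be sign-extended from the narrow width (for example through a G_ASSERT_SEXT, as in the tests), and the helper names signExtend, sextOfTrunc and isSignExtendedFrom are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of LLVM: sign-extend the low `bits` bits
// of `value` (1 <= bits <= 32) to a 32-bit signed integer.
constexpr int32_t signExtend(uint32_t value, unsigned bits) {
  const uint32_t mask = (bits >= 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
  const uint32_t signBit = 1u << (bits - 1);
  const uint32_t low = value & mask; // this masking models G_TRUNC
  // Flip the sign bit, then subtract it again (modular arithmetic).
  return static_cast<int32_t>((low ^ signBit) - signBit);
}

// Models sext(trunc x): drop everything above `narrow` bits, then
// sign-extend back to the original width.
constexpr int32_t sextOfTrunc(int32_t x, unsigned narrow) {
  return signExtend(static_cast<uint32_t>(x), narrow);
}

// The property an assert-sext style hint promises: x already equals the
// sign-extension of its low `narrow` bits.
constexpr bool isSignExtendedFrom(int32_t x, unsigned narrow) {
  return signExtend(static_cast<uint32_t>(x), narrow) == x;
}
```

When isSignExtendedFrom(x, 16) holds, sextOfTrunc(x, 16) returns x unchanged, so the pair of instructions can be dropped. For a value such as 40000, which does not fit in a signed 16-bit range, the fold would change the result and must not fire.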

llvm/lib/Target/AIE/AIECombinerHelper.cpp

llvm/test/CodeGen/AIE/aie2/GlobalISel/prelegalizercombiner-s20-narrowing.mir

krishnamtibrewala · 2024-11-08T17:51:23Z

Given that you mentioned there is no QoR gain, I would recommend taking another look at the instructions that consume S20-typed registers.
The optimization traces back from an instruction that consumes an S20 value, and such a consumer might not be captured by the isNativeS20Consumer function.

llvm/lib/Target/AIE/AIECombinerHelper.cpp

llvm/lib/Target/AIE/AIE2PreLegalizerCombiner.cpp

llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp

andcarminati · 2024-12-16T13:48:46Z

llvm/lib/Target/AIE/AIE2PreLegalizerCombiner.cpp

+  MIRBuilder.buildAssertInstr(AssertExtOpcode, ExtReg20Bit, DstReg20Bit,
+                              SrcEltSize);
+  MIRBuilder.buildInstr(ExtOpcode, {DstReg}, {ExtReg20Bit});
+  MI.eraseFromParent();


Now we are safe ;-)

llvm/lib/Target/AIE/AIE2PreLegalizerCombiner.cpp

gbossu · 2024-12-30T09:58:57Z

llvm/test/CodeGen/RISCV/GlobalISel/jumptable.ll

@@ -192,7 +190,6 @@ define void @above_threshold(i32 signext %in, ptr %out) nounwind {
 ; RV64I-PIC-LABEL: above_threshold:
 ; RV64I-PIC:       # %bb.0: # %entry
 ; RV64I-PIC-NEXT:    li a2, 5
-; RV64I-PIC-NEXT:    sext.w a0, a0


Nice! Did you check all targets?

Yes, I have checked all targets.

gbossu · 2024-12-30T10:10:19Z

llvm/lib/Target/AIE/AIE2PreLegalizerCombiner.cpp

+/// To :   %9:_(s20) = G_AIE_SEXT_EXTRACT_VECTOR_ELT %2(<32 x s16>), %0(s32)
+///        %10:_(s20) = G_ASSERT_[S/Z]EXT %9, 16
+///        %4:_(s16) = G_TRUNC %10(s20)
+///        %5:_(s20) = G_[S/Z]EXT %4(s16)


Do we need to change the return types? I would expect that we only need to add a %10:_(s32) = G_ASSERT_[S/Z]EXT %9, 16 and keep the rest intact thanks to the new sext(trunc x) combiner you added previously.

Yes, we need to change the return types. The pattern written in the new sext(trunc x) combiner would not match in this case, because m_SpecificType tries to match s20 while the return type here would be s32.
mi_match(SrcReg, MRI, m_GTrunc(m_all_of(m_Reg(Reg), m_SpecificType(DstTy))))
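
A toy model of why the type has to line up, as a sketch only. The Op and Def types and the matchesExtOfTrunc function are hypothetical stand-ins and not LLVM APIs. They only mirror the shape of the pattern quoted above, where the input of the G_TRUNC must have exactly the destination type of the extension.

```cpp
#include <cassert>

// Hypothetical stand-ins for a defining instruction; not LLVM types.
enum class Op { Other, Trunc };

struct Def {
  Op op;            // opcode that defines the extension's source register
  unsigned srcBits; // scalar width of that opcode's input operand
};

// Mirrors the shape of
//   m_GTrunc(m_all_of(m_Reg(Reg), m_SpecificType(DstTy)))
// The source must be a truncate whose input width equals the width of
// the extension's destination.
bool matchesExtOfTrunc(const Def &extSrc, unsigned extDstBits) {
  return extSrc.op == Op::Trunc && extSrc.srcBits == extDstBits;
}
```

With an s20 extract feeding the truncate, matchesExtOfTrunc(Def{Op::Trunc, 20}, 20) is true and the fold can apply. With an s32 extract, matchesExtOfTrunc(Def{Op::Trunc, 32}, 20) is false, which is the mismatch described in the reply.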

andcarminati

LGTM.

konstantinschwarz · 2025-02-20T15:35:43Z

llvm/test/CodeGen/AMDGPU/GlobalISel/combine-sext-trunc.mir

+    %var:_(s32) = COPY $vgpr0
+    %assert:_(s32) = G_ASSERT_SEXT %var, 16
+    %trunc:_(s16) = G_TRUNC %assert(s32)
+    %sext:_(s64) = G_SEXT %trunc(s16)


Nit: I guess we could still eliminate the G_TRUNC, and change G_SEXT to extend from 32 -> 64?
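
To illustrate the suggestion, here is a small standalone sketch, not code from the PR, that compares the two forms on plain integers. The function names are hypothetical. The narrowing cast is modular on mainstream compilers and guaranteed to be so since C++20.

```cpp
#include <cassert>
#include <cstdint>

// Models the current test output:
//   %trunc:_(s16) = G_TRUNC %assert(s32)
//   %sext:_(s64)  = G_SEXT %trunc(s16)
int64_t sextViaTrunc16(int32_t x) {
  const int16_t narrow = static_cast<int16_t>(x); // keep the low 16 bits
  return static_cast<int64_t>(narrow);            // sign-extend 16 -> 64
}

// Models the suggested form, extending directly from 32 to 64 bits:
//   %sext:_(s64) = G_SEXT %assert(s32)
int64_t sextDirect32(int32_t x) {
  return static_cast<int64_t>(x);
}
```

Both functions agree whenever x is already sign-extended from 16 bits, which is what G_ASSERT_SEXT %var, 16 states. They differ for inputs such as 70000 that do not fit in 16 signed bits, so the rewrite relies on the assert being present.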

konstantinschwarz · 2025-02-20T15:44:04Z

llvm/lib/Target/AIE/aie2p/AIE2PInstrInfo.cpp

-    ErrInfo = "Expected 32/64bit scalar destination";
-    return MRI.getType(MI.getOperand(0).getReg()) == LLT::scalar(32) ||
-           MRI.getType(MI.getOperand(0).getReg()) == LLT::scalar(64);
+    ErrInfo = "Expected 20/32/64bit scalar destination";


Nit: we haven't changed anything for AIE2p, so we still expect only 32-/64-bit destination types.

konstantinschwarz · 2025-02-20T15:53:50Z

llvm/lib/Target/AIE/AIE2PreLegalizerCombiner.cpp

+  const LLT S20 = LLT::scalar(20);
+  Register DstReg20Bit = MRI.createGenericVirtualRegister(S20);
+  Register ExtReg20Bit = MRI.createGenericVirtualRegister(S20);
+  MachineIRBuilder MIRBuilder(MI);


The Combiner class already has a MachineIRBuilder member B. We should use it here to make use of potential CSE opportunities: initialize it with B.setInstrAndDebugLoc(MI); and then use it below.

konstantinschwarz · 2025-02-20T15:59:25Z

llvm/lib/Target/AIE/AIE2InstrInfo.cpp

@@ -159,8 +159,9 @@ bool AIE2InstrInfo::verifyGenericInstruction(const MachineInstr &MI,
        return false;
      }
    }
-    ErrInfo = "Expected 32bit scalar destination";
-    return MRI.getType(MI.getOperand(0).getReg()) == LLT::scalar(32);
+    ErrInfo = "Expected 20/32bit scalar destination";


Can you please add instruction selection tests for s20 = G_AIE_[SZ]EXT_EXTRACT_VECTOR_ELT now that we made these new types legal?

abhinay-anubola requested review from abnikant, andcarminati, gbossu, khallouh, konstantinschwarz, martien-de-jong, SagarMaheshwari99 and stephenneuendorffer as code owners November 8, 2024 09:47

andcarminati reviewed Nov 8, 2024


krishnamtibrewala reviewed Nov 8, 2024