[AIE2P] Legalize and select VMUL.f from G_FMUL #360

khallouh · 2025-02-17T14:51:01Z

No description provided.

llvm/lib/Target/AIE/AIELegalizerHelper.cpp

llvm/test/CodeGen/AIE/aie2p/GlobalIsel/legalize-fmul.mir

martien-de-jong · 2025-02-21T09:23:04Z

llvm/lib/Target/AIE/aie2p/AIE2PInstrPatterns.td

@@ -40,6 +40,8 @@ class VecConf {
  int BMODE_16x16_b = 1;
  int BMODE_32x16   = 0;


Funny to have aliases here.

martien-de-jong · 2025-02-21T09:23:09Z

llvm/lib/Target/AIE/aie2p/AIE2PInstrPatterns.td

@@ -40,6 +40,8 @@ class VecConf {
  int BMODE_16x16_b = 1;
  int BMODE_32x16   = 0;

+  int VARIANT_BF16xBF16_1_elem_1 = 1;


Sounds as if there are more variants. List them all in one go?

I could, but I'm not sure we will ever be able to use all of them in any patterns.

It looks like the translation of a hardware enumeration into tablegen speak. I'm hoping that one day we'll have a single point of definition for these, and the full list would make them more recognisable.

martien-de-jong · 2025-02-21T09:55:09Z

llvm/lib/Target/AIE/aie2p/AIE2PInstrPatterns.td

@@ -59,6 +61,7 @@ class VecConf {
 }

 def accfp32_vecconf : VecConf { let amode = AMODE_FP32; let bmode = BMODE_16x16; }
+def mulbf16_vecconf : VecConf { let amode = AMODE_FP32; let bmode = BMODE_16x16; let cmode = VARIANT_BF16xBF16_1_elem_1; }


Since this is a local definition, I wouldn't mind using CMODE as prefix.

martien-de-jong · 2025-02-21T09:57:34Z

llvm/lib/Target/AIE/aie2p/AIE2PInstrPatterns.td

+                sub_1024_acc_hi)), 
+            sub_512_hi))>;
+
+def : Pat<(v32bf16 (fmul v32bf16:$vec1, v32bf16:$vec2)),


This isn't a standard legalization?

For this case, I don't know of any, but for the wider v64bf16 case above we could possibly use .fewerElements to keep only one pattern. I will try it.
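To make the .fewerElements idea concrete, here is a small standalone C++ model of splitting one 64-lane multiply into two 32-lane multiplies and concatenating the results. It is only a sketch of the concept: `mulNarrow` and `mulWideBySplitting` are hypothetical names, lanes are plain `float` rather than bf16 to keep it short, and nothing here is LLVM or AIE code.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

constexpr std::size_t NarrowLanes = 32; // models v32bf16 (lanes are float here)
using NarrowVec = std::array<float, NarrowLanes>;
using WideVec = std::array<float, 2 * NarrowLanes>; // models v64bf16

// Stand-in for the only multiply assumed to be directly selectable.
static NarrowVec mulNarrow(const NarrowVec &A, const NarrowVec &B) {
  NarrowVec R{};
  for (std::size_t I = 0; I < NarrowLanes; ++I)
    R[I] = A[I] * B[I];
  return R;
}

// Model of "fewer elements": split both operands into halves, multiply
// each half once with the narrow operation, then concatenate the halves.
static WideVec mulWideBySplitting(const WideVec &A, const WideVec &B) {
  WideVec R{};
  for (std::size_t Half = 0; Half < 2; ++Half) {
    const std::size_t Offset = Half * NarrowLanes;
    NarrowVec PA{}, PB{};
    std::copy_n(A.begin() + Offset, NarrowLanes, PA.begin());
    std::copy_n(B.begin() + Offset, NarrowLanes, PB.begin());
    const NarrowVec PR = mulNarrow(PA, PB);
    std::copy(PR.begin(), PR.end(), R.begin() + Offset);
  }
  return R;
}
```

With this shape, only the narrow multiply needs a selection pattern; the wide case is reduced to it before selection.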

niwinanto · 2025-02-21T10:01:22Z

llvm/lib/Target/AIE/aie2p/AIE2PLegalizerInfo.cpp

@@ -225,12 +225,17 @@ AIE2PLegalizerInfo::AIE2PLegalizerInfo(const AIE2PSubtarget &ST)

  getActionDefinitionsBuilder(G_FABS).customFor({S16, S32, S64}).scalarize(0);

+  getActionDefinitionsBuilder(G_FMUL)
+      .legalFor({V64S16, V32S16})
+      .customFor({S16})


Do we need to retain .clampScalar?

We have custom legalization for S16 now, so there is no need to clamp it to S32/S64. Any other scalar should be illegal.

I mean .clampScalar(0, S16, S64)

But why? The only float type of 16 bits or fewer that we have is bfloat (aka S16).

True. Just pointing out that we deviate from the old behavior: s128 to s64, or s8 to s32. But you are right, it does not make sense for these types.
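For readers unfamiliar with the rule under discussion: clampScalar(TypeIdx, MinTy, MaxTy) widens scalars narrower than MinTy and narrows scalars wider than MaxTy. The effect on bit widths can be modeled as below. `clampScalarBits` is a hypothetical helper written for this note, not an LLVM API, and the S32/S64 bounds in the usage are taken from the comments above.

```cpp
// Hypothetical model of the width produced by clamping a scalar type
// into the range [MinBits, MaxBits]; not part of LLVM.
constexpr unsigned clampScalarBits(unsigned Bits, unsigned MinBits,
                                   unsigned MaxBits) {
  return Bits < MinBits ? MinBits : (Bits > MaxBits ? MaxBits : Bits);
}

// Old behavior as described in the thread: clamp to S32..S64.
static_assert(clampScalarBits(8, 32, 64) == 32, "s8 is widened to s32");
static_assert(clampScalarBits(128, 32, 64) == 64, "s128 is narrowed to s64");
```

Without the clamp, s8 and s128 G_FMUL operands are simply left without a legalization rule, which is the deviation being pointed out.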

niwinanto · 2025-02-21T10:10:33Z

llvm/test/CodeGen/AIE/GlobalISel/legalize-fmul.mir

@@ -5,7 +5,6 @@
 # (c) Copyright 2024 Advanced Micro Devices, Inc. or its affiliates

 # RUN: llc -mtriple aie2 -run-pass=legalizer %s -verify-machineinstrs -o - | FileCheck -DVER=2 --check-prefix=COMMON --check-prefix=AIE2 %s
-# RUN: llc -mtriple aie2p -run-pass=legalizer %s -verify-machineinstrs -o - | FileCheck -DVER=2p --check-prefix=COMMON --check-prefix=AIE2P %s


We still have AIE2P check lines in the test. You could also remove -DVER=2 --check-prefix=COMMON

niwinanto · 2025-02-21T10:11:52Z

llvm/test/CodeGen/AIE/aie2p/GlobalIsel/legalize-fmul.mir

+#
+# (c) Copyright 2024 Advanced Micro Devices, Inc. or its affiliates
+
+# RUN: llc -mtriple aie2p -run-pass=legalizer %s -verify-machineinstrs -o - | FileCheck %s


Would be nice to include the libcall tests as well.

We didn't have them before, but I will add them while I'm at it.

niwinanto · 2025-02-21T10:34:16Z

llvm/lib/Target/AIE/aie2p/AIE2PLegalizerInfo.cpp

@@ -225,12 +225,17 @@ AIE2PLegalizerInfo::AIE2PLegalizerInfo(const AIE2PSubtarget &ST)

  getActionDefinitionsBuilder(G_FABS).customFor({S16, S32, S64}).scalarize(0);

+  getActionDefinitionsBuilder(G_FMUL)
+      .legalFor({V64S16, V32S16})
+      .customFor({S16})


It would be nice to have a comment explaining why we customize this for s16. I don't really get the context here.

We don't have an instruction to multiply bf16 scalars. Instead of using an inefficient and potentially unsafe libcall (e.g. in the case of hardware loops), we need custom legalization: insert the bf16 scalar into a vector, perform the element-wise multiplication with VMUL.f, and extract the bf16 scalar again. I can add this explanation as a comment.

It's the same as for FADD / FSUB. We implement a scalar multiplication by a full element by element vector mul.
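As a rough illustration of the insert / vector-multiply / extract sequence described above, here is a standalone C++ model. It is not the code in AIELegalizerHelper.cpp: the names (`BF16`, `vmulModel`, `scalarFMulViaVector`), the 32-lane width, and the truncating float-to-bf16 conversion are assumptions made for the sketch, and the actual rounding behavior of VMUL.f is not modeled.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical model type: a bf16 value stored as its raw 16 bits.
using BF16 = std::uint16_t;
constexpr std::size_t NumLanes = 32; // assumed lane count (v32bf16)
using Vec = std::array<BF16, NumLanes>;

// bf16 has the same layout as the upper half of an IEEE-754 binary32.
static float toFloat(BF16 V) {
  const std::uint32_t Bits = static_cast<std::uint32_t>(V) << 16;
  float F;
  std::memcpy(&F, &Bits, sizeof(F));
  return F;
}

// Assumption: truncate the low 16 bits; hardware rounding may differ.
static BF16 fromFloat(float F) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  return static_cast<BF16>(Bits >> 16);
}

// Stand-in for the element-wise vector multiply.
static Vec vmulModel(const Vec &A, const Vec &B) {
  Vec R{};
  for (std::size_t I = 0; I < NumLanes; ++I)
    R[I] = fromFloat(toFloat(A[I]) * toFloat(B[I]));
  return R;
}

// Scalar multiply: insert at lane 0, multiply all lanes, extract lane 0.
static BF16 scalarFMulViaVector(BF16 A, BF16 B) {
  Vec VA{}, VB{}; // the other lanes are zero here; their results are unused
  VA[0] = A;
  VB[0] = B;
  return vmulModel(VA, VB)[0];
}
```

For example, `scalarFMulViaVector(0x3FC0, 0x4000)` multiplies 1.5 by 2.0 and yields `0x4040`, the bf16 encoding of 3.0. Only lane 0 carries a meaningful value; the remaining lanes are computed and discarded, which is the cost of reusing the vector instruction.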

martien-de-jong · 2025-02-21T10:51:15Z

llvm/lib/Target/AIE/AIELegalizerHelper.cpp

+  const unsigned InsertEltOpc =
+      ST.getInstrInfo()->getGenericInsertVectorEltOpcode();
+
+  const Register IdxReg = MIRBuilder.buildConstant(S32, 0).getReg(0);


Wouldn't it be cheaper to broadcast? Or is this picked up by a push.lo?

andcarminati · 2025-02-21T14:42:23Z

llvm/lib/Target/AIE/aie2p/AIE2PInstrPatterns.td

@@ -222,6 +225,26 @@ def : Pat<(fadd ACC2048:$acc1, ACC2048:$acc2),
 def : Pat<(fsub ACC2048:$acc1, ACC2048:$acc2),
          (VSUB_f_vmac_cm2_add_reg ACC2048:$acc1, ACC2048:$acc2, (i32 accfp32_vecconf.ConfBits))>;

+// MUL
+def : Pat<(v64bf16 (fmul v64bf16:$vec1, v64bf16:$vec2)),


Check: We are performing the same multiplication twice: once to extract lo and once to extract hi. I guess we cannot express an optimized reuse of the same VMUL here, right?

khallouh requested review from abhinay-anubola, abnikant, andcarminati, F-Stuckmann, gbossu, katerynamuts, konstantinschwarz, martien-de-jong, niwinanto, SagarMaheshwari99 and stephenneuendorffer as code owners February 17, 2025 14:51

khallouh marked this pull request as draft February 17, 2025 14:51

khallouh force-pushed the hamza.fmul branch from a8dac06 to e795516 Compare February 17, 2025 14:51

andcarminati reviewed Feb 17, 2025

llvm/lib/Target/AIE/AIELegalizerHelper.cpp (outdated, resolved)

andcarminati reviewed Feb 17, 2025

llvm/lib/Target/AIE/AIELegalizerHelper.cpp (outdated, resolved)

khallouh force-pushed the hamza.fmul branch from e795516 to f545a2c Compare February 19, 2025 14:42

khallouh commented Feb 19, 2025

llvm/test/CodeGen/AIE/aie2p/GlobalIsel/legalize-fmul.mir (outdated, resolved)

khallouh force-pushed the hamza.fmul branch 2 times, most recently from 02cc673 to 872e379 Compare February 20, 2025 17:37

khallouh added 2 commits February 20, 2025 18:39

[AIE2P] Instruction select support for vector G_FMUL

d9dec55

[AIE2P] legalizer support for G_FMUL

0416372

khallouh force-pushed the hamza.fmul branch from 872e379 to 0416372 Compare February 20, 2025 17:39

khallouh marked this pull request as ready for review February 20, 2025 17:39

khallouh changed the title ~~[AIE2P] [WIP] Legalize and select VMUL.f from G_FMUL~~ [AIE2P] Legalize and select VMUL.f from G_FMUL Feb 20, 2025

martien-de-jong reviewed Feb 21, 2025

niwinanto reviewed Feb 21, 2025

martien-de-jong reviewed Feb 21, 2025

andcarminati reviewed Feb 21, 2025
