Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SelectionDAG][X86] Widen <2 x T> vector types for atomic load #120598

Open
wants to merge 1 commit into
base: users/jofrn/spr/main/a06a5cc6
Choose a base branch
from

Conversation

jofrn
Copy link
Contributor

@jofrn jofrn commented Dec 19, 2024

Vector types of 2 elements must be widened. This change does this
for vector types of atomic load in SelectionDAG
so that it can translate aligned vectors of >1 size. Also,
it also adds Pats to remove an extra MOV.


Stack:

⚠️ Part of a stack created by spr. Do not merge manually using the UI - doing so may have unexpected results.

@llvmbot
Copy link
Member

llvmbot commented Dec 19, 2024

@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-llvm-selectiondag

Author: None (jofrn)

Changes

load atomic &lt;n x T&gt; is not valid. This change widens
vector types of atomic load in SelectionDAG
so that it can translate aligned vectors of >1 size.


Stack:

  • #120598 ⬅
  • #120387
  • #120386
  • #120385
  • #120384

⚠️ Part of a stack created by spr. Do not merge manually using the UI - doing so may have unexpected results.


Full diff: https://github.com/llvm/llvm-project/pull/120598.diff

3 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h (+3-1)
  • (modified) llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp (+36-3)
  • (modified) llvm/test/CodeGen/X86/atomic-load-store.ll (+10)
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h b/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
index b81c9f87cb27d7..22b7c15f8768ae 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
@@ -1046,6 +1046,7 @@ class LLVM_LIBRARY_VISIBILITY DAGTypeLegalizer {
   SDValue WidenVecRes_EXTRACT_SUBVECTOR(SDNode* N);
   SDValue WidenVecRes_INSERT_SUBVECTOR(SDNode *N);
   SDValue WidenVecRes_INSERT_VECTOR_ELT(SDNode* N);
+  SDValue WidenVecRes_ATOMIC_LOAD(AtomicSDNode* N);
   SDValue WidenVecRes_LOAD(SDNode* N);
   SDValue WidenVecRes_VP_LOAD(VPLoadSDNode *N);
   SDValue WidenVecRes_VP_STRIDED_LOAD(VPStridedLoadSDNode *N);
@@ -1129,8 +1130,9 @@ class LLVM_LIBRARY_VISIBILITY DAGTypeLegalizer {
   /// resulting wider type. It takes:
   ///   LdChain: list of chains for the load to be generated.
   ///   Ld:      load to widen
+  template <typename T>
   SDValue GenWidenVectorLoads(SmallVectorImpl<SDValue> &LdChain,
-                              LoadSDNode *LD);
+                              T *LD, bool IsAtomic = false);
 
   /// Helper function to generate a set of extension loads to load a vector with
   /// a resulting wider type. It takes:
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
index c85e4ba2cfa5a7..4dfdd22ba27869 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
@@ -4515,6 +4515,9 @@ void DAGTypeLegalizer::WidenVectorResult(SDNode *N, unsigned ResNo) {
     break;
   case ISD::EXTRACT_SUBVECTOR: Res = WidenVecRes_EXTRACT_SUBVECTOR(N); break;
   case ISD::INSERT_VECTOR_ELT: Res = WidenVecRes_INSERT_VECTOR_ELT(N); break;
+  case ISD::ATOMIC_LOAD:
+    Res = WidenVecRes_ATOMIC_LOAD(cast<AtomicSDNode>(N));
+    break;
   case ISD::LOAD:              Res = WidenVecRes_LOAD(N); break;
   case ISD::STEP_VECTOR:
   case ISD::SPLAT_VECTOR:
@@ -5901,6 +5904,30 @@ SDValue DAGTypeLegalizer::WidenVecRes_INSERT_VECTOR_ELT(SDNode *N) {
                      N->getOperand(1), N->getOperand(2));
 }
 
+SDValue DAGTypeLegalizer::WidenVecRes_ATOMIC_LOAD(AtomicSDNode *N) {
+  SmallVector<SDValue, 16> LdChain; // Chain for the series of load
+  SDValue Result = GenWidenVectorLoads(LdChain, N, true /*IsAtomic*/);
+
+  if (Result) {
+    // If we generate a single load, we can use that for the chain.  Otherwise,
+    // build a factor node to remember the multiple loads are independent and
+    // chain to that.
+    SDValue NewChain;
+    if (LdChain.size() == 1)
+      NewChain = LdChain[0];
+    else
+      NewChain = DAG.getNode(ISD::TokenFactor, SDLoc(N), MVT::Other, LdChain);
+
+    // Modified the chain - switch anything that used the old chain to use
+    // the new one.
+    ReplaceValueWith(SDValue(N, 1), NewChain);
+
+    return Result;
+  }
+
+  report_fatal_error("Unable to widen atomic vector load");
+}
+
 SDValue DAGTypeLegalizer::WidenVecRes_LOAD(SDNode *N) {
   LoadSDNode *LD = cast<LoadSDNode>(N);
   ISD::LoadExtType ExtType = LD->getExtensionType();
@@ -7699,8 +7726,9 @@ static SDValue BuildVectorFromScalar(SelectionDAG& DAG, EVT VecTy,
   return DAG.getNode(ISD::BITCAST, dl, VecTy, VecOp);
 }
 
+template <typename T>
 SDValue DAGTypeLegalizer::GenWidenVectorLoads(SmallVectorImpl<SDValue> &LdChain,
-                                              LoadSDNode *LD) {
+                                              T *LD, bool IsAtomic) {
   // The strategy assumes that we can efficiently load power-of-two widths.
   // The routine chops the vector into the largest vector loads with the same
   // element type or scalar loads and then recombines it to the widen vector
@@ -7757,8 +7785,13 @@ SDValue DAGTypeLegalizer::GenWidenVectorLoads(SmallVectorImpl<SDValue> &LdChain,
     } while (TypeSize::isKnownGT(RemainingWidth, NewVTWidth));
   }
 
-  SDValue LdOp = DAG.getLoad(*FirstVT, dl, Chain, BasePtr, LD->getPointerInfo(),
-                             LD->getOriginalAlign(), MMOFlags, AAInfo);
+  SDValue LdOp;
+  if (IsAtomic)
+    LdOp = DAG.getAtomic(ISD::ATOMIC_LOAD, dl, *FirstVT, *FirstVT, Chain,
+        BasePtr, LD->getMemOperand());
+  else
+    LdOp = DAG.getLoad(*FirstVT, dl, Chain, BasePtr, LD->getPointerInfo(),
+        LD->getOriginalAlign(), MMOFlags, AAInfo);
   LdChain.push_back(LdOp.getValue(1));
 
   // Check if we can load the element with one instruction.
diff --git a/llvm/test/CodeGen/X86/atomic-load-store.ll b/llvm/test/CodeGen/X86/atomic-load-store.ll
index a57ae767859b10..f3cc39dba95253 100644
--- a/llvm/test/CodeGen/X86/atomic-load-store.ll
+++ b/llvm/test/CodeGen/X86/atomic-load-store.ll
@@ -146,6 +146,16 @@ define <1 x i64> @atomic_vec1_i64_align(ptr %x) nounwind {
   ret <1 x i64> %ret
 }
 
+define <2 x i32> @atomic_vec2_i32_align(ptr %x) nounwind {
+; CHECK-LABEL: atomic_vec2_i32_align:
+; CHECK:       ## %bb.0:
+; CHECK-NEXT:    movq (%rdi), %rax
+; CHECK-NEXT:    movq %rax, %xmm0
+; CHECK-NEXT:    retq
+  %ret = load atomic <2 x i32>, ptr %x acquire, align 8
+  ret <2 x i32> %ret
+}
+
 define <1 x ptr> @atomic_vec1_ptr(ptr %x) nounwind {
 ; CHECK3-LABEL: atomic_vec1_ptr:
 ; CHECK3:       ## %bb.0:

Copy link

github-actions bot commented Dec 19, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@jofrn jofrn force-pushed the users/jofrn/spr/main/a06a5cc6 branch from 37e162b to f7691a8 Compare December 19, 2024 16:42
@jofrn jofrn force-pushed the users/jofrn/spr/main/2894ccd1 branch 2 times, most recently from cbaaeeb to d558091 Compare December 19, 2024 19:15
@jofrn jofrn force-pushed the users/jofrn/spr/main/a06a5cc6 branch 2 times, most recently from 34492d9 to 6952507 Compare December 19, 2024 19:36
@jofrn jofrn force-pushed the users/jofrn/spr/main/2894ccd1 branch from d558091 to 3f37dfc Compare December 19, 2024 19:36
@jofrn jofrn force-pushed the users/jofrn/spr/main/a06a5cc6 branch from 6952507 to 3cf21cf Compare December 19, 2024 19:43
@jofrn jofrn force-pushed the users/jofrn/spr/main/2894ccd1 branch from 3f37dfc to 89503f2 Compare December 19, 2024 19:43
@jofrn jofrn force-pushed the users/jofrn/spr/main/a06a5cc6 branch from 3cf21cf to a769a32 Compare December 19, 2024 21:28
@jofrn jofrn force-pushed the users/jofrn/spr/main/2894ccd1 branch from 89503f2 to b2f0b33 Compare December 19, 2024 21:28
@jofrn jofrn force-pushed the users/jofrn/spr/main/2894ccd1 branch from b2f0b33 to 6737dda Compare December 19, 2024 21:33
@jofrn jofrn force-pushed the users/jofrn/spr/main/a06a5cc6 branch 2 times, most recently from 464bb05 to ebba505 Compare December 19, 2024 21:59
@jofrn jofrn force-pushed the users/jofrn/spr/main/2894ccd1 branch from 6737dda to 2949391 Compare December 19, 2024 21:59
@jofrn jofrn force-pushed the users/jofrn/spr/main/a06a5cc6 branch from ebba505 to 5bbc421 Compare December 20, 2024 11:25
@jofrn jofrn force-pushed the users/jofrn/spr/main/2894ccd1 branch from 2949391 to 78adf01 Compare December 20, 2024 11:25
@jofrn jofrn force-pushed the users/jofrn/spr/main/a06a5cc6 branch from 5bbc421 to 746725f Compare December 20, 2024 11:39
@jofrn jofrn force-pushed the users/jofrn/spr/main/2894ccd1 branch 2 times, most recently from a8cc74f to 23c9ff2 Compare December 20, 2024 11:52
@jofrn jofrn force-pushed the users/jofrn/spr/main/a06a5cc6 branch from 746725f to 592fa22 Compare December 20, 2024 11:52
Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description says it's invalid, but it is now valid

Comment on lines 170 to 171
; CHECK-NEXT: movl (%rdi), %eax
; CHECK-NEXT: movd %eax, %xmm0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is broken, this should emit one atomic load

@jofrn jofrn changed the title [SelectionDAG] Widen <n x T> vector types for atomic load [SelectionDAG][X86] Widen <2 x T> vector types for atomic load Jan 2, 2025
@jofrn jofrn force-pushed the users/jofrn/spr/main/2894ccd1 branch from 23c9ff2 to 2c51f72 Compare January 2, 2025 19:21
@jofrn jofrn force-pushed the users/jofrn/spr/main/a06a5cc6 branch 2 times, most recently from 64e813b to e1008c7 Compare January 2, 2025 20:45
@jofrn jofrn force-pushed the users/jofrn/spr/main/2894ccd1 branch from 2c51f72 to 3a82883 Compare January 2, 2025 20:45
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp Outdated Show resolved Hide resolved
Comment on lines +5916 to +5925
if (LdChain.size() == 1)
NewChain = LdChain[0];
else
NewChain = DAG.getNode(ISD::TokenFactor, SDLoc(N), MVT::Other, LdChain);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think getNode will handle the 1 entry case for you

@@ -52242,6 +52243,81 @@ static SDValue combineConstantPoolLoads(SDNode *N, const SDLoc &dl,
return SDValue();
}

static MVT getScalarTypeFromVectorType(MVT VT) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just getIntegerVT(DAG.getContext(), VT.getSizeInBits())

llvm_unreachable("Invalid VT for scalar type translation");
}

static SDValue combineAtomicLoad(SDNode *N, SelectionDAG &DAG,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think any of this custom handling in the target should be necessary

; CHECK-NEXT: retq
%ret = load atomic <2 x i16>, ptr %x acquire, align 4
ret <2 x i16> %ret
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test 2 x half, 2 x bfloat, 4 x 16-bit elts

llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h Outdated Show resolved Hide resolved
@jofrn jofrn force-pushed the users/jofrn/spr/main/2894ccd1 branch from 3a82883 to 36161df Compare January 6, 2025 19:25
@jofrn jofrn force-pushed the users/jofrn/spr/main/a06a5cc6 branch from e1008c7 to 51cb67f Compare January 6, 2025 19:25
@jofrn jofrn force-pushed the users/jofrn/spr/main/2894ccd1 branch from 36161df to f0d0f17 Compare January 6, 2025 20:24
@jofrn jofrn force-pushed the users/jofrn/spr/main/a06a5cc6 branch 2 times, most recently from 41ef993 to d528b89 Compare January 7, 2025 15:31
@jofrn jofrn force-pushed the users/jofrn/spr/main/2894ccd1 branch from f0d0f17 to a96fdf1 Compare January 7, 2025 15:31
@jofrn jofrn force-pushed the users/jofrn/spr/main/a06a5cc6 branch from d528b89 to 84a1672 Compare January 15, 2025 11:52
@jofrn jofrn force-pushed the users/jofrn/spr/main/2894ccd1 branch from a96fdf1 to e07e225 Compare January 15, 2025 11:52
Vector types of 2 elements must be widened. This change does this
for vector types of atomic load in SelectionDAG
so that it can translate aligned vectors of >1 size. Also,
it also adds Pats to remove an extra MOV.

commit-id:2894ccd1
@jofrn jofrn force-pushed the users/jofrn/spr/main/2894ccd1 branch from e07e225 to 04f0fd9 Compare January 15, 2025 13:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 llvm:SelectionDAG SelectionDAGISel as well
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants