-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SelectionDAG] Legalize <1 x T> vector types for atomic load #111414
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-llvm-selectiondag Author: None (jofrn) ChangesScalarize vector of atomic load in SelectionDAG. Full diff: https://github.com/llvm/llvm-project/pull/111414.diff 3 Files Affected:
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h b/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
index d14516ef3e2fbb..5204f20e97f63e 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
@@ -850,6 +850,7 @@ class LLVM_LIBRARY_VISIBILITY DAGTypeLegalizer {
SDValue ScalarizeVecRes_ExpOp(SDNode *N);
SDValue ScalarizeVecRes_INSERT_VECTOR_ELT(SDNode *N);
SDValue ScalarizeVecRes_LOAD(LoadSDNode *N);
+ SDValue ScalarizeVecRes_ATOMIC_LOAD(AtomicSDNode *N);
SDValue ScalarizeVecRes_SCALAR_TO_VECTOR(SDNode *N);
SDValue ScalarizeVecRes_VSELECT(SDNode *N);
SDValue ScalarizeVecRes_SELECT(SDNode *N);
diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
index 0a22f06271984e..1e911da130f8c3 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
@@ -60,6 +60,9 @@ void DAGTypeLegalizer::ScalarizeVectorResult(SDNode *N, unsigned ResNo) {
case ISD::FP_ROUND: R = ScalarizeVecRes_FP_ROUND(N); break;
case ISD::FPOWI: R = ScalarizeVecRes_ExpOp(N); break;
case ISD::INSERT_VECTOR_ELT: R = ScalarizeVecRes_INSERT_VECTOR_ELT(N); break;
+ case ISD::ATOMIC_LOAD:
+ R = ScalarizeVecRes_ATOMIC_LOAD(cast<AtomicSDNode>(N));
+ break;
case ISD::LOAD: R = ScalarizeVecRes_LOAD(cast<LoadSDNode>(N));break;
case ISD::SCALAR_TO_VECTOR: R = ScalarizeVecRes_SCALAR_TO_VECTOR(N); break;
case ISD::SIGN_EXTEND_INREG: R = ScalarizeVecRes_InregOp(N); break;
@@ -447,6 +450,19 @@ SDValue DAGTypeLegalizer::ScalarizeVecRes_INSERT_VECTOR_ELT(SDNode *N) {
return Op;
}
+SDValue DAGTypeLegalizer::ScalarizeVecRes_ATOMIC_LOAD(AtomicSDNode *N) {
+
+ SDValue Result = DAG.getAtomic(ISD::ATOMIC_LOAD, SDLoc(N),
+ N->getMemoryVT().getVectorElementType(),
+ N->getValueType(0).getVectorElementType(),
+ N->getChain(), N->getBasePtr(), N->getMemOperand());
+
+ // Legalize the chain result - switch anything that used the old chain to
+ // use the new one.
+ ReplaceValueWith(SDValue(N, 1), Result.getValue(1));
+ return Result;
+}
+
SDValue DAGTypeLegalizer::ScalarizeVecRes_LOAD(LoadSDNode *N) {
assert(N->isUnindexed() && "Indexed vector load?");
diff --git a/llvm/test/CodeGen/Generic/atomic-scalarization.ll b/llvm/test/CodeGen/Generic/atomic-scalarization.ll
new file mode 100644
index 00000000000000..3f611323172884
--- /dev/null
+++ b/llvm/test/CodeGen/Generic/atomic-scalarization.ll
@@ -0,0 +1,17 @@
+; RUN: llc %s --print-after-isel --disable-verify 2>&1 | FileCheck %s
+
+define i32 @atomic_scalar() {
+; CHECK: # After Instruction Selection:
+; CHECK-NEXT: # Machine code for function atomic_scalar: IsSSA, TracksLiveness
+; CHECK-NEXT: Frame Objects:
+; CHECK-NEXT: fi#0: size=4, align=4, at location [SP+8]
+; CHECK: bb.0 (%ir-block.0):
+; CHECK-NEXT: %0:gr32 = MOV32rm %stack.0, 1, $noreg, 0, $noreg :: (dereferenceable load acquire (s32) from %ir.1)
+; CHECK-NEXT: $eax = COPY %0:gr32
+; CHECK-NEXT: RET 0, $eax
+; CHECK: # End machine code for function atomic_scalar.
+ %1 = alloca <1 x i32>
+ %2 = load atomic <1 x i32>, ptr %1 acquire, align 4
+ %3 = extractelement <1 x i32> %2, i32 0
+ ret i32 %3
+}
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
0c84583
to
c154ab8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do stores also need to be handled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No disable-verify
Vector types on atomics are assumed to be invalid by the verifier. However, this type can be valid if it is lowered by codegen.
Scalarize vector of atomic load in SelectionDAG.
edf1c95
to
9ceb765
Compare
| Do stores also need to be handled? |
9ceb765
to
1de03a8
Compare
1de03a8
to
1fb4f56
Compare
1fb4f56
to
d9d4a1c
Compare
They do, but it can be a separate pr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
codegen parts lgtm, but should avoid manually stacking PRs (which really doesn't work). It's much harder to review, if you want to stack PRs use graphite or spr
8ebf2f9
to
78ec95b
Compare
@@ -1,12 +1,18 @@ | |||
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py | |||
; RUN: llc < %s -mtriple=x86_64-apple-macosx10.7.0 -verify-machineinstrs | FileCheck %s | |||
; RUN: llc < %s -mtriple=x86_64-apple-macosx10.7.0 -verify-machineinstrs -O0 | FileCheck %s | |||
; RUN: llc %s --mtriple=x86_64 -o - | FileCheck %s --check-prefix=CHECK0 | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like a good preexisting test to add the <1 x n> cases to. It has a specific triple apple
, and when generalizing it, the difference in the output tests is not much (a #
), and so it duplicates quite a bit between x86_64
and apple
. However, I do notice that bfloat passes only when --mtriple=x86_64
, so we may want to add the additional run line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just copy the mtriple that's already here, the OS should not be significant for the purpose here. I'm surprised this also doesn't have a 32-bit run line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This patch will only cover 1 x vectors. The patch is fine as-is, but the message should change to reflect this. An additional step is required to handle any other vector size
Still only handles 1 x vectors |
; CHECK-NEXT: retq | ||
%ret = load atomic <1 x i32>, ptr %x acquire, align 4 | ||
ret <1 x i32> %ret | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add atomic load extension test coverage?
ret <1 x i64> %v | ||
define %ty @bar(ptr %P) { | ||
%v = load atomic %ty, ptr %P unordered, align 8 | ||
ret %ty %v | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please can you add <1 x i64> load/store test coverage some place
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
define <1 x i64> @atomic_vec1_i64(ptr %x) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also pointers (including the 32-bit addrspace on x86-64 case which is some high number)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DAG change LGTM on top of the additional tests in the other PR
load atomic <1 x T>
is not valid. This change legalizes vector types of atomic load via scalarization in SelectionDAG so that it can, for example, translate fromv1i32
toi32
.