Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[X86] Manage atomic load of fp -> int promotion in DAG #120386

Open
wants to merge 2 commits into
base: users/jofrn/spr/main/5c36cc8c
Choose a base branch
from

Conversation

jofrn
Copy link
Contributor

@jofrn jofrn commented Dec 18, 2024

@llvmbot
Copy link
Member

llvmbot commented Dec 18, 2024

@llvm/pr-subscribers-backend-x86

Author: None (jofrn)

Changes

When lowering atomic <1 x T> vector types with floats, selection can fail since
this pattern is unsupported. To support this, floats can be casted to
an integer type of the same size.


Stack:

  • #120387
  • #120386 ⬅
  • #120385
  • #120384

⚠️ Part of a stack created by spr. Do not merge manually using the UI - doing so may have unexpected results.


Full diff: https://github.com/llvm/llvm-project/pull/120386.diff

2 Files Affected:

  • (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+4)
  • (modified) llvm/test/CodeGen/X86/atomic-load-store.ll (+74-1)
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 2571873dba8483..8006d32d077a65 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -2595,6 +2595,10 @@ X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
         setOperationAction(Op, MVT::f32, Promote);
   }
 
+  setOperationPromotedToType(ISD::ATOMIC_LOAD, MVT::f16, MVT::i16);
+  setOperationPromotedToType(ISD::ATOMIC_LOAD, MVT::f32, MVT::i32);
+  setOperationPromotedToType(ISD::ATOMIC_LOAD, MVT::f64, MVT::i64);
+
   // We have target-specific dag combine patterns for the following nodes:
   setTargetDAGCombine({ISD::VECTOR_SHUFFLE,
                        ISD::SCALAR_TO_VECTOR,
diff --git a/llvm/test/CodeGen/X86/atomic-load-store.ll b/llvm/test/CodeGen/X86/atomic-load-store.ll
index 9cac8167542d8b..2bde0d2ffd06ad 100644
--- a/llvm/test/CodeGen/X86/atomic-load-store.ll
+++ b/llvm/test/CodeGen/X86/atomic-load-store.ll
@@ -1,12 +1,17 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-apple-macosx10.7.0 -verify-machineinstrs | FileCheck %s
-; RUN: llc < %s -mtriple=x86_64-apple-macosx10.7.0 -verify-machineinstrs -O0 | FileCheck %s
+; RUN: llc < %s -mtriple=x86_64-apple-macosx10.7.0 -verify-machineinstrs -O0 | FileCheck %s --check-prefix=CHECK0
 
 define void @test1(ptr %ptr, i32 %val1) {
 ; CHECK-LABEL: test1:
 ; CHECK:       ## %bb.0:
 ; CHECK-NEXT:    xchgl %esi, (%rdi)
 ; CHECK-NEXT:    retq
+;
+; CHECK0-LABEL: test1:
+; CHECK0:       ## %bb.0:
+; CHECK0-NEXT:    xchgl %esi, (%rdi)
+; CHECK0-NEXT:    retq
   store atomic i32 %val1, ptr %ptr seq_cst, align 4
   ret void
 }
@@ -16,6 +21,11 @@ define void @test2(ptr %ptr, i32 %val1) {
 ; CHECK:       ## %bb.0:
 ; CHECK-NEXT:    movl %esi, (%rdi)
 ; CHECK-NEXT:    retq
+;
+; CHECK0-LABEL: test2:
+; CHECK0:       ## %bb.0:
+; CHECK0-NEXT:    movl %esi, (%rdi)
+; CHECK0-NEXT:    retq
   store atomic i32 %val1, ptr %ptr release, align 4
   ret void
 }
@@ -25,6 +35,11 @@ define i32 @test3(ptr %ptr) {
 ; CHECK:       ## %bb.0:
 ; CHECK-NEXT:    movl (%rdi), %eax
 ; CHECK-NEXT:    retq
+;
+; CHECK0-LABEL: test3:
+; CHECK0:       ## %bb.0:
+; CHECK0-NEXT:    movl (%rdi), %eax
+; CHECK0-NEXT:    retq
   %val = load atomic i32, ptr %ptr seq_cst, align 4
   ret i32 %val
 }
@@ -34,6 +49,64 @@ define <1 x i32> @atomic_vec1_i32(ptr %x) {
 ; CHECK:       ## %bb.0:
 ; CHECK-NEXT:    movl (%rdi), %eax
 ; CHECK-NEXT:    retq
+;
+; CHECK0-LABEL: atomic_vec1_i32:
+; CHECK0:       ## %bb.0:
+; CHECK0-NEXT:    movl (%rdi), %eax
+; CHECK0-NEXT:    retq
   %ret = load atomic <1 x i32>, ptr %x acquire, align 4
   ret <1 x i32> %ret
 }
+
+define <1 x half> @atomic_vec1_half(ptr %x) {
+; CHECK-LABEL: atomic_vec1_half:
+; CHECK:       ## %bb.0:
+; CHECK-NEXT:    movzwl (%rdi), %eax
+; CHECK-NEXT:    pinsrw $0, %eax, %xmm0
+; CHECK-NEXT:    retq
+;
+; CHECK0-LABEL: atomic_vec1_half:
+; CHECK0:       ## %bb.0:
+; CHECK0-NEXT:    movw (%rdi), %cx
+; CHECK0-NEXT:    ## implicit-def: $eax
+; CHECK0-NEXT:    movw %cx, %ax
+; CHECK0-NEXT:    ## implicit-def: $xmm0
+; CHECK0-NEXT:    pinsrw $0, %eax, %xmm0
+; CHECK0-NEXT:    retq
+  %ret = load atomic <1 x half>, ptr %x acquire, align 4
+  ret <1 x half> %ret
+}
+
+define <1 x float> @atomic_vec1_float(ptr %x) {
+; CHECK-LABEL: atomic_vec1_float:
+; CHECK:       ## %bb.0:
+; CHECK-NEXT:    movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; CHECK-NEXT:    retq
+;
+; CHECK0-LABEL: atomic_vec1_float:
+; CHECK0:       ## %bb.0:
+; CHECK0-NEXT:    movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; CHECK0-NEXT:    retq
+  %ret = load atomic <1 x float>, ptr %x acquire, align 4
+  ret <1 x float> %ret
+}
+
+define <1 x bfloat> @atomic_vec1_bfloat(ptr %x) {
+; CHECK-LABEL: atomic_vec1_bfloat:
+; CHECK:       ## %bb.0:
+; CHECK-NEXT:    movzwl (%rdi), %eax
+; CHECK-NEXT:    pinsrw $0, %eax, %xmm0
+; CHECK-NEXT:    retq
+;
+; CHECK0-LABEL: atomic_vec1_bfloat:
+; CHECK0:       ## %bb.0:
+; CHECK0-NEXT:    movw (%rdi), %cx
+; CHECK0-NEXT:    ## implicit-def: $eax
+; CHECK0-NEXT:    movw %cx, %ax
+; CHECK0-NEXT:    ## implicit-def: $xmm0
+; CHECK0-NEXT:    pinsrw $0, %eax, %xmm0
+; CHECK0-NEXT:    retq
+  %ret = load atomic <1 x bfloat>, ptr %x acquire, align 4
+  ret <1 x bfloat> %ret
+}
+

@jofrn jofrn force-pushed the users/jofrn/spr/main/f9d761c5 branch from f799ee0 to 141279f Compare December 18, 2024 08:54
@jofrn jofrn changed the title [SelectionDAG][X86] Add floating point promotion. [X86] Manage atomic load of fp -> int promotion in DAG Dec 18, 2024
@jofrn jofrn force-pushed the users/jofrn/spr/main/f9d761c5 branch from 141279f to 70bb5b9 Compare December 18, 2024 11:45
@jofrn jofrn force-pushed the users/jofrn/spr/main/5c36cc8c branch from 7263545 to 5a3a12d Compare December 18, 2024 19:11
@jofrn jofrn force-pushed the users/jofrn/spr/main/f9d761c5 branch 2 times, most recently from dac7f1e to df5e28c Compare December 18, 2024 20:47
@@ -2595,6 +2595,10 @@ X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
setOperationAction(Op, MVT::f32, Promote);
}

setOperationPromotedToType(ISD::ATOMIC_LOAD, MVT::f16, MVT::i16);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably similar changes to other backends are also required?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Handle bf16 as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bf16 is already lowered properly without promotion.

And yes, other backends would either have to promote these here or implement them explicitly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SelectionDAG making anything legal by default was a terrible mistake but we're not going to fix that here

@jofrn jofrn force-pushed the users/jofrn/spr/main/5c36cc8c branch from 66eca4b to 4d3fcb3 Compare December 19, 2024 02:29
@jofrn jofrn force-pushed the users/jofrn/spr/main/f9d761c5 branch 4 times, most recently from b336c25 to 7ef2576 Compare December 19, 2024 16:01
@jofrn jofrn force-pushed the users/jofrn/spr/main/5c36cc8c branch from 4d0be71 to 601c009 Compare December 19, 2024 16:01
@jofrn jofrn force-pushed the users/jofrn/spr/main/f9d761c5 branch from 7ef2576 to b4f0562 Compare December 19, 2024 16:24
@jofrn jofrn force-pushed the users/jofrn/spr/main/5c36cc8c branch from 601c009 to 3796cf7 Compare December 19, 2024 16:24
@jofrn jofrn force-pushed the users/jofrn/spr/main/5c36cc8c branch from 3796cf7 to 5f30edf Compare December 19, 2024 16:42
@jofrn jofrn force-pushed the users/jofrn/spr/main/f9d761c5 branch from b4f0562 to e0a02b6 Compare December 19, 2024 16:42
@jofrn jofrn force-pushed the users/jofrn/spr/main/5c36cc8c branch from 5f30edf to 99296f3 Compare December 19, 2024 19:15
@jofrn jofrn force-pushed the users/jofrn/spr/main/f9d761c5 branch 2 times, most recently from 40392eb to 7ca91cc Compare December 19, 2024 19:36
@jofrn jofrn force-pushed the users/jofrn/spr/main/5c36cc8c branch from 99296f3 to 245acf7 Compare December 19, 2024 19:36
@jofrn jofrn force-pushed the users/jofrn/spr/main/f9d761c5 branch from 7ca91cc to 0d6882e Compare December 19, 2024 19:43
@jofrn jofrn force-pushed the users/jofrn/spr/main/5c36cc8c branch 2 times, most recently from a3e83cb to faa0e03 Compare December 19, 2024 21:28
@jofrn jofrn force-pushed the users/jofrn/spr/main/5c36cc8c branch from 35b6ddc to 11174c5 Compare January 15, 2025 11:52
@jofrn jofrn force-pushed the users/jofrn/spr/main/f9d761c5 branch 2 times, most recently from b2541fd to 846ab2e Compare January 21, 2025 16:58
@jofrn jofrn force-pushed the users/jofrn/spr/main/5c36cc8c branch 2 times, most recently from 11b950d to ff979b4 Compare January 21, 2025 17:13
@jofrn jofrn force-pushed the users/jofrn/spr/main/f9d761c5 branch from 846ab2e to 4698589 Compare January 21, 2025 17:13
@jofrn jofrn force-pushed the users/jofrn/spr/main/5c36cc8c branch from ff979b4 to e3d8ddd Compare January 21, 2025 17:44
@jofrn jofrn force-pushed the users/jofrn/spr/main/f9d761c5 branch from 4698589 to e261f40 Compare January 21, 2025 17:44
@jofrn jofrn force-pushed the users/jofrn/spr/main/5c36cc8c branch from e3d8ddd to 02d1d2d Compare January 21, 2025 17:50
@jofrn jofrn force-pushed the users/jofrn/spr/main/f9d761c5 branch from e261f40 to 6f50ac7 Compare January 21, 2025 17:50
Comment on lines 2642 to 2644
setOperationPromotedToType(ISD::ATOMIC_LOAD, MVT::f16, MVT::i16);
setOperationPromotedToType(ISD::ATOMIC_LOAD, MVT::f32, MVT::i32);
setOperationPromotedToType(ISD::ATOMIC_LOAD, MVT::f64, MVT::i64);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't do anything without also overriding shouldCastAtomicLoadInIR. AtomicExpand will cast these, and these FP typed ATOMIC_LOADS will never form

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These will still form in from v1f16, v1f32, and v1f64, which don't get translated in AtomicExpand. It is reachable during DAG to DAG translation. After scalarization, we promote:

  1. v1f32,ch = AtomicLoad<(load acquire (s32) from %ir.x)> t0, t2
  2. f32,ch = AtomicLoad<(load acquire (s32) from %ir.x)> t0, t2 // scalarize
  3. i32,ch = AtomicLoad<(load acquire (s32) from %ir.x)> t0, t2 // cast

You mentioned earlier that a follow-up should have AtomicExpand's shouldCastAtomicLoadInIR return none for these; this doesn't affect these cases here though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, I forgot about this ugly detail

@jofrn jofrn force-pushed the users/jofrn/spr/main/f9d761c5 branch from 6f50ac7 to 8137157 Compare January 22, 2025 11:05
@jofrn jofrn force-pushed the users/jofrn/spr/main/5c36cc8c branch 2 times, most recently from 8d438cf to c7f5fea Compare January 22, 2025 11:43
@jofrn jofrn force-pushed the users/jofrn/spr/main/f9d761c5 branch from 8137157 to a495eaa Compare January 22, 2025 11:43
@jofrn jofrn force-pushed the users/jofrn/spr/main/5c36cc8c branch from c7f5fea to 9be2b4d Compare January 22, 2025 17:38
@jofrn jofrn force-pushed the users/jofrn/spr/main/f9d761c5 branch from a495eaa to ab42ec1 Compare January 22, 2025 17:38
@jofrn jofrn force-pushed the users/jofrn/spr/main/5c36cc8c branch from 9be2b4d to a1143a0 Compare January 22, 2025 17:47
@jofrn jofrn force-pushed the users/jofrn/spr/main/f9d761c5 branch 2 times, most recently from 6ac3c17 to 03a726d Compare January 22, 2025 18:19
@jofrn jofrn force-pushed the users/jofrn/spr/main/5c36cc8c branch from a1143a0 to 2a1b149 Compare January 22, 2025 18:19
@jofrn jofrn changed the base branch from users/jofrn/spr/main/5c36cc8c to main February 2, 2025 20:25
@jofrn jofrn force-pushed the users/jofrn/spr/main/f9d761c5 branch from 03a726d to f096c88 Compare February 2, 2025 20:25
@jofrn jofrn changed the base branch from main to users/jofrn/spr/main/5c36cc8c February 2, 2025 20:25
When lowering atomic <1 x T> vector types with floats, selection can fail since
this pattern is unsupported. To support this, floats can be casted to
an integer type of the same size.

commit-id:f9d761c5
@jofrn jofrn force-pushed the users/jofrn/spr/main/f9d761c5 branch from f096c88 to edd2af8 Compare March 3, 2025 23:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants