[X86] Manage atomic load of fp -> int promotion in DAG #118793
Open
jofrn wants to merge 10 commits into llvm:main from jofrn:atomic-scalarization.fp-x86
Commits (10), all by jofrn:
9b0a33b [Verifier] Allow vector type in atomic load and store
aba2a03 Update Assembler/atomic test
10b57a1 [SelectionDAG] Legalize vector types for atomic load
a95ff19 Moved test and checking mir after last X86 pass
d9d4a1c Autogenerate test.
78ec95b Renamed to atomic-vector
009b140 Add to preexisting test
f99ae71 Rename tests
eae4704 [SelectionDAG][X86] Add floating point promotion.
f04b405 Move vec1_{i32,float,half,bfloat} tests.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,12 +1,17 @@ | ||
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py | ||
; RUN: llc < %s -mtriple=x86_64-apple-macosx10.7.0 -verify-machineinstrs | FileCheck %s | ||
; RUN: llc < %s -mtriple=x86_64-apple-macosx10.7.0 -verify-machineinstrs -O0 | FileCheck %s | ||
; RUN: llc < %s -mtriple=x86_64-apple-macosx10.7.0 -verify-machineinstrs -O0 | FileCheck %s --check-prefix=CHECK0 | ||
Reduce check duplication:
|
||
|
||
define void @test1(ptr %ptr, i32 %val1) { | ||
; CHECK-LABEL: test1: | ||
; CHECK: ## %bb.0: | ||
; CHECK-NEXT: xchgl %esi, (%rdi) | ||
; CHECK-NEXT: retq | ||
; | ||
; CHECK0-LABEL: test1: | ||
; CHECK0: ## %bb.0: | ||
; CHECK0-NEXT: xchgl %esi, (%rdi) | ||
; CHECK0-NEXT: retq | ||
store atomic i32 %val1, ptr %ptr seq_cst, align 4 | ||
ret void | ||
} | ||
|
@@ -16,6 +21,11 @@ define void @test2(ptr %ptr, i32 %val1) { | |
; CHECK: ## %bb.0: | ||
; CHECK-NEXT: movl %esi, (%rdi) | ||
; CHECK-NEXT: retq | ||
; | ||
; CHECK0-LABEL: test2: | ||
; CHECK0: ## %bb.0: | ||
; CHECK0-NEXT: movl %esi, (%rdi) | ||
; CHECK0-NEXT: retq | ||
store atomic i32 %val1, ptr %ptr release, align 4 | ||
ret void | ||
} | ||
|
@@ -25,6 +35,78 @@ define i32 @test3(ptr %ptr) { | |
; CHECK: ## %bb.0: | ||
; CHECK-NEXT: movl (%rdi), %eax | ||
; CHECK-NEXT: retq | ||
; | ||
; CHECK0-LABEL: test3: | ||
; CHECK0: ## %bb.0: | ||
; CHECK0-NEXT: movl (%rdi), %eax | ||
; CHECK0-NEXT: retq | ||
%val = load atomic i32, ptr %ptr seq_cst, align 4 | ||
ret i32 %val | ||
} | ||
|
||
define <1 x i32> @atomic_vec1_i32(ptr %x) { | ||
; CHECK-LABEL: atomic_vec1_i32: | ||
; CHECK: ## %bb.0: | ||
; CHECK-NEXT: movl (%rdi), %eax | ||
; CHECK-NEXT: retq | ||
; | ||
; CHECK0-LABEL: atomic_vec1_i32: | ||
; CHECK0: ## %bb.0: | ||
; CHECK0-NEXT: movl (%rdi), %eax | ||
; CHECK0-NEXT: retq | ||
%ret = load atomic <1 x i32>, ptr %x acquire, align 4 | ||
ret <1 x i32> %ret | ||
} | ||
|
||
define <1 x half> @atomic_vec1_half(ptr %x) { | ||
; CHECK-LABEL: atomic_vec1_half: | ||
; CHECK: ## %bb.0: | ||
; CHECK-NEXT: movzwl (%rdi), %eax | ||
; CHECK-NEXT: pinsrw $0, %eax, %xmm0 | ||
; CHECK-NEXT: retq | ||
; | ||
; CHECK0-LABEL: atomic_vec1_half: | ||
; CHECK0: ## %bb.0: | ||
; CHECK0-NEXT: movw (%rdi), %cx | ||
; CHECK0-NEXT: ## implicit-def: $eax | ||
; CHECK0-NEXT: movw %cx, %ax | ||
; CHECK0-NEXT: ## implicit-def: $xmm0 | ||
; CHECK0-NEXT: pinsrw $0, %eax, %xmm0 | ||
; CHECK0-NEXT: retq | ||
%ret = load atomic <1 x half>, ptr %x acquire, align 4 | ||
ret <1 x half> %ret | ||
} | ||
|
||
define <1 x float> @atomic_vec1_float(ptr %x) { | ||
; CHECK-LABEL: atomic_vec1_float: | ||
; CHECK: ## %bb.0: | ||
; CHECK-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero | ||
; CHECK-NEXT: retq | ||
; | ||
; CHECK0-LABEL: atomic_vec1_float: | ||
; CHECK0: ## %bb.0: | ||
; CHECK0-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero | ||
; CHECK0-NEXT: retq | ||
%ret = load atomic <1 x float>, ptr %x acquire, align 4 | ||
ret <1 x float> %ret | ||
} | ||
|
||
define <1 x bfloat> @atomic_vec1_bfloat(ptr %x) { | ||
; CHECK-LABEL: atomic_vec1_bfloat: | ||
; CHECK: ## %bb.0: | ||
; CHECK-NEXT: movzwl (%rdi), %eax | ||
; CHECK-NEXT: pinsrw $0, %eax, %xmm0 | ||
; CHECK-NEXT: retq | ||
; | ||
; CHECK0-LABEL: atomic_vec1_bfloat: | ||
; CHECK0: ## %bb.0: | ||
; CHECK0-NEXT: movw (%rdi), %cx | ||
; CHECK0-NEXT: ## implicit-def: $eax | ||
; CHECK0-NEXT: movw %cx, %ax | ||
; CHECK0-NEXT: ## implicit-def: $xmm0 | ||
; CHECK0-NEXT: pinsrw $0, %eax, %xmm0 | ||
; CHECK0-NEXT: retq | ||
%ret = load atomic <1 x bfloat>, ptr %x acquire, align 4 | ||
ret <1 x bfloat> %ret | ||
} | ||
|
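The inline note on the RUN lines ("Reduce check duplication") appears here without its suggested text. One common way to reduce duplication with FileCheck is to give both runs a shared prefix plus a per-run prefix, so output that is identical at both optimization levels is checked once. The sketch below is an assumption about what such a change might look like, not the reviewer's actual suggestion; the `CHECK3` prefix name is hypothetical.

```llvm
; Sketch only: both runs share CHECK, and each run has its own extra prefix.
; RUN: llc < %s -mtriple=x86_64-apple-macosx10.7.0 -verify-machineinstrs | FileCheck %s --check-prefixes=CHECK,CHECK3
; RUN: llc < %s -mtriple=x86_64-apple-macosx10.7.0 -verify-machineinstrs -O0 | FileCheck %s --check-prefixes=CHECK,CHECK0

define void @test1(ptr %ptr, i32 %val1) {
; Output is the same for both runs, so a single CHECK block covers them.
; CHECK-LABEL: test1:
; CHECK:       ## %bb.0:
; CHECK-NEXT:    xchgl %esi, (%rdi)
; CHECK-NEXT:    retq
  store atomic i32 %val1, ptr %ptr seq_cst, align 4
  ret void
}
```

With prefixes arranged this way, regenerating the assertions with utils/update_llc_test_checks.py should emit the per-run prefixes only for functions whose output differs between the two runs, such as the `half` and `bfloat` cases.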
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,14 +1,15 @@ | ||
; RUN: not opt -passes=verify < %s 2>&1 | FileCheck %s | ||
; CHECK: atomic store operand must have integer, pointer, floating point, or vector type! | ||
; CHECK: atomic load operand must have integer, pointer, floating point, or vector type! | ||
|
||
; CHECK: atomic store operand must have integer, pointer, or floating point type! | ||
; CHECK: atomic load operand must have integer, pointer, or floating point type! | ||
%ty = type { i32 }; | ||
|
||
define void @foo(ptr %P, <1 x i64> %v) { | ||
store atomic <1 x i64> %v, ptr %P unordered, align 8 | ||
define void @foo(ptr %P, %ty %v) { | ||
store atomic %ty %v, ptr %P unordered, align 8 | ||
ret void | ||
} | ||
|
||
define <1 x i64> @bar(ptr %P) { | ||
%v = load atomic <1 x i64>, ptr %P unordered, align 8 | ||
ret <1 x i64> %v | ||
define %ty @bar(ptr %P) { | ||
%v = load atomic %ty, ptr %P unordered, align 8 | ||
ret %ty %v | ||
} |
This is unreachable unless you touch shouldCastAtomicLoadInIR since the default will coerce this in IR
It is reachable during DAG to DAG translation. After scalarization, we promote:
v1f32,ch = AtomicLoad<(load acquire (s32) from %ir.x)> t0, t2
f32,ch = AtomicLoad<(load acquire (s32) from %ir.x)> t0, t2 // scalarize
i32,ch = AtomicLoad<(load acquire (s32) from %ir.x)> t0, t2 // cast
Oh, so this patch really just fixes the 1 x FP vector case... that's a weird edge case. So the description is now differently inaccurate. I guess it's fine to split them, but follow up should make shouldCastAtomicLoadInIR return none
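For readers following the thread, a small IR sketch may help. It is an illustration under stated assumptions, not code from this patch. It assumes the default `shouldCastAtomicLoadInIR` behaviour described in the thread, where a scalar floating-point atomic load is coerced in IR to an integer load plus a bitcast, while a `<1 x float>` load is left alone because its type is a vector. The function names are hypothetical, and the second function is only accepted with the verifier change from this PR series.

```llvm
; Scalar case: roughly what the IR looks like after the default IR-level
; coercion of `load atomic float` (assumed behaviour, per the thread).
define float @scalar_fp_after_ir_cast(ptr %x) {
  %i = load atomic i32, ptr %x acquire, align 4
  %f = bitcast i32 %i to float
  ret float %f
}

; <1 x float> case: the load is not coerced in IR, so it reaches
; SelectionDAG as a v1f32 AtomicLoad. Type legalization then scalarizes it
; to f32 and promotes it to an i32 load, as in the DAG dump in the thread.
define <1 x float> @vec1_fp_reaches_dag(ptr %x) {
  %v = load atomic <1 x float>, ptr %x acquire, align 4
  ret <1 x float> %v
}
```

This is why the DAG-level promotion is reachable only for the single-element FP vector case today, and why the reviewer suggests a follow-up that stops the IR-level cast for scalars as well.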