-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[IR] Don't set strictfp on irrelevant calls #122735
base: main
Are you sure you want to change the base?
Conversation
RBuilder incorrectry assigned the attribute "strictfp" to any intrinsic function call in some cases inside strictfp functions. Accordinf to the documentation in https://llvm.org/docs/LangRef.html#function-attributes: ``` This attribute indicates that the function was called from a scope that requires strict floating-point semantics. LLVM will not attempt any optimizations that require assumptions about the floating-point rounding mode or that might alter the state of floating-point status flags that might otherwise be set or cleared by calling this function. ``` Intrinsic functions like `llvm.fabs` do not depend on rounding mode, they also never raise any FP exception, so these restrictions make no sense for them. If a function interacts with FP environment, it must be either a constrained intrinsic or an external function.
@llvm/pr-subscribers-llvm-support @llvm/pr-subscribers-backend-amdgpu Author: Serge Pavlov (spavloff) ChangesRBuilder incorrectry assigned the attribute "strictfp" to any intrinsic function call in some cases inside strictfp functions. Accordinf to the documentation in https://llvm.org/docs/LangRef.html#function-attributes:
Intrinsic functions like Patch is 289.17 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/122735.diff 16 Files Affected:
diff --git a/clang/test/CodeGen/SystemZ/strictfp_builtins.c b/clang/test/CodeGen/SystemZ/strictfp_builtins.c
index 8c8f1f4cabd742..b60fd932d31c3f 100644
--- a/clang/test/CodeGen/SystemZ/strictfp_builtins.c
+++ b/clang/test/CodeGen/SystemZ/strictfp_builtins.c
@@ -9,7 +9,7 @@
// CHECK-NEXT: [[F_ADDR:%.*]] = alloca float, align 4
// CHECK-NEXT: store float [[F:%.*]], ptr [[F_ADDR]], align 4
// CHECK-NEXT: [[TMP0:%.*]] = load float, ptr [[F_ADDR]], align 4
-// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f32(float [[TMP0]], i64 15) #[[ATTR2:[0-9]+]]
+// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f32(float [[TMP0]], i64 15)
// CHECK-NEXT: ret i32 [[TMP1]]
//
int test_isnan_float(float f) {
@@ -21,7 +21,7 @@ int test_isnan_float(float f) {
// CHECK-NEXT: [[D_ADDR:%.*]] = alloca double, align 8
// CHECK-NEXT: store double [[D:%.*]], ptr [[D_ADDR]], align 8
// CHECK-NEXT: [[TMP0:%.*]] = load double, ptr [[D_ADDR]], align 8
-// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f64(double [[TMP0]], i64 15) #[[ATTR2]]
+// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f64(double [[TMP0]], i64 15)
// CHECK-NEXT: ret i32 [[TMP1]]
//
int test_isnan_double(double d) {
@@ -34,7 +34,7 @@ int test_isnan_double(double d) {
// CHECK-NEXT: [[LD:%.*]] = load fp128, ptr [[TMP0:%.*]], align 8
// CHECK-NEXT: store fp128 [[LD]], ptr [[LD_ADDR]], align 8
// CHECK-NEXT: [[TMP1:%.*]] = load fp128, ptr [[LD_ADDR]], align 8
-// CHECK-NEXT: [[TMP2:%.*]] = call i32 @llvm.s390.tdc.f128(fp128 [[TMP1]], i64 15) #[[ATTR2]]
+// CHECK-NEXT: [[TMP2:%.*]] = call i32 @llvm.s390.tdc.f128(fp128 [[TMP1]], i64 15)
// CHECK-NEXT: ret i32 [[TMP2]]
//
int test_isnan_long_double(long double ld) {
@@ -46,7 +46,7 @@ int test_isnan_long_double(long double ld) {
// CHECK-NEXT: [[F_ADDR:%.*]] = alloca float, align 4
// CHECK-NEXT: store float [[F:%.*]], ptr [[F_ADDR]], align 4
// CHECK-NEXT: [[TMP0:%.*]] = load float, ptr [[F_ADDR]], align 4
-// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f32(float [[TMP0]], i64 48) #[[ATTR2]]
+// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f32(float [[TMP0]], i64 48)
// CHECK-NEXT: ret i32 [[TMP1]]
//
int test_isinf_float(float f) {
@@ -58,7 +58,7 @@ int test_isinf_float(float f) {
// CHECK-NEXT: [[D_ADDR:%.*]] = alloca double, align 8
// CHECK-NEXT: store double [[D:%.*]], ptr [[D_ADDR]], align 8
// CHECK-NEXT: [[TMP0:%.*]] = load double, ptr [[D_ADDR]], align 8
-// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f64(double [[TMP0]], i64 48) #[[ATTR2]]
+// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f64(double [[TMP0]], i64 48)
// CHECK-NEXT: ret i32 [[TMP1]]
//
int test_isinf_double(double d) {
@@ -71,7 +71,7 @@ int test_isinf_double(double d) {
// CHECK-NEXT: [[LD:%.*]] = load fp128, ptr [[TMP0:%.*]], align 8
// CHECK-NEXT: store fp128 [[LD]], ptr [[LD_ADDR]], align 8
// CHECK-NEXT: [[TMP1:%.*]] = load fp128, ptr [[LD_ADDR]], align 8
-// CHECK-NEXT: [[TMP2:%.*]] = call i32 @llvm.s390.tdc.f128(fp128 [[TMP1]], i64 48) #[[ATTR2]]
+// CHECK-NEXT: [[TMP2:%.*]] = call i32 @llvm.s390.tdc.f128(fp128 [[TMP1]], i64 48)
// CHECK-NEXT: ret i32 [[TMP2]]
//
int test_isinf_long_double(long double ld) {
@@ -83,7 +83,7 @@ int test_isinf_long_double(long double ld) {
// CHECK-NEXT: [[F_ADDR:%.*]] = alloca float, align 4
// CHECK-NEXT: store float [[F:%.*]], ptr [[F_ADDR]], align 4
// CHECK-NEXT: [[TMP0:%.*]] = load float, ptr [[F_ADDR]], align 4
-// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f32(float [[TMP0]], i64 4032) #[[ATTR2]]
+// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f32(float [[TMP0]], i64 4032)
// CHECK-NEXT: ret i32 [[TMP1]]
//
int test_isfinite_float(float f) {
@@ -95,7 +95,7 @@ int test_isfinite_float(float f) {
// CHECK-NEXT: [[D_ADDR:%.*]] = alloca double, align 8
// CHECK-NEXT: store double [[D:%.*]], ptr [[D_ADDR]], align 8
// CHECK-NEXT: [[TMP0:%.*]] = load double, ptr [[D_ADDR]], align 8
-// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f64(double [[TMP0]], i64 4032) #[[ATTR2]]
+// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f64(double [[TMP0]], i64 4032)
// CHECK-NEXT: ret i32 [[TMP1]]
//
int test_isfinite_double(double d) {
@@ -108,7 +108,7 @@ int test_isfinite_double(double d) {
// CHECK-NEXT: [[LD:%.*]] = load fp128, ptr [[TMP0:%.*]], align 8
// CHECK-NEXT: store fp128 [[LD]], ptr [[LD_ADDR]], align 8
// CHECK-NEXT: [[TMP1:%.*]] = load fp128, ptr [[LD_ADDR]], align 8
-// CHECK-NEXT: [[TMP2:%.*]] = call i32 @llvm.s390.tdc.f128(fp128 [[TMP1]], i64 4032) #[[ATTR2]]
+// CHECK-NEXT: [[TMP2:%.*]] = call i32 @llvm.s390.tdc.f128(fp128 [[TMP1]], i64 4032)
// CHECK-NEXT: ret i32 [[TMP2]]
//
int test_isfinite_long_double(long double ld) {
diff --git a/clang/test/CodeGen/X86/strictfp_builtins.c b/clang/test/CodeGen/X86/strictfp_builtins.c
index 43e4060bef259b..c4a22dc5ee90fd 100644
--- a/clang/test/CodeGen/X86/strictfp_builtins.c
+++ b/clang/test/CodeGen/X86/strictfp_builtins.c
@@ -27,7 +27,7 @@ void p(char *str, int x) {
// CHECK-NEXT: [[LD_ADDR:%.*]] = alloca x86_fp80, align 16
// CHECK-NEXT: store x86_fp80 [[LD:%.*]], ptr [[LD_ADDR]], align 16
// CHECK-NEXT: [[TMP0:%.*]] = load x86_fp80, ptr [[LD_ADDR]], align 16
-// CHECK-NEXT: [[TMP1:%.*]] = call i1 @llvm.is.fpclass.f80(x86_fp80 [[TMP0]], i32 516) #[[ATTR3]]
+// CHECK-NEXT: [[TMP1:%.*]] = call i1 @llvm.is.fpclass.f80(x86_fp80 [[TMP0]], i32 516)
// CHECK-NEXT: [[TMP2:%.*]] = zext i1 [[TMP1]] to i32
// CHECK-NEXT: call void @p(ptr noundef @.str.1, i32 noundef [[TMP2]]) #[[ATTR3]]
// CHECK-NEXT: ret void
@@ -43,7 +43,7 @@ void test_long_double_isinf(long double ld) {
// CHECK-NEXT: [[LD_ADDR:%.*]] = alloca x86_fp80, align 16
// CHECK-NEXT: store x86_fp80 [[LD:%.*]], ptr [[LD_ADDR]], align 16
// CHECK-NEXT: [[TMP0:%.*]] = load x86_fp80, ptr [[LD_ADDR]], align 16
-// CHECK-NEXT: [[TMP1:%.*]] = call i1 @llvm.is.fpclass.f80(x86_fp80 [[TMP0]], i32 504) #[[ATTR3]]
+// CHECK-NEXT: [[TMP1:%.*]] = call i1 @llvm.is.fpclass.f80(x86_fp80 [[TMP0]], i32 504)
// CHECK-NEXT: [[TMP2:%.*]] = zext i1 [[TMP1]] to i32
// CHECK-NEXT: call void @p(ptr noundef @.str.2, i32 noundef [[TMP2]]) #[[ATTR3]]
// CHECK-NEXT: ret void
@@ -59,7 +59,7 @@ void test_long_double_isfinite(long double ld) {
// CHECK-NEXT: [[LD_ADDR:%.*]] = alloca x86_fp80, align 16
// CHECK-NEXT: store x86_fp80 [[LD:%.*]], ptr [[LD_ADDR]], align 16
// CHECK-NEXT: [[TMP0:%.*]] = load x86_fp80, ptr [[LD_ADDR]], align 16
-// CHECK-NEXT: [[TMP1:%.*]] = call i1 @llvm.is.fpclass.f80(x86_fp80 [[TMP0]], i32 3) #[[ATTR3]]
+// CHECK-NEXT: [[TMP1:%.*]] = call i1 @llvm.is.fpclass.f80(x86_fp80 [[TMP0]], i32 3)
// CHECK-NEXT: [[TMP2:%.*]] = zext i1 [[TMP1]] to i32
// CHECK-NEXT: call void @p(ptr noundef @.str.3, i32 noundef [[TMP2]]) #[[ATTR3]]
// CHECK-NEXT: ret void
diff --git a/clang/test/CodeGen/cx-complex-range.c b/clang/test/CodeGen/cx-complex-range.c
index 88300041061aae..98735c4671ce52 100644
--- a/clang/test/CodeGen/cx-complex-range.c
+++ b/clang/test/CodeGen/cx-complex-range.c
@@ -1575,8 +1575,8 @@ _Complex float mulf(_Complex float a, _Complex float b) {
// X86WINPRMTD_STRICT-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8
// X86WINPRMTD_STRICT-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds nuw { double, double }, ptr [[B]], i32 0, i32 1
// X86WINPRMTD_STRICT-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8
-// X86WINPRMTD_STRICT-NEXT: [[TMP0:%.*]] = call double @llvm.fabs.f64(double [[B_REAL]]) #[[ATTR3]]
-// X86WINPRMTD_STRICT-NEXT: [[TMP1:%.*]] = call double @llvm.fabs.f64(double [[B_IMAG]]) #[[ATTR3]]
+// X86WINPRMTD_STRICT-NEXT: [[TMP0:%.*]] = call double @llvm.fabs.f64(double [[B_REAL]])
+// X86WINPRMTD_STRICT-NEXT: [[TMP1:%.*]] = call double @llvm.fabs.f64(double [[B_IMAG]])
// X86WINPRMTD_STRICT-NEXT: [[ABS_CMP:%.*]] = call i1 @llvm.experimental.constrained.fcmp.f64(double [[TMP0]], double [[TMP1]], metadata !"ugt", metadata !"fpexcept.strict") #[[ATTR3]]
// X86WINPRMTD_STRICT-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]]
// X86WINPRMTD_STRICT: abs_rhsr_greater_or_equal_abs_rhsi:
@@ -2658,8 +2658,8 @@ _Complex double muld(_Complex double a, _Complex double b) {
// X86WINPRMTD_STRICT-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8
// X86WINPRMTD_STRICT-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds nuw { double, double }, ptr [[B]], i32 0, i32 1
// X86WINPRMTD_STRICT-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8
-// X86WINPRMTD_STRICT-NEXT: [[TMP0:%.*]] = call double @llvm.fabs.f64(double [[B_REAL]]) #[[ATTR3]]
-// X86WINPRMTD_STRICT-NEXT: [[TMP1:%.*]] = call double @llvm.fabs.f64(double [[B_IMAG]]) #[[ATTR3]]
+// X86WINPRMTD_STRICT-NEXT: [[TMP0:%.*]] = call double @llvm.fabs.f64(double [[B_REAL]])
+// X86WINPRMTD_STRICT-NEXT: [[TMP1:%.*]] = call double @llvm.fabs.f64(double [[B_IMAG]])
// X86WINPRMTD_STRICT-NEXT: [[ABS_CMP:%.*]] = call i1 @llvm.experimental.constrained.fcmp.f64(double [[TMP0]], double [[TMP1]], metadata !"ugt", metadata !"fpexcept.strict") #[[ATTR3]]
// X86WINPRMTD_STRICT-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]]
// X86WINPRMTD_STRICT: abs_rhsr_greater_or_equal_abs_rhsi:
@@ -2713,8 +2713,8 @@ _Complex double muld(_Complex double a, _Complex double b) {
// PRMTD_STRICT-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16
// PRMTD_STRICT-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds nuw { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1
// PRMTD_STRICT-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16
-// PRMTD_STRICT-NEXT: [[TMP0:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[B_REAL]]) #[[ATTR4]]
-// PRMTD_STRICT-NEXT: [[TMP1:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[B_IMAG]]) #[[ATTR4]]
+// PRMTD_STRICT-NEXT: [[TMP0:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[B_REAL]])
+// PRMTD_STRICT-NEXT: [[TMP1:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[B_IMAG]])
// PRMTD_STRICT-NEXT: [[ABS_CMP:%.*]] = call i1 @llvm.experimental.constrained.fcmp.f80(x86_fp80 [[TMP0]], x86_fp80 [[TMP1]], metadata !"ugt", metadata !"fpexcept.strict") #[[ATTR4]]
// PRMTD_STRICT-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]]
// PRMTD_STRICT: abs_rhsr_greater_or_equal_abs_rhsi:
@@ -3961,8 +3961,8 @@ _Complex long double mulld(_Complex long double a, _Complex long double b) {
// X86WINPRMTD_STRICT-NEXT: [[C_IMAG:%.*]] = load float, ptr [[C_IMAGP]], align 4
// X86WINPRMTD_STRICT-NEXT: [[CONV:%.*]] = call double @llvm.experimental.constrained.fpext.f64.f32(float [[C_REAL]], metadata !"fpexcept.strict") #[[ATTR3]]
// X86WINPRMTD_STRICT-NEXT: [[CONV1:%.*]] = call double @llvm.experimental.constrained.fpext.f64.f32(float [[C_IMAG]], metadata !"fpexcept.strict") #[[ATTR3]]
-// X86WINPRMTD_STRICT-NEXT: [[TMP0:%.*]] = call double @llvm.fabs.f64(double [[CONV]]) #[[ATTR3]]
-// X86WINPRMTD_STRICT-NEXT: [[TMP1:%.*]] = call double @llvm.fabs.f64(double [[CONV1]]) #[[ATTR3]]
+// X86WINPRMTD_STRICT-NEXT: [[TMP0:%.*]] = call double @llvm.fabs.f64(double [[CONV]])
+// X86WINPRMTD_STRICT-NEXT: [[TMP1:%.*]] = call double @llvm.fabs.f64(double [[CONV1]])
// X86WINPRMTD_STRICT-NEXT: [[ABS_CMP:%.*]] = call i1 @llvm.experimental.constrained.fcmp.f64(double [[TMP0]], double [[TMP1]], metadata !"ugt", metadata !"fpexcept.strict") #[[ATTR3]]
// X86WINPRMTD_STRICT-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]]
// X86WINPRMTD_STRICT: abs_rhsr_greater_or_equal_abs_rhsi:
@@ -4038,8 +4038,8 @@ _Complex long double mulld(_Complex long double a, _Complex long double b) {
// PRMTD_STRICT-NEXT: [[C_IMAG:%.*]] = load float, ptr [[C_IMAGP]], align 4
// PRMTD_STRICT-NEXT: [[CONV:%.*]] = call x86_fp80 @llvm.experimental.constrained.fpext.f80.f32(float [[C_REAL]], metadata !"fpexcept.strict") #[[ATTR4]]
// PRMTD_STRICT-NEXT: [[CONV1:%.*]] = call x86_fp80 @llvm.experimental.constrained.fpext.f80.f32(float [[C_IMAG]], metadata !"fpexcept.strict") #[[ATTR4]]
-// PRMTD_STRICT-NEXT: [[TMP0:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV]]) #[[ATTR4]]
-// PRMTD_STRICT-NEXT: [[TMP1:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV1]]) #[[ATTR4]]
+// PRMTD_STRICT-NEXT: [[TMP0:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV]])
+// PRMTD_STRICT-NEXT: [[TMP1:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV1]])
// PRMTD_STRICT-NEXT: [[ABS_CMP:%.*]] = call i1 @llvm.experimental.constrained.fcmp.f80(x86_fp80 [[TMP0]], x86_fp80 [[TMP1]], metadata !"ugt", metadata !"fpexcept.strict") #[[ATTR4]]
// PRMTD_STRICT-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]]
// PRMTD_STRICT: abs_rhsr_greater_or_equal_abs_rhsi:
diff --git a/clang/test/CodeGen/isfpclass.c b/clang/test/CodeGen/isfpclass.c
index 1bf60b8fbca176..58849eec340c67 100644
--- a/clang/test/CodeGen/isfpclass.c
+++ b/clang/test/CodeGen/isfpclass.c
@@ -15,7 +15,7 @@ _Bool check_isfpclass_finite(float x) {
// CHECK-LABEL: define dso_local noundef i1 @check_isfpclass_finite_strict
// CHECK-SAME: (float noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2:[0-9]+]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 504) #[[ATTR5:[0-9]+]]
+// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 504)
// CHECK-NEXT: ret i1 [[TMP0]]
//
_Bool check_isfpclass_finite_strict(float x) {
@@ -36,7 +36,7 @@ _Bool check_isfpclass_nan_f32(float x) {
// CHECK-LABEL: define dso_local noundef i1 @check_isfpclass_nan_f32_strict
// CHECK-SAME: (float noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 3) #[[ATTR5]]
+// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 3)
// CHECK-NEXT: ret i1 [[TMP0]]
//
_Bool check_isfpclass_nan_f32_strict(float x) {
@@ -57,7 +57,7 @@ _Bool check_isfpclass_snan_f64(double x) {
// CHECK-LABEL: define dso_local noundef i1 @check_isfpclass_snan_f64_strict
// CHECK-SAME: (double noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f64(double [[X]], i32 1) #[[ATTR5]]
+// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f64(double [[X]], i32 1)
// CHECK-NEXT: ret i1 [[TMP0]]
//
_Bool check_isfpclass_snan_f64_strict(double x) {
@@ -78,7 +78,7 @@ _Bool check_isfpclass_zero_f16(_Float16 x) {
// CHECK-LABEL: define dso_local noundef i1 @check_isfpclass_zero_f16_strict
// CHECK-SAME: (half noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f16(half [[X]], i32 96) #[[ATTR5]]
+// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f16(half [[X]], i32 96)
// CHECK-NEXT: ret i1 [[TMP0]]
//
_Bool check_isfpclass_zero_f16_strict(_Float16 x) {
@@ -89,7 +89,7 @@ _Bool check_isfpclass_zero_f16_strict(_Float16 x) {
// CHECK-LABEL: define dso_local noundef i1 @check_isnan
// CHECK-SAME: (float noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 3) #[[ATTR5]]
+// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 3)
// CHECK-NEXT: ret i1 [[TMP0]]
//
_Bool check_isnan(float x) {
@@ -100,7 +100,7 @@ _Bool check_isnan(float x) {
// CHECK-LABEL: define dso_local noundef i1 @check_isinf
// CHECK-SAME: (float noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 516) #[[ATTR5]]
+// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 516)
// CHECK-NEXT: ret i1 [[TMP0]]
//
_Bool check_isinf(float x) {
@@ -111,7 +111,7 @@ _Bool check_isinf(float x) {
// CHECK-LABEL: define dso_local noundef i1 @check_isfinite
// CHECK-SAME: (float noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 504) #[[ATTR5]]
+// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 504)
// CHECK-NEXT: ret i1 [[TMP0]]
//
_Bool check_isfinite(float x) {
@@ -122,7 +122,7 @@ _Bool check_isfinite(float x) {
// CHECK-LABEL: define dso_local noundef i1 @check_isnormal
// CHECK-SAME: (float noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 264) #[[ATTR5]]
+// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 264)
// CHECK-NEXT: ret i1 [[TMP0]]
//
_Bool check_isnormal(float x) {
@@ -150,7 +150,7 @@ int4 check_isfpclass_nan_v4f32(float4 x) {
// CHECK-LABEL: define dso_local range(i32 0, 2) <4 x i32> @check_isfpclass_nan_strict_v4f32
// CHECK-SAME: (<4 x float> noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[TMP0:%.*]] = tail call <4 x i1> @llvm.is.fpclass.v4f32(<4 x float> [[X]], i32 3) #[[ATTR5]]
+// CHECK-NEXT: [[TMP0:%.*]] = tail call <4 x i1> @llvm.is.fpclass.v4f32(<4 x float> [[X]], i32 3)
// CHECK-NEXT: [[TMP1:%.*]] = zext <4 x i1> [[TMP0]] to <4 x i32>
// CHECK-NEXT: ret <4 x i32> [[TMP1]]
//
diff --git a/clang/test/CodeGen/strictfp-elementwise-bulitins.cpp b/clang/test/CodeGen/strictfp-elementwise-bulitins.cpp
index 175ad22601839d..9ba1a557a90fee 100644
--- a/clang/test/CodeGen/strictfp-elementwise-bulitins.cpp
+++ b/clang/test/CodeGen/strictfp-elementwise-bulitins.cpp
@@ -20,7 +20,7 @@ float4 strict_fadd(float4 a, float4 b) {
// CHECK-LABEL: define dso_local noundef <4 x float> @_Z22strict_elementwise_absDv4_f
// CHECK-SAME: (<4 x float> noundef [[A:%.*]]) local_unnamed_addr #[[ATTR2:[0-9]+]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[ELT_ABS:%.*]] = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> [[A]]) #[[ATTR4]]
+// CHECK-NEXT: [[ELT_ABS:%.*]] = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> [[A]])
// CHECK-NEXT: ret <4 x float> [[ELT_ABS]]
//
float4 strict_elementwise_abs(float4 a) {
@@ -50,7 +50,7 @@ float4 strict_elementwise_min(float4 a, float4 b) {
// CHECK-LABEL: define dso_local noundef <4 x float> @_Z26strict_elementwise_maximumDv4_fS_
// CHECK-SAME: (<4 x float> noundef [[A:%.*]], <4 x float> noundef [[B:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[ELT_MAXIMUM:%.*]] = tail call <4 x float> @llvm.maximum.v4f32(<4 x float> [[A]], <4 x float> [[B]]) #[[ATTR4]]
+// CHECK-NEXT: [[ELT_MAXIMUM:%.*]] = tail call <4 x float> @llvm.maximum.v4f32(<4 x float> [[A]], <4 x float> [[B]])
// CHECK-NEXT: ret <4 x float> [[ELT_MAXIMUM]]
//
float4 strict_elementwise_maximum(float4 a, float4 b) {
@@ -60,7 +60,7 @@ float4 strict_elementwise_maximum(float4 a, float4 b) {
// CHECK-LABEL: define dso_local noundef <4 x float> @_Z26strict_elementwise_minimumDv4_fS_
// CHECK-SAME: (<4 x float> noundef [[A:%.*]], <4 x float> noundef [[B:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[ELT_MINIMUM:%.*]] = tail call <4 x float> @llvm.minimum.v4f32(<4 x float> [[A]], <4 x float> [[B]]) #[[ATTR4]]
+// CHECK-NEXT: [[ELT_MINIMUM:%.*]] = tail call <4 x float> @llvm.minimum.v4f32(<4 x float> [[A]], <4 x float> [[B]])
// CHECK-NEXT: ret <4 x float> [[ELT_MINIMUM]]
//
float4 strict_elementwise_minimum(float4 a, float4 b) {
@@ -70,7 +70,7 @@ float4 strict_elementwise_minimum(float4 a, float4 b) {
// CHECK-LABEL: define dso_local noundef <4 x float> @_Z23strict_elementwise_ceilDv4_f
// CHECK-SAME: (<4 x float> noundef [[A:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT:...
[truncated]
|
@llvm/pr-subscribers-clang Author: Serge Pavlov (spavloff) ChangesRBuilder incorrectry assigned the attribute "strictfp" to any intrinsic function call in some cases inside strictfp functions. Accordinf to the documentation in https://llvm.org/docs/LangRef.html#function-attributes:
Intrinsic functions like Patch is 289.17 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/122735.diff 16 Files Affected:
diff --git a/clang/test/CodeGen/SystemZ/strictfp_builtins.c b/clang/test/CodeGen/SystemZ/strictfp_builtins.c
index 8c8f1f4cabd742..b60fd932d31c3f 100644
--- a/clang/test/CodeGen/SystemZ/strictfp_builtins.c
+++ b/clang/test/CodeGen/SystemZ/strictfp_builtins.c
@@ -9,7 +9,7 @@
// CHECK-NEXT: [[F_ADDR:%.*]] = alloca float, align 4
// CHECK-NEXT: store float [[F:%.*]], ptr [[F_ADDR]], align 4
// CHECK-NEXT: [[TMP0:%.*]] = load float, ptr [[F_ADDR]], align 4
-// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f32(float [[TMP0]], i64 15) #[[ATTR2:[0-9]+]]
+// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f32(float [[TMP0]], i64 15)
// CHECK-NEXT: ret i32 [[TMP1]]
//
int test_isnan_float(float f) {
@@ -21,7 +21,7 @@ int test_isnan_float(float f) {
// CHECK-NEXT: [[D_ADDR:%.*]] = alloca double, align 8
// CHECK-NEXT: store double [[D:%.*]], ptr [[D_ADDR]], align 8
// CHECK-NEXT: [[TMP0:%.*]] = load double, ptr [[D_ADDR]], align 8
-// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f64(double [[TMP0]], i64 15) #[[ATTR2]]
+// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f64(double [[TMP0]], i64 15)
// CHECK-NEXT: ret i32 [[TMP1]]
//
int test_isnan_double(double d) {
@@ -34,7 +34,7 @@ int test_isnan_double(double d) {
// CHECK-NEXT: [[LD:%.*]] = load fp128, ptr [[TMP0:%.*]], align 8
// CHECK-NEXT: store fp128 [[LD]], ptr [[LD_ADDR]], align 8
// CHECK-NEXT: [[TMP1:%.*]] = load fp128, ptr [[LD_ADDR]], align 8
-// CHECK-NEXT: [[TMP2:%.*]] = call i32 @llvm.s390.tdc.f128(fp128 [[TMP1]], i64 15) #[[ATTR2]]
+// CHECK-NEXT: [[TMP2:%.*]] = call i32 @llvm.s390.tdc.f128(fp128 [[TMP1]], i64 15)
// CHECK-NEXT: ret i32 [[TMP2]]
//
int test_isnan_long_double(long double ld) {
@@ -46,7 +46,7 @@ int test_isnan_long_double(long double ld) {
// CHECK-NEXT: [[F_ADDR:%.*]] = alloca float, align 4
// CHECK-NEXT: store float [[F:%.*]], ptr [[F_ADDR]], align 4
// CHECK-NEXT: [[TMP0:%.*]] = load float, ptr [[F_ADDR]], align 4
-// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f32(float [[TMP0]], i64 48) #[[ATTR2]]
+// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f32(float [[TMP0]], i64 48)
// CHECK-NEXT: ret i32 [[TMP1]]
//
int test_isinf_float(float f) {
@@ -58,7 +58,7 @@ int test_isinf_float(float f) {
// CHECK-NEXT: [[D_ADDR:%.*]] = alloca double, align 8
// CHECK-NEXT: store double [[D:%.*]], ptr [[D_ADDR]], align 8
// CHECK-NEXT: [[TMP0:%.*]] = load double, ptr [[D_ADDR]], align 8
-// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f64(double [[TMP0]], i64 48) #[[ATTR2]]
+// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f64(double [[TMP0]], i64 48)
// CHECK-NEXT: ret i32 [[TMP1]]
//
int test_isinf_double(double d) {
@@ -71,7 +71,7 @@ int test_isinf_double(double d) {
// CHECK-NEXT: [[LD:%.*]] = load fp128, ptr [[TMP0:%.*]], align 8
// CHECK-NEXT: store fp128 [[LD]], ptr [[LD_ADDR]], align 8
// CHECK-NEXT: [[TMP1:%.*]] = load fp128, ptr [[LD_ADDR]], align 8
-// CHECK-NEXT: [[TMP2:%.*]] = call i32 @llvm.s390.tdc.f128(fp128 [[TMP1]], i64 48) #[[ATTR2]]
+// CHECK-NEXT: [[TMP2:%.*]] = call i32 @llvm.s390.tdc.f128(fp128 [[TMP1]], i64 48)
// CHECK-NEXT: ret i32 [[TMP2]]
//
int test_isinf_long_double(long double ld) {
@@ -83,7 +83,7 @@ int test_isinf_long_double(long double ld) {
// CHECK-NEXT: [[F_ADDR:%.*]] = alloca float, align 4
// CHECK-NEXT: store float [[F:%.*]], ptr [[F_ADDR]], align 4
// CHECK-NEXT: [[TMP0:%.*]] = load float, ptr [[F_ADDR]], align 4
-// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f32(float [[TMP0]], i64 4032) #[[ATTR2]]
+// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f32(float [[TMP0]], i64 4032)
// CHECK-NEXT: ret i32 [[TMP1]]
//
int test_isfinite_float(float f) {
@@ -95,7 +95,7 @@ int test_isfinite_float(float f) {
// CHECK-NEXT: [[D_ADDR:%.*]] = alloca double, align 8
// CHECK-NEXT: store double [[D:%.*]], ptr [[D_ADDR]], align 8
// CHECK-NEXT: [[TMP0:%.*]] = load double, ptr [[D_ADDR]], align 8
-// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f64(double [[TMP0]], i64 4032) #[[ATTR2]]
+// CHECK-NEXT: [[TMP1:%.*]] = call i32 @llvm.s390.tdc.f64(double [[TMP0]], i64 4032)
// CHECK-NEXT: ret i32 [[TMP1]]
//
int test_isfinite_double(double d) {
@@ -108,7 +108,7 @@ int test_isfinite_double(double d) {
// CHECK-NEXT: [[LD:%.*]] = load fp128, ptr [[TMP0:%.*]], align 8
// CHECK-NEXT: store fp128 [[LD]], ptr [[LD_ADDR]], align 8
// CHECK-NEXT: [[TMP1:%.*]] = load fp128, ptr [[LD_ADDR]], align 8
-// CHECK-NEXT: [[TMP2:%.*]] = call i32 @llvm.s390.tdc.f128(fp128 [[TMP1]], i64 4032) #[[ATTR2]]
+// CHECK-NEXT: [[TMP2:%.*]] = call i32 @llvm.s390.tdc.f128(fp128 [[TMP1]], i64 4032)
// CHECK-NEXT: ret i32 [[TMP2]]
//
int test_isfinite_long_double(long double ld) {
diff --git a/clang/test/CodeGen/X86/strictfp_builtins.c b/clang/test/CodeGen/X86/strictfp_builtins.c
index 43e4060bef259b..c4a22dc5ee90fd 100644
--- a/clang/test/CodeGen/X86/strictfp_builtins.c
+++ b/clang/test/CodeGen/X86/strictfp_builtins.c
@@ -27,7 +27,7 @@ void p(char *str, int x) {
// CHECK-NEXT: [[LD_ADDR:%.*]] = alloca x86_fp80, align 16
// CHECK-NEXT: store x86_fp80 [[LD:%.*]], ptr [[LD_ADDR]], align 16
// CHECK-NEXT: [[TMP0:%.*]] = load x86_fp80, ptr [[LD_ADDR]], align 16
-// CHECK-NEXT: [[TMP1:%.*]] = call i1 @llvm.is.fpclass.f80(x86_fp80 [[TMP0]], i32 516) #[[ATTR3]]
+// CHECK-NEXT: [[TMP1:%.*]] = call i1 @llvm.is.fpclass.f80(x86_fp80 [[TMP0]], i32 516)
// CHECK-NEXT: [[TMP2:%.*]] = zext i1 [[TMP1]] to i32
// CHECK-NEXT: call void @p(ptr noundef @.str.1, i32 noundef [[TMP2]]) #[[ATTR3]]
// CHECK-NEXT: ret void
@@ -43,7 +43,7 @@ void test_long_double_isinf(long double ld) {
// CHECK-NEXT: [[LD_ADDR:%.*]] = alloca x86_fp80, align 16
// CHECK-NEXT: store x86_fp80 [[LD:%.*]], ptr [[LD_ADDR]], align 16
// CHECK-NEXT: [[TMP0:%.*]] = load x86_fp80, ptr [[LD_ADDR]], align 16
-// CHECK-NEXT: [[TMP1:%.*]] = call i1 @llvm.is.fpclass.f80(x86_fp80 [[TMP0]], i32 504) #[[ATTR3]]
+// CHECK-NEXT: [[TMP1:%.*]] = call i1 @llvm.is.fpclass.f80(x86_fp80 [[TMP0]], i32 504)
// CHECK-NEXT: [[TMP2:%.*]] = zext i1 [[TMP1]] to i32
// CHECK-NEXT: call void @p(ptr noundef @.str.2, i32 noundef [[TMP2]]) #[[ATTR3]]
// CHECK-NEXT: ret void
@@ -59,7 +59,7 @@ void test_long_double_isfinite(long double ld) {
// CHECK-NEXT: [[LD_ADDR:%.*]] = alloca x86_fp80, align 16
// CHECK-NEXT: store x86_fp80 [[LD:%.*]], ptr [[LD_ADDR]], align 16
// CHECK-NEXT: [[TMP0:%.*]] = load x86_fp80, ptr [[LD_ADDR]], align 16
-// CHECK-NEXT: [[TMP1:%.*]] = call i1 @llvm.is.fpclass.f80(x86_fp80 [[TMP0]], i32 3) #[[ATTR3]]
+// CHECK-NEXT: [[TMP1:%.*]] = call i1 @llvm.is.fpclass.f80(x86_fp80 [[TMP0]], i32 3)
// CHECK-NEXT: [[TMP2:%.*]] = zext i1 [[TMP1]] to i32
// CHECK-NEXT: call void @p(ptr noundef @.str.3, i32 noundef [[TMP2]]) #[[ATTR3]]
// CHECK-NEXT: ret void
diff --git a/clang/test/CodeGen/cx-complex-range.c b/clang/test/CodeGen/cx-complex-range.c
index 88300041061aae..98735c4671ce52 100644
--- a/clang/test/CodeGen/cx-complex-range.c
+++ b/clang/test/CodeGen/cx-complex-range.c
@@ -1575,8 +1575,8 @@ _Complex float mulf(_Complex float a, _Complex float b) {
// X86WINPRMTD_STRICT-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8
// X86WINPRMTD_STRICT-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds nuw { double, double }, ptr [[B]], i32 0, i32 1
// X86WINPRMTD_STRICT-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8
-// X86WINPRMTD_STRICT-NEXT: [[TMP0:%.*]] = call double @llvm.fabs.f64(double [[B_REAL]]) #[[ATTR3]]
-// X86WINPRMTD_STRICT-NEXT: [[TMP1:%.*]] = call double @llvm.fabs.f64(double [[B_IMAG]]) #[[ATTR3]]
+// X86WINPRMTD_STRICT-NEXT: [[TMP0:%.*]] = call double @llvm.fabs.f64(double [[B_REAL]])
+// X86WINPRMTD_STRICT-NEXT: [[TMP1:%.*]] = call double @llvm.fabs.f64(double [[B_IMAG]])
// X86WINPRMTD_STRICT-NEXT: [[ABS_CMP:%.*]] = call i1 @llvm.experimental.constrained.fcmp.f64(double [[TMP0]], double [[TMP1]], metadata !"ugt", metadata !"fpexcept.strict") #[[ATTR3]]
// X86WINPRMTD_STRICT-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]]
// X86WINPRMTD_STRICT: abs_rhsr_greater_or_equal_abs_rhsi:
@@ -2658,8 +2658,8 @@ _Complex double muld(_Complex double a, _Complex double b) {
// X86WINPRMTD_STRICT-NEXT: [[B_REAL:%.*]] = load double, ptr [[B_REALP]], align 8
// X86WINPRMTD_STRICT-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds nuw { double, double }, ptr [[B]], i32 0, i32 1
// X86WINPRMTD_STRICT-NEXT: [[B_IMAG:%.*]] = load double, ptr [[B_IMAGP]], align 8
-// X86WINPRMTD_STRICT-NEXT: [[TMP0:%.*]] = call double @llvm.fabs.f64(double [[B_REAL]]) #[[ATTR3]]
-// X86WINPRMTD_STRICT-NEXT: [[TMP1:%.*]] = call double @llvm.fabs.f64(double [[B_IMAG]]) #[[ATTR3]]
+// X86WINPRMTD_STRICT-NEXT: [[TMP0:%.*]] = call double @llvm.fabs.f64(double [[B_REAL]])
+// X86WINPRMTD_STRICT-NEXT: [[TMP1:%.*]] = call double @llvm.fabs.f64(double [[B_IMAG]])
// X86WINPRMTD_STRICT-NEXT: [[ABS_CMP:%.*]] = call i1 @llvm.experimental.constrained.fcmp.f64(double [[TMP0]], double [[TMP1]], metadata !"ugt", metadata !"fpexcept.strict") #[[ATTR3]]
// X86WINPRMTD_STRICT-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]]
// X86WINPRMTD_STRICT: abs_rhsr_greater_or_equal_abs_rhsi:
@@ -2713,8 +2713,8 @@ _Complex double muld(_Complex double a, _Complex double b) {
// PRMTD_STRICT-NEXT: [[B_REAL:%.*]] = load x86_fp80, ptr [[B_REALP]], align 16
// PRMTD_STRICT-NEXT: [[B_IMAGP:%.*]] = getelementptr inbounds nuw { x86_fp80, x86_fp80 }, ptr [[B]], i32 0, i32 1
// PRMTD_STRICT-NEXT: [[B_IMAG:%.*]] = load x86_fp80, ptr [[B_IMAGP]], align 16
-// PRMTD_STRICT-NEXT: [[TMP0:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[B_REAL]]) #[[ATTR4]]
-// PRMTD_STRICT-NEXT: [[TMP1:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[B_IMAG]]) #[[ATTR4]]
+// PRMTD_STRICT-NEXT: [[TMP0:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[B_REAL]])
+// PRMTD_STRICT-NEXT: [[TMP1:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[B_IMAG]])
// PRMTD_STRICT-NEXT: [[ABS_CMP:%.*]] = call i1 @llvm.experimental.constrained.fcmp.f80(x86_fp80 [[TMP0]], x86_fp80 [[TMP1]], metadata !"ugt", metadata !"fpexcept.strict") #[[ATTR4]]
// PRMTD_STRICT-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]]
// PRMTD_STRICT: abs_rhsr_greater_or_equal_abs_rhsi:
@@ -3961,8 +3961,8 @@ _Complex long double mulld(_Complex long double a, _Complex long double b) {
// X86WINPRMTD_STRICT-NEXT: [[C_IMAG:%.*]] = load float, ptr [[C_IMAGP]], align 4
// X86WINPRMTD_STRICT-NEXT: [[CONV:%.*]] = call double @llvm.experimental.constrained.fpext.f64.f32(float [[C_REAL]], metadata !"fpexcept.strict") #[[ATTR3]]
// X86WINPRMTD_STRICT-NEXT: [[CONV1:%.*]] = call double @llvm.experimental.constrained.fpext.f64.f32(float [[C_IMAG]], metadata !"fpexcept.strict") #[[ATTR3]]
-// X86WINPRMTD_STRICT-NEXT: [[TMP0:%.*]] = call double @llvm.fabs.f64(double [[CONV]]) #[[ATTR3]]
-// X86WINPRMTD_STRICT-NEXT: [[TMP1:%.*]] = call double @llvm.fabs.f64(double [[CONV1]]) #[[ATTR3]]
+// X86WINPRMTD_STRICT-NEXT: [[TMP0:%.*]] = call double @llvm.fabs.f64(double [[CONV]])
+// X86WINPRMTD_STRICT-NEXT: [[TMP1:%.*]] = call double @llvm.fabs.f64(double [[CONV1]])
// X86WINPRMTD_STRICT-NEXT: [[ABS_CMP:%.*]] = call i1 @llvm.experimental.constrained.fcmp.f64(double [[TMP0]], double [[TMP1]], metadata !"ugt", metadata !"fpexcept.strict") #[[ATTR3]]
// X86WINPRMTD_STRICT-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]]
// X86WINPRMTD_STRICT: abs_rhsr_greater_or_equal_abs_rhsi:
@@ -4038,8 +4038,8 @@ _Complex long double mulld(_Complex long double a, _Complex long double b) {
// PRMTD_STRICT-NEXT: [[C_IMAG:%.*]] = load float, ptr [[C_IMAGP]], align 4
// PRMTD_STRICT-NEXT: [[CONV:%.*]] = call x86_fp80 @llvm.experimental.constrained.fpext.f80.f32(float [[C_REAL]], metadata !"fpexcept.strict") #[[ATTR4]]
// PRMTD_STRICT-NEXT: [[CONV1:%.*]] = call x86_fp80 @llvm.experimental.constrained.fpext.f80.f32(float [[C_IMAG]], metadata !"fpexcept.strict") #[[ATTR4]]
-// PRMTD_STRICT-NEXT: [[TMP0:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV]]) #[[ATTR4]]
-// PRMTD_STRICT-NEXT: [[TMP1:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV1]]) #[[ATTR4]]
+// PRMTD_STRICT-NEXT: [[TMP0:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV]])
+// PRMTD_STRICT-NEXT: [[TMP1:%.*]] = call x86_fp80 @llvm.fabs.f80(x86_fp80 [[CONV1]])
// PRMTD_STRICT-NEXT: [[ABS_CMP:%.*]] = call i1 @llvm.experimental.constrained.fcmp.f80(x86_fp80 [[TMP0]], x86_fp80 [[TMP1]], metadata !"ugt", metadata !"fpexcept.strict") #[[ATTR4]]
// PRMTD_STRICT-NEXT: br i1 [[ABS_CMP]], label [[ABS_RHSR_GREATER_OR_EQUAL_ABS_RHSI:%.*]], label [[ABS_RHSR_LESS_THAN_ABS_RHSI:%.*]]
// PRMTD_STRICT: abs_rhsr_greater_or_equal_abs_rhsi:
diff --git a/clang/test/CodeGen/isfpclass.c b/clang/test/CodeGen/isfpclass.c
index 1bf60b8fbca176..58849eec340c67 100644
--- a/clang/test/CodeGen/isfpclass.c
+++ b/clang/test/CodeGen/isfpclass.c
@@ -15,7 +15,7 @@ _Bool check_isfpclass_finite(float x) {
// CHECK-LABEL: define dso_local noundef i1 @check_isfpclass_finite_strict
// CHECK-SAME: (float noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2:[0-9]+]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 504) #[[ATTR5:[0-9]+]]
+// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 504)
// CHECK-NEXT: ret i1 [[TMP0]]
//
_Bool check_isfpclass_finite_strict(float x) {
@@ -36,7 +36,7 @@ _Bool check_isfpclass_nan_f32(float x) {
// CHECK-LABEL: define dso_local noundef i1 @check_isfpclass_nan_f32_strict
// CHECK-SAME: (float noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 3) #[[ATTR5]]
+// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 3)
// CHECK-NEXT: ret i1 [[TMP0]]
//
_Bool check_isfpclass_nan_f32_strict(float x) {
@@ -57,7 +57,7 @@ _Bool check_isfpclass_snan_f64(double x) {
// CHECK-LABEL: define dso_local noundef i1 @check_isfpclass_snan_f64_strict
// CHECK-SAME: (double noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f64(double [[X]], i32 1) #[[ATTR5]]
+// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f64(double [[X]], i32 1)
// CHECK-NEXT: ret i1 [[TMP0]]
//
_Bool check_isfpclass_snan_f64_strict(double x) {
@@ -78,7 +78,7 @@ _Bool check_isfpclass_zero_f16(_Float16 x) {
// CHECK-LABEL: define dso_local noundef i1 @check_isfpclass_zero_f16_strict
// CHECK-SAME: (half noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f16(half [[X]], i32 96) #[[ATTR5]]
+// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f16(half [[X]], i32 96)
// CHECK-NEXT: ret i1 [[TMP0]]
//
_Bool check_isfpclass_zero_f16_strict(_Float16 x) {
@@ -89,7 +89,7 @@ _Bool check_isfpclass_zero_f16_strict(_Float16 x) {
// CHECK-LABEL: define dso_local noundef i1 @check_isnan
// CHECK-SAME: (float noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 3) #[[ATTR5]]
+// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 3)
// CHECK-NEXT: ret i1 [[TMP0]]
//
_Bool check_isnan(float x) {
@@ -100,7 +100,7 @@ _Bool check_isnan(float x) {
// CHECK-LABEL: define dso_local noundef i1 @check_isinf
// CHECK-SAME: (float noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 516) #[[ATTR5]]
+// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 516)
// CHECK-NEXT: ret i1 [[TMP0]]
//
_Bool check_isinf(float x) {
@@ -111,7 +111,7 @@ _Bool check_isinf(float x) {
// CHECK-LABEL: define dso_local noundef i1 @check_isfinite
// CHECK-SAME: (float noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 504) #[[ATTR5]]
+// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 504)
// CHECK-NEXT: ret i1 [[TMP0]]
//
_Bool check_isfinite(float x) {
@@ -122,7 +122,7 @@ _Bool check_isfinite(float x) {
// CHECK-LABEL: define dso_local noundef i1 @check_isnormal
// CHECK-SAME: (float noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 264) #[[ATTR5]]
+// CHECK-NEXT: [[TMP0:%.*]] = tail call i1 @llvm.is.fpclass.f32(float [[X]], i32 264)
// CHECK-NEXT: ret i1 [[TMP0]]
//
_Bool check_isnormal(float x) {
@@ -150,7 +150,7 @@ int4 check_isfpclass_nan_v4f32(float4 x) {
// CHECK-LABEL: define dso_local range(i32 0, 2) <4 x i32> @check_isfpclass_nan_strict_v4f32
// CHECK-SAME: (<4 x float> noundef [[X:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[TMP0:%.*]] = tail call <4 x i1> @llvm.is.fpclass.v4f32(<4 x float> [[X]], i32 3) #[[ATTR5]]
+// CHECK-NEXT: [[TMP0:%.*]] = tail call <4 x i1> @llvm.is.fpclass.v4f32(<4 x float> [[X]], i32 3)
// CHECK-NEXT: [[TMP1:%.*]] = zext <4 x i1> [[TMP0]] to <4 x i32>
// CHECK-NEXT: ret <4 x i32> [[TMP1]]
//
diff --git a/clang/test/CodeGen/strictfp-elementwise-bulitins.cpp b/clang/test/CodeGen/strictfp-elementwise-bulitins.cpp
index 175ad22601839d..9ba1a557a90fee 100644
--- a/clang/test/CodeGen/strictfp-elementwise-bulitins.cpp
+++ b/clang/test/CodeGen/strictfp-elementwise-bulitins.cpp
@@ -20,7 +20,7 @@ float4 strict_fadd(float4 a, float4 b) {
// CHECK-LABEL: define dso_local noundef <4 x float> @_Z22strict_elementwise_absDv4_f
// CHECK-SAME: (<4 x float> noundef [[A:%.*]]) local_unnamed_addr #[[ATTR2:[0-9]+]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[ELT_ABS:%.*]] = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> [[A]]) #[[ATTR4]]
+// CHECK-NEXT: [[ELT_ABS:%.*]] = tail call <4 x float> @llvm.fabs.v4f32(<4 x float> [[A]])
// CHECK-NEXT: ret <4 x float> [[ELT_ABS]]
//
float4 strict_elementwise_abs(float4 a) {
@@ -50,7 +50,7 @@ float4 strict_elementwise_min(float4 a, float4 b) {
// CHECK-LABEL: define dso_local noundef <4 x float> @_Z26strict_elementwise_maximumDv4_fS_
// CHECK-SAME: (<4 x float> noundef [[A:%.*]], <4 x float> noundef [[B:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[ELT_MAXIMUM:%.*]] = tail call <4 x float> @llvm.maximum.v4f32(<4 x float> [[A]], <4 x float> [[B]]) #[[ATTR4]]
+// CHECK-NEXT: [[ELT_MAXIMUM:%.*]] = tail call <4 x float> @llvm.maximum.v4f32(<4 x float> [[A]], <4 x float> [[B]])
// CHECK-NEXT: ret <4 x float> [[ELT_MAXIMUM]]
//
float4 strict_elementwise_maximum(float4 a, float4 b) {
@@ -60,7 +60,7 @@ float4 strict_elementwise_maximum(float4 a, float4 b) {
// CHECK-LABEL: define dso_local noundef <4 x float> @_Z26strict_elementwise_minimumDv4_fS_
// CHECK-SAME: (<4 x float> noundef [[A:%.*]], <4 x float> noundef [[B:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT: [[ELT_MINIMUM:%.*]] = tail call <4 x float> @llvm.minimum.v4f32(<4 x float> [[A]], <4 x float> [[B]]) #[[ATTR4]]
+// CHECK-NEXT: [[ELT_MINIMUM:%.*]] = tail call <4 x float> @llvm.minimum.v4f32(<4 x float> [[A]], <4 x float> [[B]])
// CHECK-NEXT: ret <4 x float> [[ELT_MINIMUM]]
//
float4 strict_elementwise_minimum(float4 a, float4 b) {
@@ -70,7 +70,7 @@ float4 strict_elementwise_minimum(float4 a, float4 b) {
// CHECK-LABEL: define dso_local noundef <4 x float> @_Z23strict_elementwise_ceilDv4_f
// CHECK-SAME: (<4 x float> noundef [[A:%.*]]) local_unnamed_addr #[[ATTR2]] {
// CHECK-NEXT: entry:
-// CHECK-NEXT:...
[truncated]
|
This patch was obtained from the commit "Don't set strictfp on irrelevant calls" from #109798. |
✅ With the latest revision this PR passed the C/C++ code formatter. |
llvm/lib/IR/IntrinsicInst.cpp
Outdated
switch (IID) { | ||
#define DAG_INSTRUCTION(NAME, NARG, ROUND_MODE, INTRINSIC, DAGN) \ | ||
case Intrinsic::INTRINSIC: | ||
#include "llvm/IR/ConstrainedOps.def" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem right. We have intrinsics to explicitly access the FP environment (llvm.set.rounding etc.). And any intrinsic which can call user code can access the FP environment.
You might need to specify this in the .td files (IntrStrictFP or something like that).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. I changed the implementation to use access to InaccessibleMemory. It is not precise, because this access type is used by other intrinsics too. In future we could add new type of "memory" to represent FP control and status bit and make this check precise.
llvm/lib/IR/IntrinsicInst.cpp
Outdated
bool IntrinsicInst::canAccessFPEnvironment(LLVMContext &C, Intrinsic::ID IID) { | ||
AttributeList Attrs = Intrinsic::getAttributes(C, IID); | ||
MemoryEffects ME = Attrs.getMemoryEffects(); | ||
return ME.onlyAccessesInaccessibleMem(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you want doesAccessInaccessibleMem() (any function that can access inaccessible memory, not just functions which only access inaccessible memory). Which apparently doesn't exist at the moment, but should be straightforward to implement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, fixed, thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't a large fraction of existing functions that may or may not have any floating point in them qualify as doesAccessInaccessibleMem()? It seems like if we're going to model the FP environment like "memory" then we need an FP environment modeling flag proper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with one minor comment
llvm/lib/IR/IRBuilder.cpp
Outdated
if (IsFPConstrained) { | ||
if (const auto *Func = dyn_cast<Function>(Callee)) { | ||
// Some intrinsic functions in non-default FP mode must have FP operand | ||
// bundles to indicate a side effect due to read/write FP environment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove the reference to "FP operand bundles", since they don't exist at the moment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
Please remove the reference to "FP operand bundles", since they don't exist at the moment.
I removed this comment, it came from the bundle implementation and has no value for the current one.
No, this is the wrong approach.... |
A recap of the history of the strictfp attribute. Originally we needed a way to prevent the Inliner from inlining a non-strictfp function into a strictfp function. We added the Then I went to work on the Verifier changes. Originally I implemented it by checking to see if the presence of the attribute on a call site matched the presence on the function definition. Easy. But then I ran into a problem: Checks at call sites to see if an attribute is present will look through to the function declaration. Thus, we had to weaken the rule. Now, the Clearly intrinisics don't get inlined by the Inliner. But what if an intrinsic gets lowered to a call to a real function? Especially if LTO is being used? Will we lose the information that the function now being called needs the Is it still possible to write a check in the Verifier to catch any places where the I'm not convinced we're ready to drop any uses of the |
Thank you for your retrospective view on
Intrinsic function is what compiler knows a priori. It knows its effects and properties, in particular how it interacts with FP environment. If it is implemented as a call to real function (with a body) there are no problem, - an external function call has side effect and it may do anything, including access to FP environment. But if that body is inlined, things get more complicated.
In all cases inlining does not create invalid constructs and additional information is not needed.
I think we need first to refine meaning of |
What's the motivation for this change? Was the strictfp attribute blocking some optimization? My understanding of the attribute is that it only indicates that strict floating-point semantics are (potentially?) required at the call site. It doesn't indicate that the called function does access the floating-point environment. |
If strict floating-point semantics are required at this call site, they are are required on every relevant call in this function. It means |
I think that takes us back to @kpneal's history, "Then we found that basic blocks were being optimized when they didn't belong to a function so we couldn't get to the where the strictfp attribute was located. The solution was to add the attribute to every function call." To handle the case where a block isn't owned by a function, we need the attribute at the call site. I don't know the specifics of how that case arises, but if we remove the attribute from the call site, we would have to do something to add it again when the block gets detached from the function (possibly during cloning?). |
It sounds like a transient property, used in very specific cases. We could return to this problem later, when/if transition to bundles happens. What about this PR? With it |
And then remove it again when reinserting the BB into a function body? That sounds like a complication, said complication may be missed by someone in the future who isn't focused on FP, and I haven't yet heard a reason we need all of this. @spavloff -- Why do we need to change how we handle the strictfp attribute? What's the benefit? |
Revisiting decisions made in the past is, in general, a good thing. Sometimes (frequently?) it doesn't result in any change, but that doesn't make it any less worthwhile.
I think efforts to change the rules around the |
Thank you for the clarification. Let's keep the meaning of this attribute as it currently stands. The explanations raises some concern, though. OK, strictfp at call sites keeps previous property of the call even if the basic block is detached. If a block is taken from a strictfp function, it should only be placed within another strictfp function. Default mode functions have stricter requirements regarding their bodies; for instance, they cannot call functions such as fesetround. Strictfp functions have less limitations. Transferring code from a strictfp to default mode function in general cases seems incorrect. Does the case of detached block really needs strictfp calls? Additionally, it's unfortunate that we use the same attribute for functions requiring strictfp semantics for their body, and for calls that originate from a strictfp function. |
This change was discussed in llvm#122735. As it was decided, current usage of strictfp on call sites remains intact.
RBuilder incorrectry assigned the attribute "strictfp" to any intrinsic function call in some cases inside strictfp functions. Accordinf to the documentation in https://llvm.org/docs/LangRef.html#function-attributes:
Intrinsic functions like
llvm.fabs
do not depend on rounding mode, they also never raise any FP exception, so these restrictions make no sense for them. If a function interacts with FP environment, it must be either a constrained intrinsic or an external function.