Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flang][OpenMP] Generalize fixing alloca IP pre-condition for private ops #122866

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ergawy
Copy link
Member

@ergawy ergawy commented Jan 14, 2025

This PR generalizes a fix that we implemented previously for omp.wsloops. The fix makes sure the pre-condtion that the alloca block has a single successor whenever we inline delayed privatizers is respected. I simply moved the fix to allocatePrivateVars so that it kicks in for any op not just omp.wsloop.

This handles a bug uncovered by a test in the OpenMP_VV test suite.

…ate` ops

This PR generalizes a fix that we implemented previously for
`omp.wsloop`s. The fix makes sure the pre-condtion that the `alloca`
block has a single successor whenever we inline delayed privatizers is
respected. I simply moved the fix to `allocatePrivateVars` so that it
kicks in for any op not just `omp.wsloop`.

This handles a bug uncovered by [a test](https://github.com/OpenMP-Validation-and-Verification/OpenMP_VV/blob/master/tests/4.5/target_simd/test_target_simd_safelen.F90) in the OpenMP_VV test suite.
@llvmbot
Copy link
Member

llvmbot commented Jan 14, 2025

@llvm/pr-subscribers-mlir-openmp
@llvm/pr-subscribers-mlir-llvm

@llvm/pr-subscribers-flang-openmp

Author: Kareem Ergawy (ergawy)

Changes

This PR generalizes a fix that we implemented previously for omp.wsloops. The fix makes sure the pre-condtion that the alloca block has a single successor whenever we inline delayed privatizers is respected. I simply moved the fix to allocatePrivateVars so that it kicks in for any op not just omp.wsloop.

This handles a bug uncovered by a test in the OpenMP_VV test suite.


Full diff: https://github.com/llvm/llvm-project/pull/122866.diff

2 Files Affected:

  • (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+11-6)
  • (added) mlir/test/Target/LLVMIR/openmp-target-simd-on_device.mlir (+34)
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index d6112fa9af1182..6ffe169038c54e 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -1340,13 +1340,23 @@ allocatePrivateVars(llvm::IRBuilderBase &builder,
                     llvm::SmallVectorImpl<llvm::Value *> &llvmPrivateVars,
                     const llvm::OpenMPIRBuilder::InsertPointTy &allocaIP,
                     llvm::DenseMap<Value, Value> *mappedPrivateVars = nullptr) {
-  llvm::IRBuilderBase::InsertPointGuard guard(builder);
   // Allocate private vars
   llvm::BranchInst *allocaTerminator =
       llvm::cast<llvm::BranchInst>(allocaIP.getBlock()->getTerminator());
+  if (allocaTerminator->getNumSuccessors() != 1) {
+    splitBB(llvm::OpenMPIRBuilder::InsertPointTy(
+                allocaIP.getBlock(), allocaTerminator->getIterator()),
+            true, "omp.region.after_alloca");
+  }
+
+  llvm::IRBuilderBase::InsertPointGuard guard(builder);
+  // Update the allocaTerminator in case the alloca block was split above.
+  allocaTerminator =
+      llvm::cast<llvm::BranchInst>(allocaIP.getBlock()->getTerminator());
   builder.SetInsertPoint(allocaTerminator);
   assert(allocaTerminator->getNumSuccessors() == 1 &&
          "This is an unconditional branch created by OpenMPIRBuilder");
+
   llvm::BasicBlock *afterAllocas = allocaTerminator->getSuccessor(0);
 
   // FIXME: Some of the allocation regions do more than just allocating.
@@ -1876,11 +1886,6 @@ convertOmpWsloop(Operation &opInst, llvm::IRBuilderBase &builder,
   SmallVector<llvm::Value *> privateReductionVariables(
       wsloopOp.getNumReductionVars());
 
-  splitBB(llvm::OpenMPIRBuilder::InsertPointTy(
-              allocaIP.getBlock(),
-              allocaIP.getBlock()->getTerminator()->getIterator()),
-          true, "omp.region.after_alloca");
-
   llvm::Expected<llvm::BasicBlock *> afterAllocas = allocatePrivateVars(
       builder, moduleTranslation, privateBlockArgs, privateDecls,
       mlirPrivateVars, llvmPrivateVars, allocaIP);
diff --git a/mlir/test/Target/LLVMIR/openmp-target-simd-on_device.mlir b/mlir/test/Target/LLVMIR/openmp-target-simd-on_device.mlir
new file mode 100644
index 00000000000000..0ce90578ea9d62
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-target-simd-on_device.mlir
@@ -0,0 +1,34 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+module attributes {omp.is_target_device = true} {
+  omp.private {type = private} @simd_privatizer : !llvm.ptr alloc {
+  ^bb0(%arg0: !llvm.ptr):
+    omp.yield(%arg0 : !llvm.ptr)
+  }
+
+  llvm.func @test_target_simd() {
+    omp.target {
+      %5 = llvm.mlir.constant(1 : i32) : i32
+      %x = llvm.alloca %5 x i32 {bindc_name = "x"} : (i32) -> !llvm.ptr
+      omp.simd private(@simd_privatizer %x -> %arg1 : !llvm.ptr) {
+        omp.loop_nest (%arg2) : i32 = (%5) to (%5) inclusive step (%5) {
+          omp.yield
+        }
+      }
+      omp.terminator
+    }
+    llvm.return
+  }
+
+}
+
+// CHECK-LABEL: define {{.*}} @__omp_offloading_{{.*}}_test_target_simd_{{.*}}
+
+// CHECK:         %[[INT:.*]] = alloca i32, align 4
+// CHECK:         br label %[[LATE_ALLOC_BB:.*]]
+
+// CHECK:       [[LATE_ALLOC_BB]]:
+// CHECK:         br label %[[AFTER_ALLOC_BB:.*]]
+
+// CHECK:       [[AFTER_ALLOC_BB]]:
+// CHECK:         br i1 %{{.*}}, label %{{.*}}, label %{{.*}}

@llvmbot
Copy link
Member

llvmbot commented Jan 14, 2025

@llvm/pr-subscribers-mlir

Author: Kareem Ergawy (ergawy)

Changes

This PR generalizes a fix that we implemented previously for omp.wsloops. The fix makes sure the pre-condtion that the alloca block has a single successor whenever we inline delayed privatizers is respected. I simply moved the fix to allocatePrivateVars so that it kicks in for any op not just omp.wsloop.

This handles a bug uncovered by a test in the OpenMP_VV test suite.


Full diff: https://github.com/llvm/llvm-project/pull/122866.diff

2 Files Affected:

  • (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+11-6)
  • (added) mlir/test/Target/LLVMIR/openmp-target-simd-on_device.mlir (+34)
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index d6112fa9af1182..6ffe169038c54e 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -1340,13 +1340,23 @@ allocatePrivateVars(llvm::IRBuilderBase &builder,
                     llvm::SmallVectorImpl<llvm::Value *> &llvmPrivateVars,
                     const llvm::OpenMPIRBuilder::InsertPointTy &allocaIP,
                     llvm::DenseMap<Value, Value> *mappedPrivateVars = nullptr) {
-  llvm::IRBuilderBase::InsertPointGuard guard(builder);
   // Allocate private vars
   llvm::BranchInst *allocaTerminator =
       llvm::cast<llvm::BranchInst>(allocaIP.getBlock()->getTerminator());
+  if (allocaTerminator->getNumSuccessors() != 1) {
+    splitBB(llvm::OpenMPIRBuilder::InsertPointTy(
+                allocaIP.getBlock(), allocaTerminator->getIterator()),
+            true, "omp.region.after_alloca");
+  }
+
+  llvm::IRBuilderBase::InsertPointGuard guard(builder);
+  // Update the allocaTerminator in case the alloca block was split above.
+  allocaTerminator =
+      llvm::cast<llvm::BranchInst>(allocaIP.getBlock()->getTerminator());
   builder.SetInsertPoint(allocaTerminator);
   assert(allocaTerminator->getNumSuccessors() == 1 &&
          "This is an unconditional branch created by OpenMPIRBuilder");
+
   llvm::BasicBlock *afterAllocas = allocaTerminator->getSuccessor(0);
 
   // FIXME: Some of the allocation regions do more than just allocating.
@@ -1876,11 +1886,6 @@ convertOmpWsloop(Operation &opInst, llvm::IRBuilderBase &builder,
   SmallVector<llvm::Value *> privateReductionVariables(
       wsloopOp.getNumReductionVars());
 
-  splitBB(llvm::OpenMPIRBuilder::InsertPointTy(
-              allocaIP.getBlock(),
-              allocaIP.getBlock()->getTerminator()->getIterator()),
-          true, "omp.region.after_alloca");
-
   llvm::Expected<llvm::BasicBlock *> afterAllocas = allocatePrivateVars(
       builder, moduleTranslation, privateBlockArgs, privateDecls,
       mlirPrivateVars, llvmPrivateVars, allocaIP);
diff --git a/mlir/test/Target/LLVMIR/openmp-target-simd-on_device.mlir b/mlir/test/Target/LLVMIR/openmp-target-simd-on_device.mlir
new file mode 100644
index 00000000000000..0ce90578ea9d62
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-target-simd-on_device.mlir
@@ -0,0 +1,34 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+module attributes {omp.is_target_device = true} {
+  omp.private {type = private} @simd_privatizer : !llvm.ptr alloc {
+  ^bb0(%arg0: !llvm.ptr):
+    omp.yield(%arg0 : !llvm.ptr)
+  }
+
+  llvm.func @test_target_simd() {
+    omp.target {
+      %5 = llvm.mlir.constant(1 : i32) : i32
+      %x = llvm.alloca %5 x i32 {bindc_name = "x"} : (i32) -> !llvm.ptr
+      omp.simd private(@simd_privatizer %x -> %arg1 : !llvm.ptr) {
+        omp.loop_nest (%arg2) : i32 = (%5) to (%5) inclusive step (%5) {
+          omp.yield
+        }
+      }
+      omp.terminator
+    }
+    llvm.return
+  }
+
+}
+
+// CHECK-LABEL: define {{.*}} @__omp_offloading_{{.*}}_test_target_simd_{{.*}}
+
+// CHECK:         %[[INT:.*]] = alloca i32, align 4
+// CHECK:         br label %[[LATE_ALLOC_BB:.*]]
+
+// CHECK:       [[LATE_ALLOC_BB]]:
+// CHECK:         br label %[[AFTER_ALLOC_BB:.*]]
+
+// CHECK:       [[AFTER_ALLOC_BB]]:
+// CHECK:         br i1 %{{.*}}, label %{{.*}}, label %{{.*}}

@ergawy
Copy link
Member Author

ergawy commented Jan 15, 2025

Seems like the failed simd tests now pass according to @ajarmusch. Can you guys please ✅ if you do not have any objections to the PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants