Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AMDGPU][NewPM] Port SILowerControlFlow pass into NPM. #123045

Merged
merged 2 commits into from
Jan 16, 2025

Conversation

cdevadas
Copy link
Collaborator

No description provided.

Copy link
Collaborator Author

cdevadas commented Jan 15, 2025

@cdevadas cdevadas marked this pull request as ready for review January 15, 2025 11:49
@llvmbot
Copy link
Member

llvmbot commented Jan 15, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Christudasan Devadasan (cdevadas)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/123045.diff

8 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPU.h (+2-2)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def (+1)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp (+4-3)
  • (modified) llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp (+48-15)
  • (added) llvm/lib/Target/AMDGPU/SILowerControlFlow.h (+22)
  • (modified) llvm/test/CodeGen/AMDGPU/collapse-endcf.mir (+1)
  • (modified) llvm/test/CodeGen/AMDGPU/lower-control-flow-live-intervals.mir (+1)
  • (modified) llvm/test/CodeGen/AMDGPU/lower-control-flow-other-terminators.mir (+1)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.h b/llvm/lib/Target/AMDGPU/AMDGPU.h
index 400c5f219cc702..89356df39724a4 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.h
@@ -207,8 +207,8 @@ extern char &SILoadStoreOptimizerLegacyID;
 void initializeSIWholeQuadModePass(PassRegistry &);
 extern char &SIWholeQuadModeID;
 
-void initializeSILowerControlFlowPass(PassRegistry &);
-extern char &SILowerControlFlowID;
+void initializeSILowerControlFlowLegacyPass(PassRegistry &);
+extern char &SILowerControlFlowLegacyID;
 
 void initializeSIPreEmitPeepholePass(PassRegistry &);
 extern char &SIPreEmitPeepholeID;
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
index 6f322074ba74cc..fbcf83e2fdd60b 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPassRegistry.def
@@ -102,6 +102,7 @@ MACHINE_FUNCTION_PASS("si-i1-copies", SILowerI1CopiesPass())
 MACHINE_FUNCTION_PASS("si-fold-operands", SIFoldOperandsPass());
 MACHINE_FUNCTION_PASS("gcn-dpp-combine", GCNDPPCombinePass())
 MACHINE_FUNCTION_PASS("si-load-store-opt", SILoadStoreOptimizerPass())
+MACHINE_FUNCTION_PASS("si-lower-control-flow", SILowerControlFlowPass())
 MACHINE_FUNCTION_PASS("si-lower-sgpr-spills", SILowerSGPRSpillsPass())
 MACHINE_FUNCTION_PASS("si-opt-vgpr-liverange", SIOptimizeVGPRLiveRangePass())
 MACHINE_FUNCTION_PASS("si-peephole-sdwa", SIPeepholeSDWAPass())
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
index 6d4547dbc82c3b..98268b848f5ce6 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
@@ -38,6 +38,7 @@
 #include "SIFixSGPRCopies.h"
 #include "SIFoldOperands.h"
 #include "SILoadStoreOptimizer.h"
+#include "SILowerControlFlow.h"
 #include "SILowerSGPRSpills.h"
 #include "SIMachineFunctionInfo.h"
 #include "SIMachineScheduler.h"
@@ -523,7 +524,7 @@ extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializeAMDGPUTarget() {
   initializeSIInsertWaitcntsPass(*PR);
   initializeSIModeRegisterPass(*PR);
   initializeSIWholeQuadModePass(*PR);
-  initializeSILowerControlFlowPass(*PR);
+  initializeSILowerControlFlowLegacyPass(*PR);
   initializeSIPreEmitPeepholePass(*PR);
   initializeSILateBranchLoweringPass(*PR);
   initializeSIMemoryLegalizerPass(*PR);
@@ -1459,7 +1460,7 @@ void GCNPassConfig::addFastRegAlloc() {
   // This must be run immediately after phi elimination and before
   // TwoAddressInstructions, otherwise the processing of the tied operand of
   // SI_ELSE will introduce a copy of the tied operand source after the else.
-  insertPass(&PHIEliminationID, &SILowerControlFlowID);
+  insertPass(&PHIEliminationID, &SILowerControlFlowLegacyID);
 
   insertPass(&TwoAddressInstructionPassID, &SIWholeQuadModeID);
 
@@ -1480,7 +1481,7 @@ void GCNPassConfig::addOptimizedRegAlloc() {
   // This must be run immediately after phi elimination and before
   // TwoAddressInstructions, otherwise the processing of the tied operand of
   // SI_ELSE will introduce a copy of the tied operand source after the else.
-  insertPass(&PHIEliminationID, &SILowerControlFlowID);
+  insertPass(&PHIEliminationID, &SILowerControlFlowLegacyID);
 
   if (EnableRewritePartialRegUses)
     insertPass(&RenameIndependentSubregsID, &GCNRewritePartialRegUsesID);
diff --git a/llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp b/llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
index ed5390b96ed4b5..f8878f32f829d1 100644
--- a/llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
+++ b/llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
@@ -48,6 +48,7 @@
 /// %exec = S_OR_B64 %exec, %sgpr0     // Re-enable saved exec mask bits
 //===----------------------------------------------------------------------===//
 
+#include "SILowerControlFlow.h"
 #include "AMDGPU.h"
 #include "GCNSubtarget.h"
 #include "MCTargetDesc/AMDGPUMCTargetDesc.h"
@@ -68,7 +69,7 @@ RemoveRedundantEndcf("amdgpu-remove-redundant-endcf",
 
 namespace {
 
-class SILowerControlFlow : public MachineFunctionPass {
+class SILowerControlFlow {
 private:
   const SIRegisterInfo *TRI = nullptr;
   const SIInstrInfo *TII = nullptr;
@@ -135,10 +136,18 @@ class SILowerControlFlow : public MachineFunctionPass {
   // Remove redundant SI_END_CF instructions.
   void optimizeEndCf();
 
+public:
+  SILowerControlFlow(LiveIntervals *LIS, LiveVariables *LV,
+                     MachineDominatorTree *MDT)
+      : LIS(LIS), LV(LV), MDT(MDT) {}
+  bool run(MachineFunction &MF);
+};
+
+class SILowerControlFlowLegacy : public MachineFunctionPass {
 public:
   static char ID;
 
-  SILowerControlFlow() : MachineFunctionPass(ID) {}
+  SILowerControlFlowLegacy() : MachineFunctionPass(ID) {}
 
   bool runOnMachineFunction(MachineFunction &MF) override;
 
@@ -159,10 +168,10 @@ class SILowerControlFlow : public MachineFunctionPass {
 
 } // end anonymous namespace
 
-char SILowerControlFlow::ID = 0;
+char SILowerControlFlowLegacy::ID = 0;
 
-INITIALIZE_PASS(SILowerControlFlow, DEBUG_TYPE,
-               "SI lower control flow", false, false)
+INITIALIZE_PASS(SILowerControlFlowLegacy, DEBUG_TYPE, "SI lower control flow",
+                false, false)
 
 static void setImpSCCDefDead(MachineInstr &MI, bool IsDead) {
   MachineOperand &ImpDefSCC = MI.getOperand(3);
@@ -171,7 +180,7 @@ static void setImpSCCDefDead(MachineInstr &MI, bool IsDead) {
   ImpDefSCC.setIsDead(IsDead);
 }
 
-char &llvm::SILowerControlFlowID = SILowerControlFlow::ID;
+char &llvm::SILowerControlFlowLegacyID = SILowerControlFlowLegacy::ID;
 
 bool SILowerControlFlow::hasKill(const MachineBasicBlock *Begin,
                                  const MachineBasicBlock *End) {
@@ -753,21 +762,13 @@ bool SILowerControlFlow::removeMBBifRedundant(MachineBasicBlock &MBB) {
   return true;
 }
 
-bool SILowerControlFlow::runOnMachineFunction(MachineFunction &MF) {
+bool SILowerControlFlow::run(MachineFunction &MF) {
   const GCNSubtarget &ST = MF.getSubtarget<GCNSubtarget>();
   TII = ST.getInstrInfo();
   TRI = &TII->getRegisterInfo();
   EnableOptimizeEndCf = RemoveRedundantEndcf &&
                         MF.getTarget().getOptLevel() > CodeGenOptLevel::None;
 
-  // This doesn't actually need LiveIntervals, but we can preserve them.
-  auto *LISWrapper = getAnalysisIfAvailable<LiveIntervalsWrapperPass>();
-  LIS = LISWrapper ? &LISWrapper->getLIS() : nullptr;
-  // This doesn't actually need LiveVariables, but we can preserve them.
-  auto *LVWrapper = getAnalysisIfAvailable<LiveVariablesWrapperPass>();
-  LV = LVWrapper ? &LVWrapper->getLV() : nullptr;
-  auto *MDTWrapper = getAnalysisIfAvailable<MachineDominatorTreeWrapperPass>();
-  MDT = MDTWrapper ? &MDTWrapper->getDomTree() : nullptr;
   MRI = &MF.getRegInfo();
   BoolRC = TRI->getBoolRC();
 
@@ -864,3 +865,35 @@ bool SILowerControlFlow::runOnMachineFunction(MachineFunction &MF) {
 
   return Changed;
 }
+
+bool SILowerControlFlowLegacy::runOnMachineFunction(MachineFunction &MF) {
+  // This doesn't actually need LiveIntervals, but we can preserve them.
+  auto *LISWrapper = getAnalysisIfAvailable<LiveIntervalsWrapperPass>();
+  LiveIntervals *LIS = LISWrapper ? &LISWrapper->getLIS() : nullptr;
+  // This doesn't actually need LiveVariables, but we can preserve them.
+  auto *LVWrapper = getAnalysisIfAvailable<LiveVariablesWrapperPass>();
+  LiveVariables *LV = LVWrapper ? &LVWrapper->getLV() : nullptr;
+  auto *MDTWrapper = getAnalysisIfAvailable<MachineDominatorTreeWrapperPass>();
+  MachineDominatorTree *MDT = MDTWrapper ? &MDTWrapper->getDomTree() : nullptr;
+  return SILowerControlFlow(LIS, LV, MDT).run(MF);
+}
+
+PreservedAnalyses
+SILowerControlFlowPass::run(MachineFunction &MF,
+                            MachineFunctionAnalysisManager &MFAM) {
+  LiveIntervals *LIS = MFAM.getCachedResult<LiveIntervalsAnalysis>(MF);
+  LiveVariables *LV = MFAM.getCachedResult<LiveVariablesAnalysis>(MF);
+  MachineDominatorTree *MDT =
+      MFAM.getCachedResult<MachineDominatorTreeAnalysis>(MF);
+
+  bool Changed = SILowerControlFlow(LIS, LV, MDT).run(MF);
+  if (!Changed)
+    return PreservedAnalyses::all();
+
+  auto PA = getMachineFunctionPassPreservedAnalyses();
+  PA.preserve<MachineDominatorTreeAnalysis>();
+  PA.preserve<SlotIndexesAnalysis>();
+  PA.preserve<LiveIntervalsAnalysis>();
+  PA.preserve<LiveVariablesAnalysis>();
+  return PA;
+}
diff --git a/llvm/lib/Target/AMDGPU/SILowerControlFlow.h b/llvm/lib/Target/AMDGPU/SILowerControlFlow.h
new file mode 100644
index 00000000000000..136fd49daf7240
--- /dev/null
+++ b/llvm/lib/Target/AMDGPU/SILowerControlFlow.h
@@ -0,0 +1,22 @@
+//===- SILowerSGPRSpills.h --------------------------------------*- C++- *-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIB_TARGET_AMDGPU_SILOWERCONTROLFLOW_H
+#define LLVM_LIB_TARGET_AMDGPU_SILOWERCONTROLFLOW_H
+
+#include "llvm/CodeGen/MachinePassManager.h"
+
+namespace llvm {
+class SILowerControlFlowPass : public PassInfoMixin<SILowerControlFlowPass> {
+public:
+  PreservedAnalyses run(MachineFunction &MF,
+                        MachineFunctionAnalysisManager &MFAM);
+};
+} // namespace llvm
+
+#endif // LLVM_LIB_TARGET_AMDGPU_SILOWERCONTROLFLOW_H
diff --git a/llvm/test/CodeGen/AMDGPU/collapse-endcf.mir b/llvm/test/CodeGen/AMDGPU/collapse-endcf.mir
index 48ca53732ed061..b278bfca7f7a33 100644
--- a/llvm/test/CodeGen/AMDGPU/collapse-endcf.mir
+++ b/llvm/test/CodeGen/AMDGPU/collapse-endcf.mir
@@ -1,5 +1,6 @@
 # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
 # RUN: llc -mtriple=amdgcn -verify-machineinstrs -run-pass=si-lower-control-flow -amdgpu-remove-redundant-endcf %s -o - | FileCheck -check-prefix=GCN %s
+# RUN: llc -mtriple=amdgcn -passes=si-lower-control-flow -amdgpu-remove-redundant-endcf %s -o - | FileCheck -check-prefix=GCN %s
 
 # Make sure dbg_value doesn't change codeegn when collapsing end_cf
 ---
diff --git a/llvm/test/CodeGen/AMDGPU/lower-control-flow-live-intervals.mir b/llvm/test/CodeGen/AMDGPU/lower-control-flow-live-intervals.mir
index 9eeec4fa3a93d1..d156a0aef6c17e 100644
--- a/llvm/test/CodeGen/AMDGPU/lower-control-flow-live-intervals.mir
+++ b/llvm/test/CodeGen/AMDGPU/lower-control-flow-live-intervals.mir
@@ -1,5 +1,6 @@
 # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 3
 # RUN: llc -run-pass=liveintervals -run-pass=si-lower-control-flow -mtriple=amdgcn--amdpal -mcpu=gfx1030 -verify-machineinstrs -o - %s | FileCheck %s
+# RUN: llc -passes='require<live-intervals>,si-lower-control-flow' -mtriple=amdgcn--amdpal -mcpu=gfx1030 -o - %s | FileCheck %s
 
 # Check that verifier passes for the following.
 
diff --git a/llvm/test/CodeGen/AMDGPU/lower-control-flow-other-terminators.mir b/llvm/test/CodeGen/AMDGPU/lower-control-flow-other-terminators.mir
index 914cc8ae8844cb..eaf398fd517239 100644
--- a/llvm/test/CodeGen/AMDGPU/lower-control-flow-other-terminators.mir
+++ b/llvm/test/CodeGen/AMDGPU/lower-control-flow-other-terminators.mir
@@ -1,5 +1,6 @@
 # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
 # RUN: llc -mtriple=amdgcn -mcpu=fiji -verify-machineinstrs -run-pass=si-lower-control-flow -o - %s | FileCheck %s
+# RUN: llc -mtriple=amdgcn -mcpu=fiji -passes=si-lower-control-flow -o - %s | FileCheck %s
 
 # Test si-lower-control-flow insertion points when other terminator
 # instructions are present besides the control flow pseudo and a

@cdevadas
Copy link
Collaborator Author

I observed something while porting this pass.
The analysis LiveIntervals (LIS) uses the SlotIndexes (SI). There is no explicit use of SI in this pass. If we have to preserve LIS, it required us to preserve SI as well. When I initially failed to preserve SI, the following lit test went into an infinite loop (the cleanup phase hung).
llvm/test/CodeGen/AMDGPU/lower-control-flow-live-intervals.mir.
Then I realized that the Legacy pass flow preserved SI in getAnalysisUsage().
Is it expected to preserve all the dependent analyses alongside a parent analysis?

@jayfoad
Copy link
Contributor

jayfoad commented Jan 15, 2025

I observed something while porting this pass. The analysis LiveIntervals (LIS) uses the SlotIndexes (SI). There is no explicit use of SI in this pass. If we have to preserve LIS, it required us to preserve SI as well. When I initially failed to preserve SI, the following lit test went into an infinite loop (the cleanup phase hung). llvm/test/CodeGen/AMDGPU/lower-control-flow-live-intervals.mir. Then I realized that the Legacy pass flow preserved SI in getAnalysisUsage(). Is it expected to preserve all the dependent analyses alongside a parent analysis?

In the old pass manager this is kind of an unwritten rule, and if you break the rule you tend to get assertion failures in the pass manager itself. See: See https://discourse.llvm.org/t/legacypm-preserving-passes-and-addrequiredtransitive/57043

I don't know what the rules are for the new pass manager.

@cdevadas
Copy link
Collaborator Author

cdevadas commented Jan 15, 2025

I don't know what the rules are for the new pass manager.

Thanks @jayfoad. The assertion would definitely hint to the users what's amiss here. But the hang was quite baffling and difficult to track down. If the NPM can't automatically determine the dependent analyses and get them all preserved, we should at least have this assertion in place to avoid such strange behaviors.

@paperchalice
Copy link
Contributor

paperchalice commented Jan 15, 2025

In new pass manager it depends on SomeAnalysis::Result::invalidate. See https://llvm.org/docs/NewPassManager.html#implementing-analysis-invalidation
An analysis can always check related results by using PreservedAnalysisChecker in invalidate.
Currently LiveIntervals and SlotIndexes are using the default invalidate implementation.

@cdevadas
Copy link
Collaborator Author

Currently LiveIntervals and SlotIndexes are using the default invalidate implementation.

Ok, we should be able to fix it by overriding this instance in the LIS and SI. However, the initial question is still valid. Can we have meaningful error reporting when one fails to capture this dependency?

Copy link
Collaborator Author

cdevadas commented Jan 16, 2025

Merge activity

  • Jan 16, 12:30 AM EST: A user started a stack merge that includes this pull request via Graphite.
  • Jan 16, 12:34 AM EST: Graphite rebased this pull request as part of a merge.
  • Jan 16, 12:36 AM EST: A user merged this pull request with Graphite.

@cdevadas cdevadas force-pushed the users/cdevadas/minor-fixup-si-lower-cf-pass branch from a244f33 to 590e6e6 Compare January 16, 2025 05:31
Base automatically changed from users/cdevadas/minor-fixup-si-lower-cf-pass to main January 16, 2025 05:33
@cdevadas cdevadas force-pushed the users/cdevadas/port-si-lower-cf-pass-to-npm branch from af06d59 to 91f9530 Compare January 16, 2025 05:33
@cdevadas cdevadas merged commit 1797fb6 into main Jan 16, 2025
5 of 7 checks passed
@cdevadas cdevadas deleted the users/cdevadas/port-si-lower-cf-pass-to-npm branch January 16, 2025 05:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants