MLIR  19.0.0git
Mem2Reg.cpp
Go to the documentation of this file.
1 //===- Mem2Reg.cpp - Promotes memory slots into values ----------*- C++ -*-===//
2 //
3 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4 // See https://llvm.org/LICENSE.txt for license information.
5 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6 //
7 //===----------------------------------------------------------------------===//
8 
13 #include "mlir/IR/Builders.h"
14 #include "mlir/IR/Dominance.h"
15 #include "mlir/IR/PatternMatch.h"
16 #include "mlir/IR/Value.h"
19 #include "mlir/Transforms/Passes.h"
20 #include "llvm/ADT/STLExtras.h"
21 #include "llvm/Support/GenericIteratedDominanceFrontier.h"
22 
23 namespace mlir {
24 #define GEN_PASS_DEF_MEM2REG
25 #include "mlir/Transforms/Passes.h.inc"
26 } // namespace mlir
27 
28 #define DEBUG_TYPE "mem2reg"
29 
30 using namespace mlir;
31 
32 /// mem2reg
33 ///
34 /// This pass turns unnecessary uses of automatically allocated memory slots
35 /// into direct Value-based operations. For example, it will simplify storing a
36 /// constant in a memory slot to immediately load it to a direct use of that
37 /// constant. In other words, given a memory slot addressed by a non-aliased
38 /// "pointer" Value, mem2reg removes all the uses of that pointer.
39 ///
40 /// Within a block, this is done by following the chain of stores and loads of
41 /// the slot and replacing the results of loads with the values previously
42 /// stored. If a load happens before any other store, a poison value is used
43 /// instead.
44 ///
45 /// Control flow can create situations where a load could be replaced by
46 /// multiple possible stores depending on the control flow path taken. As a
47 /// result, this pass must introduce new block arguments in some blocks to
48 /// accommodate for the multiple possible definitions. Each predecessor will
49 /// populate the block argument with the definition reached at its end. With
50 /// this, the value stored can be well defined at block boundaries, allowing
51 /// the propagation of replacement through blocks.
52 ///
53 /// This pass computes this transformation in four main steps. The two first
54 /// steps are performed during an analysis phase that does not mutate IR.
55 ///
56 /// The two steps of the analysis phase are the following:
57 /// - A first step computes the list of operations that transitively use the
58 /// memory slot we would like to promote. The purpose of this phase is to
59 /// identify which uses must be removed to promote the slot, either by rewiring
60 /// the user or deleting it. Naturally, direct uses of the slot must be removed.
61 /// Sometimes additional uses must also be removed: this is notably the case
62 /// when a direct user of the slot cannot rewire its use and must delete itself,
63 /// and thus must make its users no longer use it. If any of those uses cannot
64 /// be removed by their users in any way, promotion cannot continue: this is
65 /// decided at this step.
66 /// - A second step computes the list of blocks where a block argument will be
67 /// needed ("merge points") without mutating the IR. These blocks are the blocks
68 /// leading to a definition clash between two predecessors. Such blocks happen
69 /// to be the Iterated Dominance Frontier (IDF) of the set of blocks containing
70 /// a store, as they represent the point where a clear defining dominator stops
71 /// existing. Computing this information in advance allows making sure the
72 /// terminators that will forward values are capable of doing so (inability to
73 /// do so aborts promotion at this step).
74 ///
75 /// At this point, promotion is guaranteed to happen, and the mutation phase can
76 /// begin with the following steps:
77 /// - A third step computes the reaching definition of the memory slot at each
78 /// blocking user. This is the core of the mem2reg algorithm, also known as
79 /// load-store forwarding. This analyses loads and stores and propagates which
80 /// value must be stored in the slot at each blocking user. This is achieved by
81 /// doing a depth-first walk of the dominator tree of the function. This is
82 /// sufficient because the reaching definition at the beginning of a block is
83 /// either its new block argument if it is a merge block, or the definition
84 /// reaching the end of its immediate dominator (parent in the dominator tree).
85 /// We can therefore propagate this information down the dominator tree to
86 /// proceed with renaming within blocks.
87 /// - The final fourth step uses the reaching definition to remove blocking uses
88 /// in topological order.
89 ///
90 /// For further reading, chapter three of SSA-based Compiler Design [1]
91 /// showcases SSA construction, where mem2reg is an adaptation of the same
92 /// process.
93 ///
94 /// [1]: Rastello F. & Bouchez Tichadou F., SSA-based Compiler Design (2022),
95 /// Springer.
96 
97 namespace {
98 
99 using BlockingUsesMap =
100  llvm::MapVector<Operation *, SmallPtrSet<OpOperand *, 4>>;
101 
102 /// Information computed during promotion analysis used to perform actual
103 /// promotion.
104 struct MemorySlotPromotionInfo {
105  /// Blocks for which at least two definitions of the slot values clash.
106  SmallPtrSet<Block *, 8> mergePoints;
107  /// Contains, for each operation, which uses must be eliminated by promotion.
108  /// This is a DAG structure because if an operation must eliminate some of
109  /// its uses, it is because the defining ops of the blocking uses requested
110  /// it. The defining ops therefore must also have blocking uses or be the
111  /// starting point of the blocking uses.
112  BlockingUsesMap userToBlockingUses;
113 };
114 
115 /// Computes information for basic slot promotion. This will check that direct
116 /// slot promotion can be performed, and provide the information to execute the
117 /// promotion. This does not mutate IR.
118 class MemorySlotPromotionAnalyzer {
119 public:
120  MemorySlotPromotionAnalyzer(MemorySlot slot, DominanceInfo &dominance,
121  const DataLayout &dataLayout)
122  : slot(slot), dominance(dominance), dataLayout(dataLayout) {}
123 
124  /// Computes the information for slot promotion if promotion is possible,
125  /// returns nothing otherwise.
126  std::optional<MemorySlotPromotionInfo> computeInfo();
127 
128 private:
129  /// Computes the transitive uses of the slot that block promotion. This finds
130  /// uses that would block the promotion, checks that the operation has a
131  /// solution to remove the blocking use, and potentially forwards the analysis
132  /// if the operation needs further blocking uses resolved to resolve its own
133  /// uses (typically, removing its users because it will delete itself to
134  /// resolve its own blocking uses). This will fail if one of the transitive
135  /// users cannot remove a requested use, and should prevent promotion.
136  LogicalResult computeBlockingUses(BlockingUsesMap &userToBlockingUses);
137 
138  /// Computes in which blocks the value stored in the slot is actually used,
139  /// meaning blocks leading to a load. This method uses `definingBlocks`, the
140  /// set of blocks containing a store to the slot (defining the value of the
141  /// slot).
143  computeSlotLiveIn(SmallPtrSetImpl<Block *> &definingBlocks);
144 
145  /// Computes the points in which multiple re-definitions of the slot's value
146  /// (stores) may conflict.
147  void computeMergePoints(SmallPtrSetImpl<Block *> &mergePoints);
148 
149  /// Ensures predecessors of merge points can properly provide their current
150  /// definition of the value stored in the slot to the merge point. This can
151  /// notably be an issue if the terminator used does not have the ability to
152  /// forward values through block operands.
153  bool areMergePointsUsable(SmallPtrSetImpl<Block *> &mergePoints);
154 
155  MemorySlot slot;
156  DominanceInfo &dominance;
157  const DataLayout &dataLayout;
158 };
159 
160 using BlockIndexCache = DenseMap<Region *, DenseMap<Block *, size_t>>;
161 
162 /// The MemorySlotPromoter handles the state of promoting a memory slot. It
163 /// wraps a slot and its associated allocator. This will perform the mutation of
164 /// IR.
165 class MemorySlotPromoter {
166 public:
167  MemorySlotPromoter(MemorySlot slot, PromotableAllocationOpInterface allocator,
168  OpBuilder &builder, DominanceInfo &dominance,
169  const DataLayout &dataLayout, MemorySlotPromotionInfo info,
170  const Mem2RegStatistics &statistics,
171  BlockIndexCache &blockIndexCache);
172 
173  /// Actually promotes the slot by mutating IR. Promoting a slot DOES
174  /// invalidate the MemorySlotPromotionInfo of other slots. Preparation of
175  /// promotion info should NOT be performed in batches.
176  /// Returns a promotable allocation op if a new allocator was created, nullopt
177  /// otherwise.
178  std::optional<PromotableAllocationOpInterface> promoteSlot();
179 
180 private:
181  /// Computes the reaching definition for all the operations that require
182  /// promotion. `reachingDef` is the value the slot should contain at the
183  /// beginning of the block. This method returns the reached definition at the
184  /// end of the block. This method must only be called at most once per block.
185  Value computeReachingDefInBlock(Block *block, Value reachingDef);
186 
187  /// Computes the reaching definition for all the operations that require
188  /// promotion. `reachingDef` corresponds to the initial value the
189  /// slot will contain before any write, typically a poison value.
190  /// This method must only be called at most once per region.
191  void computeReachingDefInRegion(Region *region, Value reachingDef);
192 
193  /// Removes the blocking uses of the slot, in topological order.
194  void removeBlockingUses();
195 
196  /// Lazily-constructed default value representing the content of the slot when
197  /// no store has been executed. This function may mutate IR.
198  Value getOrCreateDefaultValue();
199 
200  MemorySlot slot;
201  PromotableAllocationOpInterface allocator;
202  OpBuilder &builder;
203  /// Potentially non-initialized default value. Use `getOrCreateDefaultValue`
204  /// to initialize it on demand.
205  Value defaultValue;
206  /// Contains the reaching definition at this operation. Reaching definitions
207  /// are only computed for promotable memory operations with blocking uses.
210  DominanceInfo &dominance;
211  const DataLayout &dataLayout;
212  MemorySlotPromotionInfo info;
213  const Mem2RegStatistics &statistics;
214 
215  /// Shared cache of block indices of specific regions.
216  BlockIndexCache &blockIndexCache;
217 };
218 
219 } // namespace
220 
221 MemorySlotPromoter::MemorySlotPromoter(
222  MemorySlot slot, PromotableAllocationOpInterface allocator,
223  OpBuilder &builder, DominanceInfo &dominance, const DataLayout &dataLayout,
224  MemorySlotPromotionInfo info, const Mem2RegStatistics &statistics,
225  BlockIndexCache &blockIndexCache)
226  : slot(slot), allocator(allocator), builder(builder), dominance(dominance),
227  dataLayout(dataLayout), info(std::move(info)), statistics(statistics),
228  blockIndexCache(blockIndexCache) {
229 #ifndef NDEBUG
230  auto isResultOrNewBlockArgument = [&]() {
231  if (BlockArgument arg = dyn_cast<BlockArgument>(slot.ptr))
232  return arg.getOwner()->getParentOp() == allocator;
233  return slot.ptr.getDefiningOp() == allocator;
234  };
235 
236  assert(isResultOrNewBlockArgument() &&
237  "a slot must be a result of the allocator or an argument of the child "
238  "regions of the allocator");
239 #endif // NDEBUG
240 }
241 
242 Value MemorySlotPromoter::getOrCreateDefaultValue() {
243  if (defaultValue)
244  return defaultValue;
245 
246  OpBuilder::InsertionGuard guard(builder);
248  return defaultValue = allocator.getDefaultValue(slot, builder);
249 }
250 
251 LogicalResult MemorySlotPromotionAnalyzer::computeBlockingUses(
252  BlockingUsesMap &userToBlockingUses) {
253  // The promotion of an operation may require the promotion of further
254  // operations (typically, removing operations that use an operation that must
255  // delete itself). We thus need to start from the use of the slot pointer and
256  // propagate further requests through the forward slice.
257 
258  // First insert that all immediate users of the slot pointer must no longer
259  // use it.
260  for (OpOperand &use : slot.ptr.getUses()) {
261  SmallPtrSet<OpOperand *, 4> &blockingUses =
262  userToBlockingUses[use.getOwner()];
263  blockingUses.insert(&use);
264  }
265 
266  // Then, propagate the requirements for the removal of uses. The
267  // topologically-sorted forward slice allows for all blocking uses of an
268  // operation to have been computed before it is reached. Operations are
269  // traversed in topological order of their uses, starting from the slot
270  // pointer.
271  SetVector<Operation *> forwardSlice;
272  mlir::getForwardSlice(slot.ptr, &forwardSlice);
273  for (Operation *user : forwardSlice) {
274  // If the next operation has no blocking uses, everything is fine.
275  if (!userToBlockingUses.contains(user))
276  continue;
277 
278  SmallPtrSet<OpOperand *, 4> &blockingUses = userToBlockingUses[user];
279 
280  SmallVector<OpOperand *> newBlockingUses;
281  // If the operation decides it cannot deal with removing the blocking uses,
282  // promotion must fail.
283  if (auto promotable = dyn_cast<PromotableOpInterface>(user)) {
284  if (!promotable.canUsesBeRemoved(blockingUses, newBlockingUses,
285  dataLayout))
286  return failure();
287  } else if (auto promotable = dyn_cast<PromotableMemOpInterface>(user)) {
288  if (!promotable.canUsesBeRemoved(slot, blockingUses, newBlockingUses,
289  dataLayout))
290  return failure();
291  } else {
292  // An operation that has blocking uses must be promoted. If it is not
293  // promotable, promotion must fail.
294  return failure();
295  }
296 
297  // Then, register any new blocking uses for coming operations.
298  for (OpOperand *blockingUse : newBlockingUses) {
299  assert(llvm::is_contained(user->getResults(), blockingUse->get()));
300 
301  SmallPtrSetImpl<OpOperand *> &newUserBlockingUseSet =
302  userToBlockingUses[blockingUse->getOwner()];
303  newUserBlockingUseSet.insert(blockingUse);
304  }
305  }
306 
307  // Because this pass currently only supports analysing the parent region of
308  // the slot pointer, if a promotable memory op that needs promotion is outside
309  // of this region, promotion must fail because it will be impossible to
310  // provide a valid `reachingDef` for it.
311  for (auto &[toPromote, _] : userToBlockingUses)
312  if (isa<PromotableMemOpInterface>(toPromote) &&
313  toPromote->getParentRegion() != slot.ptr.getParentRegion())
314  return failure();
315 
316  return success();
317 }
318 
319 SmallPtrSet<Block *, 16> MemorySlotPromotionAnalyzer::computeSlotLiveIn(
320  SmallPtrSetImpl<Block *> &definingBlocks) {
322 
323  // The worklist contains blocks in which it is known that the slot value is
324  // live-in. The further blocks where this value is live-in will be inferred
325  // from these.
326  SmallVector<Block *> liveInWorkList;
327 
328  // Blocks with a load before any other store to the slot are the starting
329  // points of the analysis. The slot value is definitely live-in in those
330  // blocks.
331  SmallPtrSet<Block *, 16> visited;
332  for (Operation *user : slot.ptr.getUsers()) {
333  if (visited.contains(user->getBlock()))
334  continue;
335  visited.insert(user->getBlock());
336 
337  for (Operation &op : user->getBlock()->getOperations()) {
338  if (auto memOp = dyn_cast<PromotableMemOpInterface>(op)) {
339  // If this operation loads the slot, it is loading from it before
340  // ever writing to it, so the value is live-in in this block.
341  if (memOp.loadsFrom(slot)) {
342  liveInWorkList.push_back(user->getBlock());
343  break;
344  }
345 
346  // If we store to the slot, further loads will see that value.
347  // Because we did not meet any load before, the value is not live-in.
348  if (memOp.storesTo(slot))
349  break;
350  }
351  }
352  }
353 
354  // The information is then propagated to the predecessors until a def site
355  // (store) is found.
356  while (!liveInWorkList.empty()) {
357  Block *liveInBlock = liveInWorkList.pop_back_val();
358 
359  if (!liveIn.insert(liveInBlock).second)
360  continue;
361 
362  // If a predecessor is a defining block, either:
363  // - It has a load before its first store, in which case it is live-in but
364  // has already been processed in the initialisation step.
365  // - It has a store before any load, in which case it is not live-in.
366  // We can thus at this stage insert to the worklist only predecessors that
367  // are not defining blocks.
368  for (Block *pred : liveInBlock->getPredecessors())
369  if (!definingBlocks.contains(pred))
370  liveInWorkList.push_back(pred);
371  }
372 
373  return liveIn;
374 }
375 
376 using IDFCalculator = llvm::IDFCalculatorBase<Block, false>;
377 void MemorySlotPromotionAnalyzer::computeMergePoints(
378  SmallPtrSetImpl<Block *> &mergePoints) {
379  if (slot.ptr.getParentRegion()->hasOneBlock())
380  return;
381 
382  IDFCalculator idfCalculator(dominance.getDomTree(slot.ptr.getParentRegion()));
383 
384  SmallPtrSet<Block *, 16> definingBlocks;
385  for (Operation *user : slot.ptr.getUsers())
386  if (auto storeOp = dyn_cast<PromotableMemOpInterface>(user))
387  if (storeOp.storesTo(slot))
388  definingBlocks.insert(user->getBlock());
389 
390  idfCalculator.setDefiningBlocks(definingBlocks);
391 
392  SmallPtrSet<Block *, 16> liveIn = computeSlotLiveIn(definingBlocks);
393  idfCalculator.setLiveInBlocks(liveIn);
394 
395  SmallVector<Block *> mergePointsVec;
396  idfCalculator.calculate(mergePointsVec);
397 
398  mergePoints.insert(mergePointsVec.begin(), mergePointsVec.end());
399 }
400 
401 bool MemorySlotPromotionAnalyzer::areMergePointsUsable(
402  SmallPtrSetImpl<Block *> &mergePoints) {
403  for (Block *mergePoint : mergePoints)
404  for (Block *pred : mergePoint->getPredecessors())
405  if (!isa<BranchOpInterface>(pred->getTerminator()))
406  return false;
407 
408  return true;
409 }
410 
411 std::optional<MemorySlotPromotionInfo>
412 MemorySlotPromotionAnalyzer::computeInfo() {
413  MemorySlotPromotionInfo info;
414 
415  // First, find the set of operations that will need to be changed for the
416  // promotion to happen. These operations need to resolve some of their uses,
417  // either by rewiring them or simply deleting themselves. If any of them
418  // cannot find a way to resolve their blocking uses, we abort the promotion.
419  if (failed(computeBlockingUses(info.userToBlockingUses)))
420  return {};
421 
422  // Then, compute blocks in which two or more definitions of the allocated
423  // variable may conflict. These blocks will need a new block argument to
424  // accommodate this.
425  computeMergePoints(info.mergePoints);
426 
427  // The slot can be promoted if the block arguments to be created can
428  // actually be populated with values, which may not be possible depending
429  // on their predecessors.
430  if (!areMergePointsUsable(info.mergePoints))
431  return {};
432 
433  return info;
434 }
435 
436 Value MemorySlotPromoter::computeReachingDefInBlock(Block *block,
437  Value reachingDef) {
438  SmallVector<Operation *> blockOps;
439  for (Operation &op : block->getOperations())
440  blockOps.push_back(&op);
441  for (Operation *op : blockOps) {
442  if (auto memOp = dyn_cast<PromotableMemOpInterface>(op)) {
443  if (info.userToBlockingUses.contains(memOp))
444  reachingDefs.insert({memOp, reachingDef});
445 
446  if (memOp.storesTo(slot)) {
447  builder.setInsertionPointAfter(memOp);
448  Value stored = memOp.getStored(slot, builder, reachingDef, dataLayout);
449  assert(stored && "a memory operation storing to a slot must provide a "
450  "new definition of the slot");
451  reachingDef = stored;
452  replacedValuesMap[memOp] = stored;
453  }
454  }
455  }
456 
457  return reachingDef;
458 }
459 
460 void MemorySlotPromoter::computeReachingDefInRegion(Region *region,
461  Value reachingDef) {
462  assert(reachingDef && "expected an initial reaching def to be provided");
463  if (region->hasOneBlock()) {
464  computeReachingDefInBlock(&region->front(), reachingDef);
465  return;
466  }
467 
468  struct DfsJob {
469  llvm::DomTreeNodeBase<Block> *block;
470  Value reachingDef;
471  };
472 
473  SmallVector<DfsJob> dfsStack;
474 
475  auto &domTree = dominance.getDomTree(slot.ptr.getParentRegion());
476 
477  dfsStack.emplace_back<DfsJob>(
478  {domTree.getNode(&region->front()), reachingDef});
479 
480  while (!dfsStack.empty()) {
481  DfsJob job = dfsStack.pop_back_val();
482  Block *block = job.block->getBlock();
483 
484  if (info.mergePoints.contains(block)) {
485  BlockArgument blockArgument =
486  block->addArgument(slot.elemType, slot.ptr.getLoc());
487  builder.setInsertionPointToStart(block);
488  allocator.handleBlockArgument(slot, blockArgument, builder);
489  job.reachingDef = blockArgument;
490 
491  if (statistics.newBlockArgumentAmount)
492  (*statistics.newBlockArgumentAmount)++;
493  }
494 
495  job.reachingDef = computeReachingDefInBlock(block, job.reachingDef);
496  assert(job.reachingDef);
497 
498  if (auto terminator = dyn_cast<BranchOpInterface>(block->getTerminator())) {
499  for (BlockOperand &blockOperand : terminator->getBlockOperands()) {
500  if (info.mergePoints.contains(blockOperand.get())) {
501  terminator.getSuccessorOperands(blockOperand.getOperandNumber())
502  .append(job.reachingDef);
503  }
504  }
505  }
506 
507  for (auto *child : job.block->children())
508  dfsStack.emplace_back<DfsJob>({child, job.reachingDef});
509  }
510 }
511 
512 /// Gets or creates a block index mapping for `region`.
513 static const DenseMap<Block *, size_t> &
514 getOrCreateBlockIndices(BlockIndexCache &blockIndexCache, Region *region) {
515  auto [it, inserted] = blockIndexCache.try_emplace(region);
516  if (!inserted)
517  return it->second;
518 
519  DenseMap<Block *, size_t> &blockIndices = it->second;
520  SetVector<Block *> topologicalOrder = getBlocksSortedByDominance(*region);
521  for (auto [index, block] : llvm::enumerate(topologicalOrder))
522  blockIndices[block] = index;
523  return blockIndices;
524 }
525 
526 /// Sorts `ops` according to dominance. Relies on the topological order of basic
527 /// blocks to get a deterministic ordering. Uses `blockIndexCache` to avoid the
528 /// potentially expensive recomputation of a block index map.
530  BlockIndexCache &blockIndexCache) {
531  // Produce a topological block order and construct a map to lookup the indices
532  // of blocks.
533  const DenseMap<Block *, size_t> &topoBlockIndices =
534  getOrCreateBlockIndices(blockIndexCache, &region);
535 
536  // Combining the topological order of the basic blocks together with block
537  // internal operation order guarantees a deterministic, dominance respecting
538  // order.
539  llvm::sort(ops, [&](Operation *lhs, Operation *rhs) {
540  size_t lhsBlockIndex = topoBlockIndices.at(lhs->getBlock());
541  size_t rhsBlockIndex = topoBlockIndices.at(rhs->getBlock());
542  if (lhsBlockIndex == rhsBlockIndex)
543  return lhs->isBeforeInBlock(rhs);
544  return lhsBlockIndex < rhsBlockIndex;
545  });
546 }
547 
548 void MemorySlotPromoter::removeBlockingUses() {
549  llvm::SmallVector<Operation *> usersToRemoveUses(
550  llvm::make_first_range(info.userToBlockingUses));
551 
552  // Sort according to dominance.
553  dominanceSort(usersToRemoveUses, *slot.ptr.getParentBlock()->getParent(),
554  blockIndexCache);
555 
557  // List of all replaced values in the slot.
559  // Ops to visit with the `visitReplacedValues` method.
561  for (Operation *toPromote : llvm::reverse(usersToRemoveUses)) {
562  if (auto toPromoteMemOp = dyn_cast<PromotableMemOpInterface>(toPromote)) {
563  Value reachingDef = reachingDefs.lookup(toPromoteMemOp);
564  // If no reaching definition is known, this use is outside the reach of
565  // the slot. The default value should thus be used.
566  if (!reachingDef)
567  reachingDef = getOrCreateDefaultValue();
568 
569  builder.setInsertionPointAfter(toPromote);
570  if (toPromoteMemOp.removeBlockingUses(
571  slot, info.userToBlockingUses[toPromote], builder, reachingDef,
572  dataLayout) == DeletionKind::Delete)
573  toErase.push_back(toPromote);
574  if (toPromoteMemOp.storesTo(slot))
575  if (Value replacedValue = replacedValuesMap[toPromoteMemOp])
576  replacedValuesList.push_back({toPromoteMemOp, replacedValue});
577  continue;
578  }
579 
580  auto toPromoteBasic = cast<PromotableOpInterface>(toPromote);
581  builder.setInsertionPointAfter(toPromote);
582  if (toPromoteBasic.removeBlockingUses(info.userToBlockingUses[toPromote],
583  builder) == DeletionKind::Delete)
584  toErase.push_back(toPromote);
585  if (toPromoteBasic.requiresReplacedValues())
586  toVisit.push_back(toPromoteBasic);
587  }
588  for (PromotableOpInterface op : toVisit) {
589  builder.setInsertionPointAfter(op);
590  op.visitReplacedValues(replacedValuesList, builder);
591  }
592 
593  for (Operation *toEraseOp : toErase)
594  toEraseOp->erase();
595 
596  assert(slot.ptr.use_empty() &&
597  "after promotion, the slot pointer should not be used anymore");
598 }
599 
600 std::optional<PromotableAllocationOpInterface>
601 MemorySlotPromoter::promoteSlot() {
602  computeReachingDefInRegion(slot.ptr.getParentRegion(),
603  getOrCreateDefaultValue());
604 
605  // Now that reaching definitions are known, remove all users.
606  removeBlockingUses();
607 
608  // Update terminators in dead branches to forward default if they are
609  // succeeded by a merge points.
610  for (Block *mergePoint : info.mergePoints) {
611  for (BlockOperand &use : mergePoint->getUses()) {
612  auto user = cast<BranchOpInterface>(use.getOwner());
613  SuccessorOperands succOperands =
614  user.getSuccessorOperands(use.getOperandNumber());
615  assert(succOperands.size() == mergePoint->getNumArguments() ||
616  succOperands.size() + 1 == mergePoint->getNumArguments());
617  if (succOperands.size() + 1 == mergePoint->getNumArguments())
618  succOperands.append(getOrCreateDefaultValue());
619  }
620  }
621 
622  LLVM_DEBUG(llvm::dbgs() << "[mem2reg] Promoted memory slot: " << slot.ptr
623  << "\n");
624 
625  if (statistics.promotedAmount)
626  (*statistics.promotedAmount)++;
627 
628  return allocator.handlePromotionComplete(slot, defaultValue, builder);
629 }
630 
633  const DataLayout &dataLayout, DominanceInfo &dominance,
634  Mem2RegStatistics statistics) {
635  bool promotedAny = false;
636 
637  // A cache that stores deterministic block indices which are used to determine
638  // a valid operation modification order. The block index maps are computed
639  // lazily and cached to avoid expensive recomputation.
640  BlockIndexCache blockIndexCache;
641 
642  SmallVector<PromotableAllocationOpInterface> workList(allocators.begin(),
643  allocators.end());
644 
646  newWorkList.reserve(workList.size());
647  while (true) {
648  bool changesInThisRound = false;
649  for (PromotableAllocationOpInterface allocator : workList) {
650  bool changedAllocator = false;
651  for (MemorySlot slot : allocator.getPromotableSlots()) {
652  if (slot.ptr.use_empty())
653  continue;
654 
655  MemorySlotPromotionAnalyzer analyzer(slot, dominance, dataLayout);
656  std::optional<MemorySlotPromotionInfo> info = analyzer.computeInfo();
657  if (info) {
658  std::optional<PromotableAllocationOpInterface> newAllocator =
659  MemorySlotPromoter(slot, allocator, builder, dominance,
660  dataLayout, std::move(*info), statistics,
661  blockIndexCache)
662  .promoteSlot();
663  changedAllocator = true;
664  // Add newly created allocators to the worklist for further
665  // processing.
666  if (newAllocator)
667  newWorkList.push_back(*newAllocator);
668 
669  // A break is required, since promoting a slot may invalidate the
670  // remaining slots of an allocator.
671  break;
672  }
673  }
674  if (!changedAllocator)
675  newWorkList.push_back(allocator);
676  changesInThisRound |= changedAllocator;
677  }
678  if (!changesInThisRound)
679  break;
680  promotedAny = true;
681 
682  // Swap the vector's backing memory and clear the entries in newWorkList
683  // afterwards. This ensures that additional heap allocations can be avoided.
684  workList.swap(newWorkList);
685  newWorkList.clear();
686  }
687 
688  return success(promotedAny);
689 }
690 
691 namespace {
692 
693 struct Mem2Reg : impl::Mem2RegBase<Mem2Reg> {
694  using impl::Mem2RegBase<Mem2Reg>::Mem2RegBase;
695 
696  void runOnOperation() override {
697  Operation *scopeOp = getOperation();
698 
699  Mem2RegStatistics statistics{&promotedAmount, &newBlockArgumentAmount};
700 
701  bool changed = false;
702 
703  auto &dataLayoutAnalysis = getAnalysis<DataLayoutAnalysis>();
704  const DataLayout &dataLayout = dataLayoutAnalysis.getAtOrAbove(scopeOp);
705  auto &dominance = getAnalysis<DominanceInfo>();
706 
707  for (Region &region : scopeOp->getRegions()) {
708  if (region.getBlocks().empty())
709  continue;
710 
711  OpBuilder builder(&region.front(), region.front().begin());
712 
714  // Build a list of allocators to attempt to promote the slots of.
715  region.walk([&](PromotableAllocationOpInterface allocator) {
716  allocators.emplace_back(allocator);
717  });
718 
719  // Attempt promoting as many of the slots as possible.
720  if (succeeded(tryToPromoteMemorySlots(allocators, builder, dataLayout,
721  dominance, statistics)))
722  changed = true;
723  }
724  if (!changed)
725  markAllAnalysesPreserved();
726  }
727 };
728 
729 } // namespace
static void dominanceSort(SmallVector< Operation * > &ops, Region &region, BlockIndexCache &blockIndexCache)
Sorts ops according to dominance.
Definition: Mem2Reg.cpp:529
static const DenseMap< Block *, size_t > & getOrCreateBlockIndices(BlockIndexCache &blockIndexCache, Region *region)
Gets or creates a block index mapping for region.
Definition: Mem2Reg.cpp:514
llvm::IDFCalculatorBase< Block, false > IDFCalculator
Definition: Mem2Reg.cpp:376
This class represents an argument of a Block.
Definition: Value.h:319
A block operand represents an operand that holds a reference to a Block, e.g.
Definition: BlockSupport.h:30
Block represents an ordered list of Operations.
Definition: Block.h:31
unsigned getNumArguments()
Definition: Block.h:126
Region * getParent() const
Provide a 'getParent' method for ilist_node_with_parent methods.
Definition: Block.cpp:26
Operation * getTerminator()
Get the terminator operation of this block.
Definition: Block.cpp:243
iterator_range< pred_iterator > getPredecessors()
Definition: Block.h:235
BlockArgument addArgument(Type type, Location loc)
Add one value to the argument list.
Definition: Block.cpp:152
OpListType & getOperations()
Definition: Block.h:135
iterator begin()
Definition: Block.h:141
The main mechanism for performing data layout queries.
A class for computing basic dominance information.
Definition: Dominance.h:136
use_range getUses() const
Returns a range of all uses, which is useful for iterating over all uses.
Definition: UseDefLists.h:253
RAII guard to reset the insertion point of the builder when destroyed.
Definition: Builders.h:350
This class helps build Operations.
Definition: Builders.h:209
void setInsertionPointToStart(Block *block)
Sets the insertion point to the start of the specified block.
Definition: Builders.h:433
void setInsertionPointAfter(Operation *op)
Sets the insertion point to the node after the specified operation, which will cause subsequent inser...
Definition: Builders.h:414
This class represents an operand of an operation.
Definition: Value.h:267
Operation is the basic unit of execution within MLIR.
Definition: Operation.h:88
bool isBeforeInBlock(Operation *other)
Given an operation 'other' that is within the same parent block, return whether the current operation...
Definition: Operation.cpp:386
Block * getBlock()
Returns the operation block that contains this operation.
Definition: Operation.h:213
MutableArrayRef< Region > getRegions()
Returns the regions held by this operation.
Definition: Operation.h:672
void erase()
Remove this operation from its parent block and delete it.
Definition: Operation.cpp:539
This class contains a list of basic blocks and a link to the parent operation it is attached to.
Definition: Region.h:26
BlockListType & getBlocks()
Definition: Region.h:45
Block & front()
Definition: Region.h:65
bool hasOneBlock()
Return true if this region has exactly one block.
Definition: Region.h:68
RetT walk(FnT &&callback)
Walk all nested operations, blocks or regions (including this region), depending on the type of callb...
Definition: Region.h:285
This class models how operands are forwarded to block arguments in control flow.
void append(ValueRange valueRange)
Add new operands that are forwarded to the successor.
unsigned size() const
Returns the amount of operands passed to the successor.
This class represents an instance of an SSA value in the MLIR system, representing a computable value...
Definition: Value.h:96
bool use_empty() const
Returns true if this value has no uses.
Definition: Value.h:218
use_range getUses() const
Returns a range of all uses, which is useful for iterating over all uses.
Definition: Value.h:212
Block * getParentBlock()
Return the Block in which this Value is defined.
Definition: Value.cpp:48
user_range getUsers() const
Definition: Value.h:228
Location getLoc() const
Return the location of this value.
Definition: Value.cpp:26
Operation * getDefiningOp() const
If this value is the result of an operation, return the operation that defines it.
Definition: Value.cpp:20
Region * getParentRegion()
Return the Region in which this Value is defined.
Definition: Value.cpp:41
DomTree & getDomTree(Region *region) const
Definition: Dominance.h:101
constexpr void enumerate(std::tuple< Tys... > &tuple, CallbackT &&callback)
Definition: Matchers.h:285
Include the generated interface declarations.
LogicalResult failure(bool isFailure=true)
Utility function to generate a LogicalResult.
Definition: LogicalResult.h:62
SetVector< Block * > getBlocksSortedByDominance(Region &region)
Gets a list of blocks that is sorted according to dominance.
bool succeeded(LogicalResult result)
Utility function that returns true if the provided LogicalResult corresponds to a success value.
Definition: LogicalResult.h:68
LogicalResult success(bool isSuccess=true)
Utility function to generate a LogicalResult.
Definition: LogicalResult.h:56
LogicalResult tryToPromoteMemorySlots(ArrayRef< PromotableAllocationOpInterface > allocators, OpBuilder &builder, const DataLayout &dataLayout, DominanceInfo &dominance, Mem2RegStatistics statistics={})
Attempts to promote the memory slots of the provided allocators.
Definition: Mem2Reg.cpp:631
bool failed(LogicalResult result)
Utility function that returns true if the provided LogicalResult corresponds to a failure value.
Definition: LogicalResult.h:72
void getForwardSlice(Operation *op, SetVector< Operation * > *forwardSlice, const ForwardSliceOptions &options={})
Fills forwardSlice with the computed forward slice (i.e.
This class represents an efficient way to signal success or failure.
Definition: LogicalResult.h:26
Statistics collected while applying mem2reg.
Definition: Mem2Reg.h:18
llvm::Statistic * promotedAmount
Total amount of memory slots promoted.
Definition: Mem2Reg.h:20
llvm::Statistic * newBlockArgumentAmount
Total amount of new block arguments inserted in blocks.
Definition: Mem2Reg.h:22
Represents a slot in memory.
Value ptr
Pointer to the memory slot, used by operations to refer to it.
Type elemType
Type of the value contained in the slot.