//===- Mem2Reg.cpp - Promotes memory slots into values ----------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#include "mlir/Transforms/Mem2Reg.h"
#include "mlir/Analysis/DataLayoutAnalysis.h"
#include "mlir/Analysis/SliceAnalysis.h"
#include "mlir/Analysis/TopologicalSortUtils.h"
#include "mlir/IR/Builders.h"
#include "mlir/IR/Dominance.h"
#include "mlir/IR/PatternMatch.h"
#include "mlir/IR/RegionKindInterface.h"
#include "mlir/IR/Value.h"
#include "mlir/Interfaces/ControlFlowInterfaces.h"
#include "mlir/Interfaces/MemorySlotInterfaces.h"
#include "mlir/Transforms/Passes.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/Support/DebugLog.h"
#include "llvm/Support/GenericIteratedDominanceFrontier.h"

namespace mlir {
#define GEN_PASS_DEF_MEM2REG
#include "mlir/Transforms/Passes.h.inc"
} // namespace mlir

#define DEBUG_TYPE "mem2reg"

using namespace mlir;

/// mem2reg
///
/// This pass turns unnecessary uses of automatically allocated memory slots
/// into direct Value-based operations. For example, it simplifies storing a
/// constant in a memory slot and immediately loading it back into a direct use
/// of that constant. In other words, given a memory slot addressed by a
/// non-aliased "pointer" Value, mem2reg removes all the uses of that pointer.
///
/// Within a block, this is done by following the chain of stores and loads of
/// the slot and replacing the results of loads with the values previously
/// stored. If a load happens before any store, a poison value is used instead.
///
/// Control flow can create situations where a load could be replaced by
/// multiple possible stores depending on the control flow path taken. As a
/// result, this pass must introduce new block arguments in some blocks to
/// accommodate the multiple possible definitions. Each predecessor will
/// populate the block argument with the definition reached at its end. With
/// this, the value stored can be well defined at block boundaries, allowing
/// the propagation of replacement values across blocks.
///
/// This pass computes this transformation in four main steps. The first two
/// steps are performed during an analysis phase that does not mutate IR.
///
/// The two steps of the analysis phase are the following:
/// - A first step computes the list of operations that transitively use the
/// memory slot we would like to promote. The purpose of this phase is to
/// identify which uses must be removed to promote the slot, either by rewiring
/// the user or deleting it. Naturally, direct uses of the slot must be removed.
/// Sometimes additional uses must also be removed: this is notably the case
/// when a direct user of the slot cannot rewire its use and must delete itself,
/// and thus must make its users no longer use it. If any of those uses cannot
/// be removed by their users in any way, promotion cannot continue: this is
/// decided at this step.
/// - A second step computes the list of blocks where a block argument will be
/// needed ("merge points") without mutating the IR. These blocks are the blocks
/// leading to a definition clash between two predecessors. Such blocks happen
/// to be the Iterated Dominance Frontier (IDF) of the set of blocks containing
/// a store, as they represent the points where a clear defining dominator stops
/// existing. Computing this information in advance makes it possible to check
/// that the terminators that will forward values are capable of doing so
/// (inability to do so aborts promotion at this step).
///
/// At this point, promotion is guaranteed to happen, and the mutation phase can
/// begin with the following steps:
/// - A third step computes the reaching definition of the memory slot at each
/// blocking user. This is the core of the mem2reg algorithm, also known as
/// load-store forwarding. This analyses loads and stores and propagates which
/// value must be stored in the slot at each blocking user. This is achieved by
/// doing a depth-first walk of the dominator tree of the function. This is
/// sufficient because the reaching definition at the beginning of a block is
/// either its new block argument if it is a merge block, or the definition
/// reaching the end of its immediate dominator (parent in the dominator tree).
/// We can therefore propagate this information down the dominator tree to
/// proceed with renaming within blocks.
/// - The fourth and final step uses the reaching definitions to remove blocking
/// uses in topological order.
///
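/// As an illustrative sketch (hedged pseudo-IR, not tied to any particular
/// dialect), consider a slot written on only one path before being read at a
/// merge point:
///
///   %slot = alloc
///   store %a, %slot
///   cond_br %cond, ^bb1, ^bb2
/// ^bb1:
///   store %b, %slot
///   br ^bb2
/// ^bb2:
///   %v = load %slot
///
/// The load's reaching definition differs per predecessor, so ^bb2 is a merge
/// point. Promotion adds a block argument there, makes each predecessor
/// forward its reaching definition, replaces the load, and deletes the slot:
///
///   cond_br %cond, ^bb1, ^bb2(%a)
/// ^bb1:
///   br ^bb2(%b)
/// ^bb2(%v):
///   ...
///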
/// For further reading, chapter three of SSA-based Compiler Design [1]
/// showcases SSA construction, of which mem2reg is an adaptation.
///
/// [1]: Rastello F. & Bouchez Tichadou F., SSA-based Compiler Design (2022),
/// Springer.

namespace {

using BlockingUsesMap =
    llvm::MapVector<Operation *, SmallPtrSet<OpOperand *, 4>>;

/// Information computed during promotion analysis used to perform actual
/// promotion.
struct MemorySlotPromotionInfo {
  /// Blocks for which at least two definitions of the slot's value clash.
  SmallPtrSet<Block *, 8> mergePoints;
  /// Contains, for each operation, which uses must be eliminated by promotion.
  /// This is a DAG structure because if an operation must eliminate some of
  /// its uses, it is because the defining ops of the blocking uses requested
  /// it. The defining ops therefore must also have blocking uses or be the
  /// starting point of the blocking uses.
  BlockingUsesMap userToBlockingUses;
};

/// Computes information for basic slot promotion. This will check that direct
/// slot promotion can be performed, and provide the information to execute the
/// promotion. This does not mutate IR.
class MemorySlotPromotionAnalyzer {
public:
  MemorySlotPromotionAnalyzer(MemorySlot slot, DominanceInfo &dominance,
                              const DataLayout &dataLayout)
      : slot(slot), dominance(dominance), dataLayout(dataLayout) {}

  /// Computes the information for slot promotion if promotion is possible,
  /// returns nothing otherwise.
  std::optional<MemorySlotPromotionInfo> computeInfo();

private:
  /// Computes the transitive uses of the slot that block promotion. This finds
  /// uses that would block the promotion, checks that the operation has a
  /// solution to remove the blocking use, and potentially forwards the analysis
  /// if the operation needs further blocking uses resolved to resolve its own
  /// uses (typically, removing its users because it will delete itself to
  /// resolve its own blocking uses). This will fail if one of the transitive
  /// users cannot remove a requested use, which prevents promotion.
  LogicalResult computeBlockingUses(BlockingUsesMap &userToBlockingUses);

  /// Computes in which blocks the value stored in the slot is actually used,
  /// meaning blocks leading to a load. This method uses `definingBlocks`, the
  /// set of blocks containing a store to the slot (defining the value of the
  /// slot).
  SmallPtrSet<Block *, 16>
  computeSlotLiveIn(SmallPtrSetImpl<Block *> &definingBlocks);

  /// Computes the points at which multiple re-definitions of the slot's value
  /// (stores) may conflict.
  void computeMergePoints(SmallPtrSetImpl<Block *> &mergePoints);

  /// Ensures predecessors of merge points can properly provide their current
  /// definition of the value stored in the slot to the merge point. This can
  /// notably be an issue if the terminator used does not have the ability to
  /// forward values through block operands.
  bool areMergePointsUsable(SmallPtrSetImpl<Block *> &mergePoints);

  MemorySlot slot;
  DominanceInfo &dominance;
  const DataLayout &dataLayout;
};

using BlockIndexCache = DenseMap<Region *, DenseMap<Block *, size_t>>;

/// The MemorySlotPromoter handles the state of promoting a memory slot. It
/// wraps a slot and its associated allocator. This will perform the mutation of
/// IR.
class MemorySlotPromoter {
public:
  MemorySlotPromoter(MemorySlot slot, PromotableAllocationOpInterface allocator,
                     OpBuilder &builder, DominanceInfo &dominance,
                     const DataLayout &dataLayout, MemorySlotPromotionInfo info,
                     const Mem2RegStatistics &statistics,
                     BlockIndexCache &blockIndexCache);

  /// Actually promotes the slot by mutating IR. Promoting a slot DOES
  /// invalidate the MemorySlotPromotionInfo of other slots. Preparation of
  /// promotion info should NOT be performed in batches.
  /// Returns a promotable allocation op if a new allocator was created, nullopt
  /// otherwise.
  std::optional<PromotableAllocationOpInterface> promoteSlot();

private:
  /// Computes the reaching definition for all the operations that require
  /// promotion. `reachingDef` is the value the slot should contain at the
  /// beginning of the block. This method returns the reaching definition at the
  /// end of the block. This method must only be called at most once per block.
  Value computeReachingDefInBlock(Block *block, Value reachingDef);

  /// Computes the reaching definition for all the operations that require
  /// promotion. `reachingDef` corresponds to the initial value the
  /// slot will contain before any write, typically a poison value.
  /// This method must only be called at most once per region.
  void computeReachingDefInRegion(Region *region, Value reachingDef);

  /// Removes the blocking uses of the slot, in topological order.
  void removeBlockingUses();

  /// Lazily-constructed default value representing the content of the slot when
  /// no store has been executed. This function may mutate IR.
  Value getOrCreateDefaultValue();

  MemorySlot slot;
  PromotableAllocationOpInterface allocator;
  OpBuilder &builder;
  /// Potentially non-initialized default value. Use `getOrCreateDefaultValue`
  /// to initialize it on demand.
  Value defaultValue;
  /// Contains the reaching definition at this operation. Reaching definitions
  /// are only computed for promotable memory operations with blocking uses.
  DenseMap<PromotableMemOpInterface, Value> reachingDefs;
  DenseMap<PromotableMemOpInterface, Value> replacedValuesMap;
  DominanceInfo &dominance;
  const DataLayout &dataLayout;
  MemorySlotPromotionInfo info;
  const Mem2RegStatistics &statistics;

  /// Shared cache of block indices of specific regions.
  BlockIndexCache &blockIndexCache;
};

} // namespace

MemorySlotPromoter::MemorySlotPromoter(
    MemorySlot slot, PromotableAllocationOpInterface allocator,
    OpBuilder &builder, DominanceInfo &dominance, const DataLayout &dataLayout,
    MemorySlotPromotionInfo info, const Mem2RegStatistics &statistics,
    BlockIndexCache &blockIndexCache)
    : slot(slot), allocator(allocator), builder(builder), dominance(dominance),
      dataLayout(dataLayout), info(std::move(info)), statistics(statistics),
      blockIndexCache(blockIndexCache) {
#ifndef NDEBUG
  auto isResultOrNewBlockArgument = [&]() {
    if (BlockArgument arg = dyn_cast<BlockArgument>(slot.ptr))
      return arg.getOwner()->getParentOp() == allocator;
    return slot.ptr.getDefiningOp() == allocator;
  };

  assert(isResultOrNewBlockArgument() &&
         "a slot must be a result of the allocator or an argument of the child "
         "regions of the allocator");
#endif // NDEBUG
}

Value MemorySlotPromoter::getOrCreateDefaultValue() {
  if (defaultValue)
    return defaultValue;

  OpBuilder::InsertionGuard guard(builder);
  builder.setInsertionPointToStart(slot.ptr.getParentBlock());
  return defaultValue = allocator.getDefaultValue(slot, builder);
}

LogicalResult MemorySlotPromotionAnalyzer::computeBlockingUses(
    BlockingUsesMap &userToBlockingUses) {
  // The promotion of an operation may require the promotion of further
  // operations (typically, removing operations that use an operation that must
  // delete itself). We thus need to start from the uses of the slot pointer and
  // propagate further requests through the forward slice.

  // Because this pass currently only supports analysing the parent region of
  // the slot pointer, if a promotable memory op that needs promotion is within
  // a graph region, the slot may only be used in a graph region and should
  // therefore be ignored.
  Region *slotPtrRegion = slot.ptr.getParentRegion();
  auto slotPtrRegionOp =
      dyn_cast<RegionKindInterface>(slotPtrRegion->getParentOp());
  if (slotPtrRegionOp &&
      slotPtrRegionOp.getRegionKind(slotPtrRegion->getRegionNumber()) ==
          RegionKind::Graph)
    return failure();

  // First, record that all immediate users of the slot pointer must no longer
  // use it.
  for (OpOperand &use : slot.ptr.getUses()) {
    SmallPtrSet<OpOperand *, 4> &blockingUses =
        userToBlockingUses[use.getOwner()];
    blockingUses.insert(&use);
  }

  // Then, propagate the requirements for the removal of uses. The
  // topologically-sorted forward slice ensures that all blocking uses of an
  // operation have been computed before the operation is reached. Operations
  // are traversed in topological order of their uses, starting from the slot
  // pointer.
  SetVector<Operation *> forwardSlice;
  mlir::getForwardSlice(slot.ptr, &forwardSlice);
  for (Operation *user : forwardSlice) {
    // If the next operation has no blocking uses, everything is fine.
    auto it = userToBlockingUses.find(user);
    if (it == userToBlockingUses.end())
      continue;

    SmallPtrSet<OpOperand *, 4> &blockingUses = it->second;

    SmallVector<OpOperand *> newBlockingUses;
    // If the operation decides it cannot deal with removing the blocking uses,
    // promotion must fail.
    if (auto promotable = dyn_cast<PromotableOpInterface>(user)) {
      if (!promotable.canUsesBeRemoved(blockingUses, newBlockingUses,
                                       dataLayout))
        return failure();
    } else if (auto promotable = dyn_cast<PromotableMemOpInterface>(user)) {
      if (!promotable.canUsesBeRemoved(slot, blockingUses, newBlockingUses,
                                       dataLayout))
        return failure();
    } else {
      // An operation that has blocking uses must be promoted. If it is not
      // promotable, promotion must fail.
      return failure();
    }

    // Then, register any new blocking uses for the operations that follow.
    for (OpOperand *blockingUse : newBlockingUses) {
      assert(llvm::is_contained(user->getResults(), blockingUse->get()));

      SmallPtrSetImpl<OpOperand *> &newUserBlockingUseSet =
          userToBlockingUses[blockingUse->getOwner()];
      newUserBlockingUseSet.insert(blockingUse);
    }
  }

  // Because this pass currently only supports analysing the parent region of
  // the slot pointer, if a promotable memory op that needs promotion is outside
  // of this region, promotion must fail because it will be impossible to
  // provide a valid `reachingDef` for it.
  for (auto &[toPromote, _] : userToBlockingUses)
    if (isa<PromotableMemOpInterface>(toPromote) &&
        toPromote->getParentRegion() != slot.ptr.getParentRegion())
      return failure();

  return success();
}

SmallPtrSet<Block *, 16> MemorySlotPromotionAnalyzer::computeSlotLiveIn(
    SmallPtrSetImpl<Block *> &definingBlocks) {
  SmallPtrSet<Block *, 16> liveIn;

  // The worklist contains blocks in which it is known that the slot value is
  // live-in. The further blocks where this value is live-in will be inferred
  // from these.
  SmallVector<Block *> liveInWorkList;

  // Blocks with a load before any other store to the slot are the starting
  // points of the analysis. The slot value is definitely live-in in those
  // blocks.
  SmallPtrSet<Block *, 16> visited;
  for (Operation *user : slot.ptr.getUsers()) {
    if (!visited.insert(user->getBlock()).second)
      continue;

    for (Operation &op : user->getBlock()->getOperations()) {
      if (auto memOp = dyn_cast<PromotableMemOpInterface>(op)) {
        // If this operation loads the slot, it is loading from it before
        // ever writing to it, so the value is live-in in this block.
        if (memOp.loadsFrom(slot)) {
          liveInWorkList.push_back(user->getBlock());
          break;
        }

        // If we store to the slot, further loads will see that value.
        // Because we did not encounter any load before, the value is not
        // live-in.
        if (memOp.storesTo(slot))
          break;
      }
    }
  }

  // The information is then propagated to the predecessors until a def site
  // (store) is found.
  while (!liveInWorkList.empty()) {
    Block *liveInBlock = liveInWorkList.pop_back_val();

    if (!liveIn.insert(liveInBlock).second)
      continue;

    // If a predecessor is a defining block, either:
    // - It has a load before its first store, in which case it is live-in but
    //   has already been processed in the initialisation step.
    // - It has a store before any load, in which case it is not live-in.
    // We can thus at this stage insert to the worklist only predecessors that
    // are not defining blocks.
    for (Block *pred : liveInBlock->getPredecessors())
      if (!definingBlocks.contains(pred))
        liveInWorkList.push_back(pred);
  }

  return liveIn;
}

using IDFCalculator = llvm::IDFCalculatorBase<Block, false>;
void MemorySlotPromotionAnalyzer::computeMergePoints(
    SmallPtrSetImpl<Block *> &mergePoints) {
  if (slot.ptr.getParentRegion()->hasOneBlock())
    return;

  IDFCalculator idfCalculator(dominance.getDomTree(slot.ptr.getParentRegion()));

  SmallPtrSet<Block *, 16> definingBlocks;
  for (Operation *user : slot.ptr.getUsers())
    if (auto storeOp = dyn_cast<PromotableMemOpInterface>(user))
      if (storeOp.storesTo(slot))
        definingBlocks.insert(user->getBlock());

  idfCalculator.setDefiningBlocks(definingBlocks);

  SmallPtrSet<Block *, 16> liveIn = computeSlotLiveIn(definingBlocks);
  idfCalculator.setLiveInBlocks(liveIn);

  SmallVector<Block *> mergePointsVec;
  idfCalculator.calculate(mergePointsVec);

  mergePoints.insert_range(mergePointsVec);
}

bool MemorySlotPromotionAnalyzer::areMergePointsUsable(
    SmallPtrSetImpl<Block *> &mergePoints) {
  for (Block *mergePoint : mergePoints)
    for (Block *pred : mergePoint->getPredecessors())
      if (!isa<BranchOpInterface>(pred->getTerminator()))
        return false;

  return true;
}

std::optional<MemorySlotPromotionInfo>
MemorySlotPromotionAnalyzer::computeInfo() {
  MemorySlotPromotionInfo info;

  // First, find the set of operations that will need to be changed for the
  // promotion to happen. These operations need to resolve some of their uses,
  // either by rewiring them or simply deleting themselves. If any of them
  // cannot find a way to resolve their blocking uses, we abort the promotion.
  if (failed(computeBlockingUses(info.userToBlockingUses)))
    return {};

  // Then, compute blocks in which two or more definitions of the allocated
  // variable may conflict. These blocks will need a new block argument to
  // accommodate this.
  computeMergePoints(info.mergePoints);

  // The slot can be promoted if the block arguments to be created can
  // actually be populated with values, which may not be possible depending
  // on their predecessors.
  if (!areMergePointsUsable(info.mergePoints))
    return {};

  return info;
}

Value MemorySlotPromoter::computeReachingDefInBlock(Block *block,
                                                    Value reachingDef) {
  SmallVector<Operation *> blockOps;
  for (Operation &op : block->getOperations())
    blockOps.push_back(&op);
  for (Operation *op : blockOps) {
    if (auto memOp = dyn_cast<PromotableMemOpInterface>(op)) {
      if (info.userToBlockingUses.contains(memOp))
        reachingDefs.insert({memOp, reachingDef});

      if (memOp.storesTo(slot)) {
        builder.setInsertionPointAfter(memOp);
        Value stored = memOp.getStored(slot, builder, reachingDef, dataLayout);
        assert(stored && "a memory operation storing to a slot must provide a "
                         "new definition of the slot");
        reachingDef = stored;
        replacedValuesMap[memOp] = stored;
      }
    }
  }

  return reachingDef;
}

void MemorySlotPromoter::computeReachingDefInRegion(Region *region,
                                                    Value reachingDef) {
  assert(reachingDef && "expected an initial reaching def to be provided");
  if (region->hasOneBlock()) {
    computeReachingDefInBlock(&region->front(), reachingDef);
    return;
  }

  struct DfsJob {
    llvm::DomTreeNodeBase<Block> *block;
    Value reachingDef;
  };

  SmallVector<DfsJob> dfsStack;

  auto &domTree = dominance.getDomTree(slot.ptr.getParentRegion());

  dfsStack.emplace_back<DfsJob>(
      {domTree.getNode(&region->front()), reachingDef});

  while (!dfsStack.empty()) {
    DfsJob job = dfsStack.pop_back_val();
    Block *block = job.block->getBlock();

    if (info.mergePoints.contains(block)) {
      BlockArgument blockArgument =
          block->addArgument(slot.elemType, slot.ptr.getLoc());
      builder.setInsertionPointToStart(block);
      allocator.handleBlockArgument(slot, blockArgument, builder);
      job.reachingDef = blockArgument;

      if (statistics.newBlockArgumentAmount)
        (*statistics.newBlockArgumentAmount)++;
    }

    job.reachingDef = computeReachingDefInBlock(block, job.reachingDef);
    assert(job.reachingDef);

    if (auto terminator = dyn_cast<BranchOpInterface>(block->getTerminator())) {
      for (BlockOperand &blockOperand : terminator->getBlockOperands()) {
        if (info.mergePoints.contains(blockOperand.get())) {
          terminator.getSuccessorOperands(blockOperand.getOperandNumber())
              .append(job.reachingDef);
        }
      }
    }

    for (auto *child : job.block->children())
      dfsStack.emplace_back<DfsJob>({child, job.reachingDef});
  }
}

/// Gets or creates a block index mapping for `region`.
static const DenseMap<Block *, size_t> &
getOrCreateBlockIndices(BlockIndexCache &blockIndexCache, Region *region) {
  auto [it, inserted] = blockIndexCache.try_emplace(region);
  if (!inserted)
    return it->second;

  DenseMap<Block *, size_t> &blockIndices = it->second;
  SetVector<Block *> topologicalOrder = getBlocksSortedByDominance(*region);
  for (auto [index, block] : llvm::enumerate(topologicalOrder))
    blockIndices[block] = index;
  return blockIndices;
}

/// Sorts `ops` according to dominance. Relies on the topological order of basic
/// blocks to get a deterministic ordering. Uses `blockIndexCache` to avoid the
/// potentially expensive recomputation of a block index map.
static void dominanceSort(SmallVector<Operation *> &ops, Region &region,
                          BlockIndexCache &blockIndexCache) {
  // Produce a topological block order and construct a map to lookup the indices
  // of blocks.
  const DenseMap<Block *, size_t> &topoBlockIndices =
      getOrCreateBlockIndices(blockIndexCache, &region);

  // Combining the topological order of the basic blocks together with block
  // internal operation order guarantees a deterministic, dominance-respecting
  // order.
  llvm::sort(ops, [&](Operation *lhs, Operation *rhs) {
    size_t lhsBlockIndex = topoBlockIndices.at(lhs->getBlock());
    size_t rhsBlockIndex = topoBlockIndices.at(rhs->getBlock());
    if (lhsBlockIndex == rhsBlockIndex)
      return lhs->isBeforeInBlock(rhs);
    return lhsBlockIndex < rhsBlockIndex;
  });
}

void MemorySlotPromoter::removeBlockingUses() {
  llvm::SmallVector<Operation *> usersToRemoveUses(
      llvm::make_first_range(info.userToBlockingUses));

  // Sort according to dominance.
  dominanceSort(usersToRemoveUses, *slot.ptr.getParentBlock()->getParent(),
                blockIndexCache);

  llvm::SmallVector<Operation *> toErase;
  // List of all replaced values in the slot.
  llvm::SmallVector<std::pair<Operation *, Value>> replacedValuesList;
  // Ops to visit with the `visitReplacedValues` method.
  llvm::SmallVector<PromotableOpInterface> toVisit;
  for (Operation *toPromote : llvm::reverse(usersToRemoveUses)) {
    if (auto toPromoteMemOp = dyn_cast<PromotableMemOpInterface>(toPromote)) {
      Value reachingDef = reachingDefs.lookup(toPromoteMemOp);
      // If no reaching definition is known, this use is outside the reach of
      // the slot. The default value should thus be used.
      if (!reachingDef)
        reachingDef = getOrCreateDefaultValue();

      builder.setInsertionPointAfter(toPromote);
      if (toPromoteMemOp.removeBlockingUses(
              slot, info.userToBlockingUses[toPromote], builder, reachingDef,
              dataLayout) == DeletionKind::Delete)
        toErase.push_back(toPromote);
      if (toPromoteMemOp.storesTo(slot))
        if (Value replacedValue = replacedValuesMap[toPromoteMemOp])
          replacedValuesList.push_back({toPromoteMemOp, replacedValue});
      continue;
    }

    auto toPromoteBasic = cast<PromotableOpInterface>(toPromote);
    builder.setInsertionPointAfter(toPromote);
    if (toPromoteBasic.removeBlockingUses(info.userToBlockingUses[toPromote],
                                          builder) == DeletionKind::Delete)
      toErase.push_back(toPromote);
    if (toPromoteBasic.requiresReplacedValues())
      toVisit.push_back(toPromoteBasic);
  }
  for (PromotableOpInterface op : toVisit) {
    builder.setInsertionPointAfter(op);
    op.visitReplacedValues(replacedValuesList, builder);
  }

  for (Operation *toEraseOp : toErase)
    toEraseOp->erase();

  assert(slot.ptr.use_empty() &&
         "after promotion, the slot pointer should not be used anymore");
}

std::optional<PromotableAllocationOpInterface>
MemorySlotPromoter::promoteSlot() {
  computeReachingDefInRegion(slot.ptr.getParentRegion(),
                             getOrCreateDefaultValue());

  // Now that reaching definitions are known, remove all users.
  removeBlockingUses();

  // Update terminators in dead branches to forward the default value if they
  // are succeeded by a merge point.
  for (Block *mergePoint : info.mergePoints) {
    for (BlockOperand &use : mergePoint->getUses()) {
      auto user = cast<BranchOpInterface>(use.getOwner());
      SuccessorOperands succOperands =
          user.getSuccessorOperands(use.getOperandNumber());
      assert(succOperands.size() == mergePoint->getNumArguments() ||
             succOperands.size() + 1 == mergePoint->getNumArguments());
      if (succOperands.size() + 1 == mergePoint->getNumArguments())
        succOperands.append(getOrCreateDefaultValue());
    }
  }

  LDBG() << "Promoted memory slot: " << slot.ptr;

  if (statistics.promotedAmount)
    (*statistics.promotedAmount)++;

  return allocator.handlePromotionComplete(slot, defaultValue, builder);
}

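// Usage sketch for `tryToPromoteMemorySlots` (illustrative only; `someRegion`,
// `someDominance`, and `someDataLayout` are hypothetical stand-ins for IR and
// analyses the caller already has, and the trailing statistics argument is
// left at its default):
//
//   SmallVector<PromotableAllocationOpInterface> allocators;
//   someRegion.walk([&](PromotableAllocationOpInterface allocator) {
//     allocators.push_back(allocator);
//   });
//   OpBuilder builder(&someRegion.front(), someRegion.front().begin());
//   bool changed = succeeded(tryToPromoteMemorySlots(allocators, builder,
//                                                    someDataLayout,
//                                                    someDominance));
//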
LogicalResult mlir::tryToPromoteMemorySlots(
    ArrayRef<PromotableAllocationOpInterface> allocators, OpBuilder &builder,
    const DataLayout &dataLayout, DominanceInfo &dominance,
    Mem2RegStatistics statistics) {
  bool promotedAny = false;

  // A cache that stores deterministic block indices which are used to determine
  // a valid operation modification order. The block index maps are computed
  // lazily and cached to avoid expensive recomputation.
  BlockIndexCache blockIndexCache;

  SmallVector<PromotableAllocationOpInterface> workList(allocators.begin(),
                                                        allocators.end());

  SmallVector<PromotableAllocationOpInterface> newWorkList;
  newWorkList.reserve(workList.size());
  while (true) {
    bool changesInThisRound = false;
    for (PromotableAllocationOpInterface allocator : workList) {
      bool changedAllocator = false;
      for (MemorySlot slot : allocator.getPromotableSlots()) {
        if (slot.ptr.use_empty())
          continue;

        MemorySlotPromotionAnalyzer analyzer(slot, dominance, dataLayout);
        std::optional<MemorySlotPromotionInfo> info = analyzer.computeInfo();
        if (info) {
          std::optional<PromotableAllocationOpInterface> newAllocator =
              MemorySlotPromoter(slot, allocator, builder, dominance,
                                 dataLayout, std::move(*info), statistics,
                                 blockIndexCache)
                  .promoteSlot();
          changedAllocator = true;
          // Add newly created allocators to the worklist for further
          // processing.
          if (newAllocator)
            newWorkList.push_back(*newAllocator);

          // A break is required, since promoting a slot may invalidate the
          // remaining slots of an allocator.
          break;
        }
      }
      if (!changedAllocator)
        newWorkList.push_back(allocator);
      changesInThisRound |= changedAllocator;
    }
    if (!changesInThisRound)
      break;
    promotedAny = true;

    // Swap the vector's backing memory and clear the entries in newWorkList
    // afterwards. This ensures that additional heap allocations can be avoided.
    workList.swap(newWorkList);
    newWorkList.clear();
  }

  return success(promotedAny);
}

namespace {

struct Mem2Reg : impl::Mem2RegBase<Mem2Reg> {
  using impl::Mem2RegBase<Mem2Reg>::Mem2RegBase;

  void runOnOperation() override {
    Operation *scopeOp = getOperation();

    Mem2RegStatistics statistics{&promotedAmount, &newBlockArgumentAmount};

    bool changed = false;

    auto &dataLayoutAnalysis = getAnalysis<DataLayoutAnalysis>();
    const DataLayout &dataLayout = dataLayoutAnalysis.getAtOrAbove(scopeOp);
    auto &dominance = getAnalysis<DominanceInfo>();

    for (Region &region : scopeOp->getRegions()) {
      if (region.getBlocks().empty())
        continue;

      OpBuilder builder(&region.front(), region.front().begin());

      SmallVector<PromotableAllocationOpInterface> allocators;
      // Build a list of allocators to attempt to promote the slots of.
      region.walk([&](PromotableAllocationOpInterface allocator) {
        allocators.emplace_back(allocator);
      });

      // Attempt promoting as many of the slots as possible.
      if (succeeded(tryToPromoteMemorySlots(allocators, builder, dataLayout,
                                            dominance, statistics)))
        changed = true;
    }
    if (!changed)
      markAllAnalysesPreserved();
  }
};

} // namespace