MLIR 22.0.0git
Mem2Reg.cpp
Go to the documentation of this file.
1//===- Mem2Reg.cpp - Promotes memory slots into values ----------*- C++ -*-===//
2//
3// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
4// See https://llvm.org/LICENSE.txt for license information.
5// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
6//
7//===----------------------------------------------------------------------===//
8
13#include "mlir/IR/Builders.h"
14#include "mlir/IR/Dominance.h"
17#include "mlir/IR/Value.h"
21#include "llvm/ADT/STLExtras.h"
22#include "llvm/Support/DebugLog.h"
23#include "llvm/Support/GenericIteratedDominanceFrontier.h"
24
25namespace mlir {
26#define GEN_PASS_DEF_MEM2REG
27#include "mlir/Transforms/Passes.h.inc"
28} // namespace mlir
29
30#define DEBUG_TYPE "mem2reg"
31
32using namespace mlir;
33
34/// mem2reg
35///
36/// This pass turns unnecessary uses of automatically allocated memory slots
37/// into direct Value-based operations. For example, it will simplify storing a
38/// constant in a memory slot to immediately load it to a direct use of that
39/// constant. In other words, given a memory slot addressed by a non-aliased
40/// "pointer" Value, mem2reg removes all the uses of that pointer.
41///
42/// Within a block, this is done by following the chain of stores and loads of
43/// the slot and replacing the results of loads with the values previously
44/// stored. If a load happens before any other store, a poison value is used
45/// instead.
46///
47/// Control flow can create situations where a load could be replaced by
48/// multiple possible stores depending on the control flow path taken. As a
49/// result, this pass must introduce new block arguments in some blocks to
50/// accommodate for the multiple possible definitions. Each predecessor will
51/// populate the block argument with the definition reached at its end. With
52/// this, the value stored can be well defined at block boundaries, allowing
53/// the propagation of replacement through blocks.
54///
55/// This pass computes this transformation in four main steps. The two first
56/// steps are performed during an analysis phase that does not mutate IR.
57///
58/// The two steps of the analysis phase are the following:
59/// - A first step computes the list of operations that transitively use the
60/// memory slot we would like to promote. The purpose of this phase is to
61/// identify which uses must be removed to promote the slot, either by rewiring
62/// the user or deleting it. Naturally, direct uses of the slot must be removed.
63/// Sometimes additional uses must also be removed: this is notably the case
64/// when a direct user of the slot cannot rewire its use and must delete itself,
65/// and thus must make its users no longer use it. If any of those uses cannot
66/// be removed by their users in any way, promotion cannot continue: this is
67/// decided at this step.
68/// - A second step computes the list of blocks where a block argument will be
69/// needed ("merge points") without mutating the IR. These blocks are the blocks
70/// leading to a definition clash between two predecessors. Such blocks happen
71/// to be the Iterated Dominance Frontier (IDF) of the set of blocks containing
72/// a store, as they represent the point where a clear defining dominator stops
73/// existing. Computing this information in advance allows making sure the
74/// terminators that will forward values are capable of doing so (inability to
75/// do so aborts promotion at this step).
76///
77/// At this point, promotion is guaranteed to happen, and the mutation phase can
78/// begin with the following steps:
79/// - A third step computes the reaching definition of the memory slot at each
80/// blocking user. This is the core of the mem2reg algorithm, also known as
81/// load-store forwarding. This analyses loads and stores and propagates which
82/// value must be stored in the slot at each blocking user. This is achieved by
83/// doing a depth-first walk of the dominator tree of the function. This is
84/// sufficient because the reaching definition at the beginning of a block is
85/// either its new block argument if it is a merge block, or the definition
86/// reaching the end of its immediate dominator (parent in the dominator tree).
87/// We can therefore propagate this information down the dominator tree to
88/// proceed with renaming within blocks.
89/// - The final fourth step uses the reaching definition to remove blocking uses
90/// in topological order.
91///
92/// For further reading, chapter three of SSA-based Compiler Design [1]
93/// showcases SSA construction, where mem2reg is an adaptation of the same
94/// process.
95///
96/// [1]: Rastello F. & Bouchez Tichadou F., SSA-based Compiler Design (2022),
97/// Springer.
98
99namespace {
100
101using BlockingUsesMap =
102 llvm::MapVector<Operation *, SmallPtrSet<OpOperand *, 4>>;
103
104/// Information computed during promotion analysis used to perform actual
105/// promotion.
106struct MemorySlotPromotionInfo {
107 /// Blocks for which at least two definitions of the slot values clash.
108 SmallPtrSet<Block *, 8> mergePoints;
109 /// Contains, for each operation, which uses must be eliminated by promotion.
110 /// This is a DAG structure because if an operation must eliminate some of
111 /// its uses, it is because the defining ops of the blocking uses requested
112 /// it. The defining ops therefore must also have blocking uses or be the
113 /// starting point of the blocking uses.
114 BlockingUsesMap userToBlockingUses;
115};
116
117/// Computes information for basic slot promotion. This will check that direct
118/// slot promotion can be performed, and provide the information to execute the
119/// promotion. This does not mutate IR.
120class MemorySlotPromotionAnalyzer {
121public:
122 MemorySlotPromotionAnalyzer(MemorySlot slot, DominanceInfo &dominance,
123 const DataLayout &dataLayout)
124 : slot(slot), dominance(dominance), dataLayout(dataLayout) {}
125
126 /// Computes the information for slot promotion if promotion is possible,
127 /// returns nothing otherwise.
128 std::optional<MemorySlotPromotionInfo> computeInfo();
129
130private:
131 /// Computes the transitive uses of the slot that block promotion. This finds
132 /// uses that would block the promotion, checks that the operation has a
133 /// solution to remove the blocking use, and potentially forwards the analysis
134 /// if the operation needs further blocking uses resolved to resolve its own
135 /// uses (typically, removing its users because it will delete itself to
136 /// resolve its own blocking uses). This will fail if one of the transitive
137 /// users cannot remove a requested use, and should prevent promotion.
138 LogicalResult computeBlockingUses(BlockingUsesMap &userToBlockingUses);
139
140 /// Computes in which blocks the value stored in the slot is actually used,
141 /// meaning blocks leading to a load. This method uses `definingBlocks`, the
142 /// set of blocks containing a store to the slot (defining the value of the
143 /// slot).
145 computeSlotLiveIn(SmallPtrSetImpl<Block *> &definingBlocks);
146
147 /// Computes the points in which multiple re-definitions of the slot's value
148 /// (stores) may conflict.
149 void computeMergePoints(SmallPtrSetImpl<Block *> &mergePoints);
150
151 /// Ensures predecessors of merge points can properly provide their current
152 /// definition of the value stored in the slot to the merge point. This can
153 /// notably be an issue if the terminator used does not have the ability to
154 /// forward values through block operands.
155 bool areMergePointsUsable(SmallPtrSetImpl<Block *> &mergePoints);
156
157 MemorySlot slot;
158 DominanceInfo &dominance;
159 const DataLayout &dataLayout;
160};
161
162using BlockIndexCache = DenseMap<Region *, DenseMap<Block *, size_t>>;
163
164/// The MemorySlotPromoter handles the state of promoting a memory slot. It
165/// wraps a slot and its associated allocator. This will perform the mutation of
166/// IR.
167class MemorySlotPromoter {
168public:
169 MemorySlotPromoter(MemorySlot slot, PromotableAllocationOpInterface allocator,
170 OpBuilder &builder, DominanceInfo &dominance,
171 const DataLayout &dataLayout, MemorySlotPromotionInfo info,
172 const Mem2RegStatistics &statistics,
173 BlockIndexCache &blockIndexCache);
174
175 /// Actually promotes the slot by mutating IR. Promoting a slot DOES
176 /// invalidate the MemorySlotPromotionInfo of other slots. Preparation of
177 /// promotion info should NOT be performed in batches.
178 /// Returns a promotable allocation op if a new allocator was created, nullopt
179 /// otherwise.
180 std::optional<PromotableAllocationOpInterface> promoteSlot();
181
182private:
183 /// Computes the reaching definition for all the operations that require
184 /// promotion. `reachingDef` is the value the slot should contain at the
185 /// beginning of the block. This method returns the reached definition at the
186 /// end of the block. This method must only be called at most once per block.
187 Value computeReachingDefInBlock(Block *block, Value reachingDef);
188
189 /// Computes the reaching definition for all the operations that require
190 /// promotion. `reachingDef` corresponds to the initial value the
191 /// slot will contain before any write, typically a poison value.
192 /// This method must only be called at most once per region.
193 void computeReachingDefInRegion(Region *region, Value reachingDef);
194
195 /// Removes the blocking uses of the slot, in topological order.
196 void removeBlockingUses();
197
198 /// Lazily-constructed default value representing the content of the slot when
199 /// no store has been executed. This function may mutate IR.
200 Value getOrCreateDefaultValue();
201
202 MemorySlot slot;
203 PromotableAllocationOpInterface allocator;
204 OpBuilder &builder;
205 /// Potentially non-initialized default value. Use `getOrCreateDefaultValue`
206 /// to initialize it on demand.
207 Value defaultValue;
208 /// Contains the reaching definition at this operation. Reaching definitions
209 /// are only computed for promotable memory operations with blocking uses.
212 DominanceInfo &dominance;
213 const DataLayout &dataLayout;
214 MemorySlotPromotionInfo info;
215 const Mem2RegStatistics &statistics;
216
217 /// Shared cache of block indices of specific regions.
218 BlockIndexCache &blockIndexCache;
219};
220
221} // namespace
222
223MemorySlotPromoter::MemorySlotPromoter(
224 MemorySlot slot, PromotableAllocationOpInterface allocator,
225 OpBuilder &builder, DominanceInfo &dominance, const DataLayout &dataLayout,
226 MemorySlotPromotionInfo info, const Mem2RegStatistics &statistics,
227 BlockIndexCache &blockIndexCache)
228 : slot(slot), allocator(allocator), builder(builder), dominance(dominance),
229 dataLayout(dataLayout), info(std::move(info)), statistics(statistics),
230 blockIndexCache(blockIndexCache) {
231#ifndef NDEBUG
232 auto isResultOrNewBlockArgument = [&]() {
233 if (BlockArgument arg = dyn_cast<BlockArgument>(slot.ptr))
234 return arg.getOwner()->getParentOp() == allocator;
235 return slot.ptr.getDefiningOp() == allocator;
236 };
237
238 assert(isResultOrNewBlockArgument() &&
239 "a slot must be a result of the allocator or an argument of the child "
240 "regions of the allocator");
241#endif // NDEBUG
242}
243
244Value MemorySlotPromoter::getOrCreateDefaultValue() {
245 if (defaultValue)
246 return defaultValue;
247
248 OpBuilder::InsertionGuard guard(builder);
250 return defaultValue = allocator.getDefaultValue(slot, builder);
251}
252
253LogicalResult MemorySlotPromotionAnalyzer::computeBlockingUses(
254 BlockingUsesMap &userToBlockingUses) {
255 // The promotion of an operation may require the promotion of further
256 // operations (typically, removing operations that use an operation that must
257 // delete itself). We thus need to start from the use of the slot pointer and
258 // propagate further requests through the forward slice.
259
260 // Because this pass currently only supports analysing the parent region of
261 // the slot pointer, if a promotable memory op that needs promotion is within
262 // a graph region, the slot may only be used in a graph region and should
263 // therefore be ignored.
264 Region *slotPtrRegion = slot.ptr.getParentRegion();
265 auto slotPtrRegionOp =
266 dyn_cast<RegionKindInterface>(slotPtrRegion->getParentOp());
267 if (slotPtrRegionOp &&
268 slotPtrRegionOp.getRegionKind(slotPtrRegion->getRegionNumber()) ==
269 RegionKind::Graph)
270 return failure();
271
272 // First insert that all immediate users of the slot pointer must no longer
273 // use it.
274 for (OpOperand &use : slot.ptr.getUses()) {
275 SmallPtrSet<OpOperand *, 4> &blockingUses =
276 userToBlockingUses[use.getOwner()];
277 blockingUses.insert(&use);
278 }
279
280 // Then, propagate the requirements for the removal of uses. The
281 // topologically-sorted forward slice allows for all blocking uses of an
282 // operation to have been computed before it is reached. Operations are
283 // traversed in topological order of their uses, starting from the slot
284 // pointer.
285 SetVector<Operation *> forwardSlice;
286 mlir::getForwardSlice(slot.ptr, &forwardSlice);
287 for (Operation *user : forwardSlice) {
288 // If the next operation has no blocking uses, everything is fine.
289 auto *it = userToBlockingUses.find(user);
290 if (it == userToBlockingUses.end())
291 continue;
292
293 SmallPtrSet<OpOperand *, 4> &blockingUses = it->second;
294
295 SmallVector<OpOperand *> newBlockingUses;
296 // If the operation decides it cannot deal with removing the blocking uses,
297 // promotion must fail.
298 if (auto promotable = dyn_cast<PromotableOpInterface>(user)) {
299 if (!promotable.canUsesBeRemoved(blockingUses, newBlockingUses,
300 dataLayout))
301 return failure();
302 } else if (auto promotable = dyn_cast<PromotableMemOpInterface>(user)) {
303 if (!promotable.canUsesBeRemoved(slot, blockingUses, newBlockingUses,
304 dataLayout))
305 return failure();
306 } else {
307 // An operation that has blocking uses must be promoted. If it is not
308 // promotable, promotion must fail.
309 return failure();
310 }
311
312 // Then, register any new blocking uses for coming operations.
313 for (OpOperand *blockingUse : newBlockingUses) {
314 assert(llvm::is_contained(user->getResults(), blockingUse->get()));
315
316 SmallPtrSetImpl<OpOperand *> &newUserBlockingUseSet =
317 userToBlockingUses[blockingUse->getOwner()];
318 newUserBlockingUseSet.insert(blockingUse);
319 }
320 }
321
322 // Because this pass currently only supports analysing the parent region of
323 // the slot pointer, if a promotable memory op that needs promotion is outside
324 // of this region, promotion must fail because it will be impossible to
325 // provide a valid `reachingDef` for it.
326 for (auto &[toPromote, _] : userToBlockingUses)
327 if (isa<PromotableMemOpInterface>(toPromote) &&
328 toPromote->getParentRegion() != slot.ptr.getParentRegion())
329 return failure();
330
331 return success();
332}
333
334SmallPtrSet<Block *, 16> MemorySlotPromotionAnalyzer::computeSlotLiveIn(
335 SmallPtrSetImpl<Block *> &definingBlocks) {
336 SmallPtrSet<Block *, 16> liveIn;
337
338 // The worklist contains blocks in which it is known that the slot value is
339 // live-in. The further blocks where this value is live-in will be inferred
340 // from these.
341 SmallVector<Block *> liveInWorkList;
342
343 // Blocks with a load before any other store to the slot are the starting
344 // points of the analysis. The slot value is definitely live-in in those
345 // blocks.
346 SmallPtrSet<Block *, 16> visited;
347 for (Operation *user : slot.ptr.getUsers()) {
348 if (!visited.insert(user->getBlock()).second)
349 continue;
350
351 for (Operation &op : user->getBlock()->getOperations()) {
352 if (auto memOp = dyn_cast<PromotableMemOpInterface>(op)) {
353 // If this operation loads the slot, it is loading from it before
354 // ever writing to it, so the value is live-in in this block.
355 if (memOp.loadsFrom(slot)) {
356 liveInWorkList.push_back(user->getBlock());
357 break;
358 }
359
360 // If we store to the slot, further loads will see that value.
361 // Because we did not meet any load before, the value is not live-in.
362 if (memOp.storesTo(slot))
363 break;
364 }
365 }
366 }
367
368 // The information is then propagated to the predecessors until a def site
369 // (store) is found.
370 while (!liveInWorkList.empty()) {
371 Block *liveInBlock = liveInWorkList.pop_back_val();
372
373 if (!liveIn.insert(liveInBlock).second)
374 continue;
375
376 // If a predecessor is a defining block, either:
377 // - It has a load before its first store, in which case it is live-in but
378 // has already been processed in the initialisation step.
379 // - It has a store before any load, in which case it is not live-in.
380 // We can thus at this stage insert to the worklist only predecessors that
381 // are not defining blocks.
382 for (Block *pred : liveInBlock->getPredecessors())
383 if (!definingBlocks.contains(pred))
384 liveInWorkList.push_back(pred);
385 }
386
387 return liveIn;
388}
389
390using IDFCalculator = llvm::IDFCalculatorBase<Block, false>;
391void MemorySlotPromotionAnalyzer::computeMergePoints(
392 SmallPtrSetImpl<Block *> &mergePoints) {
393 if (slot.ptr.getParentRegion()->hasOneBlock())
394 return;
395
396 IDFCalculator idfCalculator(dominance.getDomTree(slot.ptr.getParentRegion()));
397
398 SmallPtrSet<Block *, 16> definingBlocks;
399 for (Operation *user : slot.ptr.getUsers())
400 if (auto storeOp = dyn_cast<PromotableMemOpInterface>(user))
401 if (storeOp.storesTo(slot))
402 definingBlocks.insert(user->getBlock());
403
404 idfCalculator.setDefiningBlocks(definingBlocks);
405
406 SmallPtrSet<Block *, 16> liveIn = computeSlotLiveIn(definingBlocks);
407 idfCalculator.setLiveInBlocks(liveIn);
408
409 SmallVector<Block *> mergePointsVec;
410 idfCalculator.calculate(mergePointsVec);
411
412 mergePoints.insert_range(mergePointsVec);
413}
414
415bool MemorySlotPromotionAnalyzer::areMergePointsUsable(
416 SmallPtrSetImpl<Block *> &mergePoints) {
417 for (Block *mergePoint : mergePoints)
418 for (Block *pred : mergePoint->getPredecessors())
419 if (!isa<BranchOpInterface>(pred->getTerminator()))
420 return false;
421
422 return true;
423}
424
425std::optional<MemorySlotPromotionInfo>
426MemorySlotPromotionAnalyzer::computeInfo() {
427 MemorySlotPromotionInfo info;
428
429 // First, find the set of operations that will need to be changed for the
430 // promotion to happen. These operations need to resolve some of their uses,
431 // either by rewiring them or simply deleting themselves. If any of them
432 // cannot find a way to resolve their blocking uses, we abort the promotion.
433 if (failed(computeBlockingUses(info.userToBlockingUses)))
434 return {};
435
436 // Then, compute blocks in which two or more definitions of the allocated
437 // variable may conflict. These blocks will need a new block argument to
438 // accommodate this.
439 computeMergePoints(info.mergePoints);
440
441 // The slot can be promoted if the block arguments to be created can
442 // actually be populated with values, which may not be possible depending
443 // on their predecessors.
444 if (!areMergePointsUsable(info.mergePoints))
445 return {};
446
447 return info;
448}
449
450Value MemorySlotPromoter::computeReachingDefInBlock(Block *block,
451 Value reachingDef) {
452 SmallVector<Operation *> blockOps;
453 for (Operation &op : block->getOperations())
454 blockOps.push_back(&op);
455 for (Operation *op : blockOps) {
456 if (auto memOp = dyn_cast<PromotableMemOpInterface>(op)) {
457 if (info.userToBlockingUses.contains(memOp))
458 reachingDefs.insert({memOp, reachingDef});
459
460 if (memOp.storesTo(slot)) {
461 builder.setInsertionPointAfter(memOp);
462 Value stored = memOp.getStored(slot, builder, reachingDef, dataLayout);
463 assert(stored && "a memory operation storing to a slot must provide a "
464 "new definition of the slot");
465 reachingDef = stored;
466 replacedValuesMap[memOp] = stored;
467 }
468 }
469 }
470
471 return reachingDef;
472}
473
474void MemorySlotPromoter::computeReachingDefInRegion(Region *region,
475 Value reachingDef) {
476 assert(reachingDef && "expected an initial reaching def to be provided");
477 if (region->hasOneBlock()) {
478 computeReachingDefInBlock(&region->front(), reachingDef);
479 return;
480 }
481
482 struct DfsJob {
483 llvm::DomTreeNodeBase<Block> *block;
484 Value reachingDef;
485 };
486
487 SmallVector<DfsJob> dfsStack;
488
489 auto &domTree = dominance.getDomTree(slot.ptr.getParentRegion());
490
491 dfsStack.emplace_back<DfsJob>(
492 {domTree.getNode(&region->front()), reachingDef});
493
494 while (!dfsStack.empty()) {
495 DfsJob job = dfsStack.pop_back_val();
496 Block *block = job.block->getBlock();
497
498 if (info.mergePoints.contains(block)) {
499 BlockArgument blockArgument =
500 block->addArgument(slot.elemType, slot.ptr.getLoc());
501 builder.setInsertionPointToStart(block);
502 allocator.handleBlockArgument(slot, blockArgument, builder);
503 job.reachingDef = blockArgument;
504
505 if (statistics.newBlockArgumentAmount)
506 (*statistics.newBlockArgumentAmount)++;
507 }
508
509 job.reachingDef = computeReachingDefInBlock(block, job.reachingDef);
510 assert(job.reachingDef);
511
512 if (auto terminator = dyn_cast<BranchOpInterface>(block->getTerminator())) {
513 for (BlockOperand &blockOperand : terminator->getBlockOperands()) {
514 if (info.mergePoints.contains(blockOperand.get())) {
515 terminator.getSuccessorOperands(blockOperand.getOperandNumber())
516 .append(job.reachingDef);
517 }
518 }
519 }
520
521 for (auto *child : job.block->children())
522 dfsStack.emplace_back<DfsJob>({child, job.reachingDef});
523 }
524}
525
526/// Gets or creates a block index mapping for `region`.
527static const DenseMap<Block *, size_t> &
528getOrCreateBlockIndices(BlockIndexCache &blockIndexCache, Region *region) {
529 auto [it, inserted] = blockIndexCache.try_emplace(region);
530 if (!inserted)
531 return it->second;
532
533 DenseMap<Block *, size_t> &blockIndices = it->second;
534 SetVector<Block *> topologicalOrder = getBlocksSortedByDominance(*region);
535 for (auto [index, block] : llvm::enumerate(topologicalOrder))
536 blockIndices[block] = index;
537 return blockIndices;
538}
539
540/// Sorts `ops` according to dominance. Relies on the topological order of basic
541/// blocks to get a deterministic ordering. Uses `blockIndexCache` to avoid the
542/// potentially expensive recomputation of a block index map.
544 BlockIndexCache &blockIndexCache) {
545 // Produce a topological block order and construct a map to lookup the indices
546 // of blocks.
547 const DenseMap<Block *, size_t> &topoBlockIndices =
548 getOrCreateBlockIndices(blockIndexCache, &region);
549
550 // Combining the topological order of the basic blocks together with block
551 // internal operation order guarantees a deterministic, dominance respecting
552 // order.
553 llvm::sort(ops, [&](Operation *lhs, Operation *rhs) {
554 size_t lhsBlockIndex = topoBlockIndices.at(lhs->getBlock());
555 size_t rhsBlockIndex = topoBlockIndices.at(rhs->getBlock());
556 if (lhsBlockIndex == rhsBlockIndex)
557 return lhs->isBeforeInBlock(rhs);
558 return lhsBlockIndex < rhsBlockIndex;
559 });
560}
561
562void MemorySlotPromoter::removeBlockingUses() {
563 llvm::SmallVector<Operation *> usersToRemoveUses(
564 llvm::make_first_range(info.userToBlockingUses));
565
566 // Sort according to dominance.
567 dominanceSort(usersToRemoveUses, *slot.ptr.getParentBlock()->getParent(),
568 blockIndexCache);
569
570 llvm::SmallVector<Operation *> toErase;
571 // List of all replaced values in the slot.
572 llvm::SmallVector<std::pair<Operation *, Value>> replacedValuesList;
573 // Ops to visit with the `visitReplacedValues` method.
574 llvm::SmallVector<PromotableOpInterface> toVisit;
575 for (Operation *toPromote : llvm::reverse(usersToRemoveUses)) {
576 if (auto toPromoteMemOp = dyn_cast<PromotableMemOpInterface>(toPromote)) {
577 Value reachingDef = reachingDefs.lookup(toPromoteMemOp);
578 // If no reaching definition is known, this use is outside the reach of
579 // the slot. The default value should thus be used.
580 if (!reachingDef)
581 reachingDef = getOrCreateDefaultValue();
582
583 builder.setInsertionPointAfter(toPromote);
584 if (toPromoteMemOp.removeBlockingUses(
585 slot, info.userToBlockingUses[toPromote], builder, reachingDef,
586 dataLayout) == DeletionKind::Delete)
587 toErase.push_back(toPromote);
588 if (toPromoteMemOp.storesTo(slot))
589 if (Value replacedValue = replacedValuesMap[toPromoteMemOp])
590 replacedValuesList.push_back({toPromoteMemOp, replacedValue});
591 continue;
592 }
593
594 auto toPromoteBasic = cast<PromotableOpInterface>(toPromote);
595 builder.setInsertionPointAfter(toPromote);
596 if (toPromoteBasic.removeBlockingUses(info.userToBlockingUses[toPromote],
597 builder) == DeletionKind::Delete)
598 toErase.push_back(toPromote);
599 if (toPromoteBasic.requiresReplacedValues())
600 toVisit.push_back(toPromoteBasic);
601 }
602 for (PromotableOpInterface op : toVisit) {
603 builder.setInsertionPointAfter(op);
604 op.visitReplacedValues(replacedValuesList, builder);
605 }
606
607 for (Operation *toEraseOp : toErase)
608 toEraseOp->erase();
609
610 assert(slot.ptr.use_empty() &&
611 "after promotion, the slot pointer should not be used anymore");
612}
613
614std::optional<PromotableAllocationOpInterface>
615MemorySlotPromoter::promoteSlot() {
616 computeReachingDefInRegion(slot.ptr.getParentRegion(),
617 getOrCreateDefaultValue());
618
619 // Now that reaching definitions are known, remove all users.
620 removeBlockingUses();
621
622 // Update terminators in dead branches to forward default if they are
623 // succeeded by a merge points.
624 for (Block *mergePoint : info.mergePoints) {
625 for (BlockOperand &use : mergePoint->getUses()) {
626 auto user = cast<BranchOpInterface>(use.getOwner());
627 SuccessorOperands succOperands =
628 user.getSuccessorOperands(use.getOperandNumber());
629 assert(succOperands.size() == mergePoint->getNumArguments() ||
630 succOperands.size() + 1 == mergePoint->getNumArguments());
631 if (succOperands.size() + 1 == mergePoint->getNumArguments())
632 succOperands.append(getOrCreateDefaultValue());
633 }
634 }
635
636 LDBG() << "Promoted memory slot: " << slot.ptr;
637
638 if (statistics.promotedAmount)
639 (*statistics.promotedAmount)++;
640
641 return allocator.handlePromotionComplete(slot, defaultValue, builder);
642}
643
646 const DataLayout &dataLayout, DominanceInfo &dominance,
647 Mem2RegStatistics statistics) {
648 bool promotedAny = false;
649
650 // A cache that stores deterministic block indices which are used to determine
651 // a valid operation modification order. The block index maps are computed
652 // lazily and cached to avoid expensive recomputation.
653 BlockIndexCache blockIndexCache;
654
656
658 newWorkList.reserve(workList.size());
659 while (true) {
660 bool changesInThisRound = false;
661 for (PromotableAllocationOpInterface allocator : workList) {
662 bool changedAllocator = false;
663 for (MemorySlot slot : allocator.getPromotableSlots()) {
664 if (slot.ptr.use_empty())
665 continue;
666
667 MemorySlotPromotionAnalyzer analyzer(slot, dominance, dataLayout);
668 std::optional<MemorySlotPromotionInfo> info = analyzer.computeInfo();
669 if (info) {
670 std::optional<PromotableAllocationOpInterface> newAllocator =
671 MemorySlotPromoter(slot, allocator, builder, dominance,
672 dataLayout, std::move(*info), statistics,
673 blockIndexCache)
674 .promoteSlot();
675 changedAllocator = true;
676 // Add newly created allocators to the worklist for further
677 // processing.
678 if (newAllocator)
679 newWorkList.push_back(*newAllocator);
680
681 // A break is required, since promoting a slot may invalidate the
682 // remaining slots of an allocator.
683 break;
684 }
685 }
686 if (!changedAllocator)
687 newWorkList.push_back(allocator);
688 changesInThisRound |= changedAllocator;
689 }
690 if (!changesInThisRound)
691 break;
692 promotedAny = true;
693
694 // Swap the vector's backing memory and clear the entries in newWorkList
695 // afterwards. This ensures that additional heap allocations can be avoided.
696 workList.swap(newWorkList);
697 newWorkList.clear();
698 }
699
700 return success(promotedAny);
701}
702
703namespace {
704
705struct Mem2Reg : impl::Mem2RegBase<Mem2Reg> {
706 using impl::Mem2RegBase<Mem2Reg>::Mem2RegBase;
707
708 void runOnOperation() override {
709 Operation *scopeOp = getOperation();
710
711 Mem2RegStatistics statistics{&promotedAmount, &newBlockArgumentAmount};
712
713 bool changed = false;
714
715 auto &dataLayoutAnalysis = getAnalysis<DataLayoutAnalysis>();
716 const DataLayout &dataLayout = dataLayoutAnalysis.getAtOrAbove(scopeOp);
717 auto &dominance = getAnalysis<DominanceInfo>();
718
719 for (Region &region : scopeOp->getRegions()) {
720 if (region.getBlocks().empty())
721 continue;
722
723 OpBuilder builder(&region.front(), region.front().begin());
724
725 SmallVector<PromotableAllocationOpInterface> allocators;
726 // Build a list of allocators to attempt to promote the slots of.
727 region.walk([&](PromotableAllocationOpInterface allocator) {
728 allocators.emplace_back(allocator);
729 });
730
731 // Attempt promoting as many of the slots as possible.
732 if (succeeded(tryToPromoteMemorySlots(allocators, builder, dataLayout,
733 dominance, statistics)))
734 changed = true;
735 }
736 if (!changed)
737 markAllAnalysesPreserved();
738 }
739};
740
741} // namespace
return success()
lhs
*if copies could not be generated due to yet unimplemented cases *copyInPlacementStart and copyOutPlacementStart in copyPlacementBlock *specify the insertion points where the incoming copies and outgoing should be inserted(the insertion happens right before the *insertion point). Since `begin` can itself be invalidated due to the memref *rewriting done from this method
static void dominanceSort(SmallVector< Operation * > &ops, Region &region, BlockIndexCache &blockIndexCache)
Sorts ops according to dominance.
Definition Mem2Reg.cpp:543
static const DenseMap< Block *, size_t > & getOrCreateBlockIndices(BlockIndexCache &blockIndexCache, Region *region)
Gets or creates a block index mapping for region.
Definition Mem2Reg.cpp:528
llvm::IDFCalculatorBase< Block, false > IDFCalculator
Definition Mem2Reg.cpp:390
This class represents an argument of a Block.
Definition Value.h:309
Block represents an ordered list of Operations.
Definition Block.h:33
unsigned getNumArguments()
Definition Block.h:128
iterator_range< pred_iterator > getPredecessors()
Definition Block.h:240
Region * getParent() const
Provide a 'getParent' method for ilist_node_with_parent methods.
Definition Block.cpp:27
OpListType & getOperations()
Definition Block.h:137
Operation * getTerminator()
Get the terminator operation of this block.
Definition Block.cpp:244
BlockArgument addArgument(Type type, Location loc)
Add one value to the argument list.
Definition Block.cpp:153
iterator begin()
Definition Block.h:143
The main mechanism for performing data layout queries.
A class for computing basic dominance information.
Definition Dominance.h:140
use_range getUses() const
Returns a range of all uses, which is useful for iterating over all uses.
This class helps build Operations.
Definition Builders.h:207
void setInsertionPointToStart(Block *block)
Sets the insertion point to the start of the specified block.
Definition Builders.h:431
void setInsertionPointAfter(Operation *op)
Sets the insertion point to the node after the specified operation, which will cause subsequent inser...
Definition Builders.h:412
Operation is the basic unit of execution within MLIR.
Definition Operation.h:88
MutableArrayRef< Region > getRegions()
Returns the regions held by this operation.
Definition Operation.h:677
This class contains a list of basic blocks and a link to the parent operation it is attached to.
Definition Region.h:26
Block & front()
Definition Region.h:65
unsigned getRegionNumber()
Return the number of this region in the parent operation.
Definition Region.cpp:62
Operation * getParentOp()
Return the parent operation this region is attached to.
Definition Region.h:200
BlockListType & getBlocks()
Definition Region.h:45
bool hasOneBlock()
Return true if this region has exactly one block.
Definition Region.h:68
RetT walk(FnT &&callback)
Walk all nested operations, blocks or regions (including this region), depending on the type of callb...
Definition Region.h:285
void append(ValueRange valueRange)
Add new operands that are forwarded to the successor.
unsigned size() const
Returns the amount of operands passed to the successor.
This class represents an instance of an SSA value in the MLIR system, representing a computable value...
Definition Value.h:96
bool use_empty() const
Returns true if this value has no uses.
Definition Value.h:208
use_range getUses() const
Returns a range of all uses, which is useful for iterating over all uses.
Definition Value.h:188
Block * getParentBlock()
Return the Block in which this Value is defined.
Definition Value.cpp:46
user_range getUsers() const
Definition Value.h:218
Location getLoc() const
Return the location of this value.
Definition Value.cpp:24
Operation * getDefiningOp() const
If this value is the result of an operation, return the operation that defines it.
Definition Value.cpp:18
Region * getParentRegion()
Return the Region in which this Value is defined.
Definition Value.cpp:39
DomTree & getDomTree(Region *region) const
Definition Dominance.h:101
detail::InFlightRemark failed(Location loc, RemarkOpts opts)
Report an optimization remark that failed.
Definition Remarks.h:561
Include the generated interface declarations.
const FrozenRewritePatternSet GreedyRewriteConfig bool * changed
SetVector< Block * > getBlocksSortedByDominance(Region &region)
Gets a list of blocks that is sorted according to dominance.
llvm::SetVector< T, Vector, Set, N > SetVector
Definition LLVM.h:131
llvm::DenseMap< KeyT, ValueT, KeyInfoT, BucketT > DenseMap
Definition LLVM.h:126
LogicalResult tryToPromoteMemorySlots(ArrayRef< PromotableAllocationOpInterface > allocators, OpBuilder &builder, const DataLayout &dataLayout, DominanceInfo &dominance, Mem2RegStatistics statistics={})
Attempts to promote the memory slots of the provided allocators.
Definition Mem2Reg.cpp:644
void getForwardSlice(Operation *op, SetVector< Operation * > *forwardSlice, const ForwardSliceOptions &options={})
Fills forwardSlice with the computed forward slice (i.e.
Statistics collected while applying mem2reg.
Definition Mem2Reg.h:18
llvm::Statistic * promotedAmount
Total amount of memory slots promoted.
Definition Mem2Reg.h:20
llvm::Statistic * newBlockArgumentAmount
Total amount of new block arguments inserted in blocks.
Definition Mem2Reg.h:22
Represents a slot in memory.
Value ptr
Pointer to the memory slot, used by operations to refer to it.
Type elemType
Type of the value contained in the slot.