MLIR
20.0.0git
Namespaces | |
matcher | |
Classes | |
struct | LoopReduction |
A description of a (parallelizable) reduction in an affine loop. More... | |
struct | MemRefAccess |
Encapsulates a memref load or store access information. More... | |
struct | DependenceComponent |
struct | DependenceResult |
Checks whether two accesses to the same memref access the same element. More... | |
class | FlatAffineValueConstraints |
FlatAffineValueConstraints is an extension of FlatLinearValueConstraints with helper functions for Affine dialect ops. More... | |
class | FlatAffineRelation |
A FlatAffineRelation represents a set of ordered pairs (domain -> range) where "domain" and "range" are tuples of variables. More... | |
class | NestedMatch |
A NestedPattern captures nested patterns in the IR. More... | |
class | NestedPattern |
class | NestedPatternContext |
RAII structure to transparently manage the bump allocator for NestedPattern and NestedMatch classes. More... | |
struct | LoopNestStateCollector |
struct | MemRefDependenceGraph |
struct | SliceComputationResult |
Enumerates different result statuses of slice computation by computeSliceUnion More... | |
struct | ComputationSliceState |
ComputationSliceState aggregates loop IVs, loop bound AffineMaps and their associated operands for a set of loops within a loop nest (typically the set of loops surrounding a store operation). More... | |
struct | MemRefRegion |
A region of a memref's data space; this is typically constructed by analyzing load/store op's on this memref and the index space of loops surrounding such op's. More... | |
class | AffineDmaStartOp |
AffineDmaStartOp starts a non-blocking DMA operation that transfers data from a source memref to a destination memref. More... | |
class | AffineDmaWaitOp |
AffineDmaWaitOp blocks until the completion of a DMA operation associated with the tag element 'tag[index]'. More... | |
class | AffineBound |
AffineBound represents a lower or upper bound in the for operation. More... | |
class | AffineValueMap |
An AffineValueMap is an affine map plus its ML value operands and results for analysis purposes. More... | |
struct | FusionResult |
class | FusionStrategy |
Describes the fusion strategy to be used in the Affine loop fusion utilities. More... | |
struct | LoopNestStats |
LoopNestStats aggregates various per-loop statistics (eg. More... | |
struct | AffineCopyOptions |
Explicit copy / DMA generation options for mlir::affineDataCopyGenerate. More... | |
struct | CopyGenerateResult |
Result for calling generateCopyForMemRegion. More... | |
struct | VectorizationStrategy |
Holds parameters to perform n-D vectorization on a single loop nest. More... | |
struct | DivModValue |
Holds the result of (div a, b) and (mod a, b). More... | |
struct | AffineValueExpr |
struct | AffineBuilder |
Helper struct to build simple AffineValueExprs with minimal type inference support. More... | |
Typedefs | |
using | VectorizableLoopFun = std::function< bool(AffineForOp)> |
using | FilterFunctionType = std::function< bool(Operation &)> |
A NestedPattern is a nested operation walker that: More... | |
using | ReductionLoopMap = DenseMap< Operation *, SmallVector< LoopReduction, 2 > > |
Enumerations | |
enum | FusionMode { Greedy , ProducerConsumer , Sibling } |
Fusion mode to attempt. More... | |
Functions | |
void | getSupportedReductions (AffineForOp forOp, SmallVectorImpl< LoopReduction > &supportedReductions) |
Populate supportedReductions with descriptors of the supported reductions. More... | |
bool | isLoopParallel (AffineForOp forOp, SmallVectorImpl< LoopReduction > *parallelReductions=nullptr) |
Returns true if ‘forOp’ is a parallel loop. More... | |
bool | isLoopMemoryParallel (AffineForOp forOp) |
Returns true if ‘forOp’ doesn't have memory dependences preventing parallelization. More... | |
void | getReachableAffineApplyOps (ArrayRef< Value > operands, SmallVectorImpl< Operation * > &affineApplyOps) |
Returns in affineApplyOps, the sequence of those AffineApplyOp Operations that are reachable via a search starting from operands and ending at those operands that are not the result of an AffineApplyOp. More... | |
LogicalResult | getIndexSet (MutableArrayRef< Operation * > ops, FlatAffineValueConstraints *domain) |
Builds a system of constraints with dimensional variables corresponding to the loop IVs of the forOps and AffineIfOp's operands appearing in that order. More... | |
DependenceResult | checkMemrefAccessDependence (const MemRefAccess &srcAccess, const MemRefAccess &dstAccess, unsigned loopDepth, FlatAffineValueConstraints *dependenceConstraints=nullptr, SmallVector< DependenceComponent, 2 > *dependenceComponents=nullptr, bool allowRAR=false) |
bool | hasDependence (DependenceResult result) |
Utility function that returns true if the provided DependenceResult corresponds to a dependence result. More... | |
bool | noDependence (DependenceResult result) |
Returns true if the provided DependenceResult corresponds to the absence of a dependence. More... | |
void | getDependenceComponents (AffineForOp forOp, unsigned maxLoopDepth, std::vector< SmallVector< DependenceComponent, 2 >> *depCompsVec) |
Returns in 'depCompsVec', dependence components for dependences between all load and store ops in loop nest rooted at 'forOp', at loop depths in range [1, maxLoopDepth]. More... | |
LogicalResult | getRelationFromMap (AffineMap &map, presburger::IntegerRelation &rel) |
Builds a relation from the given AffineMap/AffineValueMap map, containing all pairs of the form operands -> result that satisfy map. More... | |
LogicalResult | getRelationFromMap (const AffineValueMap &map, presburger::IntegerRelation &rel) |
void | getTripCountMapAndOperands (AffineForOp forOp, AffineMap *map, SmallVectorImpl< Value > *operands) |
Returns the trip count of the loop as an affine map with its corresponding operands if the latter is expressible as an affine expression, and nullptr otherwise. More... | |
std::optional< uint64_t > | getConstantTripCount (AffineForOp forOp) |
Returns the trip count of the loop if it's a constant, std::nullopt otherwise. More... | |
uint64_t | getLargestDivisorOfTripCount (AffineForOp forOp) |
Returns the greatest known integral divisor of the trip count. More... | |
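The two trip-count entries above can be illustrated with a small model in plain C++. This is only a sketch, not the MLIR implementation: `ConstantLoopBounds`, `constantTripCount`, and `largestCommonDivisor` are hypothetical names, and the model assumes constant bounds, an exclusive upper bound, and a positive step, as in `affine.for`.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a loop `for (i = lb; i < ub; i += step)`.
// This is not an MLIR type.
struct ConstantLoopBounds {
  int64_t lb;
  int64_t ub;   // exclusive upper bound
  int64_t step; // assumed positive
};

// Trip count = ceil((ub - lb) / step), or 0 when the range is empty.
uint64_t constantTripCount(const ConstantLoopBounds &loop) {
  assert(loop.step > 0 && "step must be positive");
  if (loop.ub <= loop.lb)
    return 0;
  int64_t span = loop.ub - loop.lb;
  return static_cast<uint64_t>((span + loop.step - 1) / loop.step);
}

// When the real trip count is the minimum of several constant candidates
// (assumption: one candidate per upper-bound expression), the gcd of the
// candidates divides the trip count whichever candidate wins.
uint64_t largestCommonDivisor(const std::vector<uint64_t> &candidates) {
  uint64_t divisor = 0; // gcd(0, x) == x
  for (uint64_t candidate : candidates)
    divisor = std::gcd(divisor, candidate);
  return divisor;
}
```

A divisor obtained this way is a safe unroll factor in the model, since no cleanup iterations are left over.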
template<typename LoadOrStoreOp > | |
bool | isInvariantAccess (LoadOrStoreOp memOp, AffineForOp forOp) |
Checks whether an affine read or write operation is independent of forOp's IV, i.e., whether the memory access is invariant with respect to forOp. More... | |
DenseSet< Value, DenseMapInfo< Value > > | getInvariantAccesses (Value iv, ArrayRef< Value > indices) |
Given an induction variable iv of type AffineForOp and indices of type IndexType, returns the set of indices that are independent of iv. More... | |
template<typename LoadOrStoreOp > | |
bool | isContiguousAccess (Value iv, LoadOrStoreOp memoryOp, int *memRefDim) |
Given: More... | |
bool | isVectorizableLoopBody (AffineForOp loop, NestedPattern &vectorTransferMatcher) |
Checks whether the loop is structurally vectorizable; i.e. More... | |
bool | isVectorizableLoopBody (AffineForOp loop, int *memRefDim, NestedPattern &vectorTransferMatcher) |
Checks whether the loop is structurally vectorizable and that all the LoadOp and StoreOp matched have access indexing functions that are either: More... | |
bool | isOpwiseShiftValid (AffineForOp forOp, ArrayRef< uint64_t > shifts) |
Checks whether SSA dominance would be violated if a for op's body operations are shifted by the specified shifts. More... | |
bool | isTilingValid (ArrayRef< AffineForOp > loops) |
Checks whether hyper-rectangular loop tiling of the nest represented by loops is valid. More... | |
bool | defaultFilterFunction (Operation &) |
void | getAffineForIVs (Operation &op, SmallVectorImpl< AffineForOp > *loops) |
Populates 'loops' with IVs of the affine.for ops surrounding 'op' ordered from the outermost 'affine.for' operation to the innermost one while not traversing outside of the surrounding affine scope. More... | |
void | getAffineIVs (Operation &op, SmallVectorImpl< Value > &ivs) |
Populates 'ivs' with IVs of the surrounding affine.for and affine.parallel ops ordered from the outermost one to the innermost while not traversing outside of the surrounding affine scope. More... | |
void | getEnclosingAffineOps (Operation &op, SmallVectorImpl< Operation * > *ops) |
Populates 'ops' with affine operations enclosing op ordered from outermost to innermost while stopping at the boundary of the affine scope. More... | |
unsigned | getNestingDepth (Operation *op) |
Returns the nesting depth of this operation, i.e., the number of loops surrounding this operation. More... | |
bool | isLoopParallelAndContainsReduction (AffineForOp forOp) |
Returns whether a loop is a parallel loop and contains a reduction loop. More... | |
void | getSequentialLoops (AffineForOp forOp, llvm::SmallDenseSet< Value, 8 > *sequentialLoops) |
Returns in 'sequentialLoops' all sequential loops in loop nest rooted at 'forOp'. More... | |
void | getComputationSliceState (Operation *depSourceOp, Operation *depSinkOp, FlatAffineValueConstraints *dependenceConstraints, unsigned loopDepth, bool isBackwardSlice, ComputationSliceState *sliceState) |
Computes the computation slice loop bounds for one loop nest as affine maps of the other loop nest's IVs and symbols, using 'dependenceConstraints' computed between 'depSourceAccess' and 'depSinkAccess'. More... | |
uint64_t | getSliceIterationCount (const llvm::SmallDenseMap< Operation *, uint64_t, 8 > &sliceTripCountMap) |
Return the number of iterations for the sliceTripCountMap provided. More... | |
bool | buildSliceTripCountMap (const ComputationSliceState &slice, llvm::SmallDenseMap< Operation *, uint64_t, 8 > *tripCountMap) |
Builds a map 'tripCountMap' from AffineForOp to constant trip count for the loop nest represented by the slice loop bounds in 'slice'. More... | |
SliceComputationResult | computeSliceUnion (ArrayRef< Operation * > opsA, ArrayRef< Operation * > opsB, unsigned loopDepth, unsigned numCommonLoops, bool isBackwardSlice, ComputationSliceState *sliceUnion) |
Computes in 'sliceUnion' the union of all slice bounds computed at 'loopDepth' between all dependent pairs of ops in 'opsA' and 'opsB', and then verifies if it is valid. More... | |
AffineForOp | insertBackwardComputationSlice (Operation *srcOpInst, Operation *dstOpInst, unsigned dstLoopDepth, ComputationSliceState *sliceState) |
Creates a clone of the computation contained in the loop nest surrounding 'srcOpInst', slices the iteration space of src loop based on slice bounds in 'sliceState', and inserts the computation slice at the beginning of the operation block of the loop at 'dstLoopDepth' in the loop nest surrounding 'dstOpInst'. More... | |
std::optional< uint64_t > | getIntOrFloatMemRefSizeInBytes (MemRefType memRefType) |
Returns the size of a memref with element type int or float in bytes if it's statically shaped, std::nullopt otherwise. More... | |
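The size computation described for getIntOrFloatMemRefSizeInBytes can be sketched without MLIR types. The names `kDynamicDim` and `staticSizeInBytes` are hypothetical, and rounding sub-byte element widths up to a whole byte is an assumption of this sketch rather than a statement about the library.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical marker for a dimension whose extent is not known statically.
constexpr int64_t kDynamicDim = -1;

// Returns the byte size of a statically shaped buffer, or std::nullopt if
// any dimension is dynamic. Element widths are rounded up to whole bytes
// (assumption of this sketch).
std::optional<uint64_t> staticSizeInBytes(const std::vector<int64_t> &shape,
                                          unsigned elementBitWidth) {
  assert(elementBitWidth > 0 && "element type must have a nonzero width");
  uint64_t numElements = 1;
  for (int64_t dim : shape) {
    if (dim == kDynamicDim)
      return std::nullopt;
    numElements *= static_cast<uint64_t>(dim);
  }
  uint64_t bytesPerElement = (elementBitWidth + 7) / 8;
  return numElements * bytesPerElement;
}
```

For a 4x8 buffer of 32-bit elements the model yields 128 bytes; a shape containing `kDynamicDim` yields no value.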
template<typename LoadOrStoreOpPointer > | |
LogicalResult | boundCheckLoadOrStoreOp (LoadOrStoreOpPointer loadOrStoreOp, bool emitError=true) |
Checks a load or store op for an out of bound access; returns failure if the access is out of bounds along any of the dimensions, success otherwise. More... | |
unsigned | getNumCommonSurroundingLoops (Operation &a, Operation &b) |
Returns the number of surrounding loops common to both A and B. More... | |
std::optional< int64_t > | getMemoryFootprintBytes (AffineForOp forOp, int memorySpace=-1) |
Gets the memory footprint of all data touched in the specified memory space in bytes; if the memory space is unspecified, considers all memory spaces. More... | |
std::optional< int64_t > | getMemRefIntOrFloatEltSizeInBytes (MemRefType memRefType) |
Returns the memref's element type's size in bytes where the elemental type is an int or float or a vector of such types. More... | |
IntegerSet | simplifyIntegerSet (IntegerSet set) |
Simplify the integer set by simplifying the underlying affine expressions by flattening and some simple inference. More... | |
unsigned | getInnermostCommonLoopDepth (ArrayRef< Operation * > ops, SmallVectorImpl< AffineForOp > *surroundingLoops=nullptr) |
Returns the innermost common loop depth for the set of operations in 'ops'. More... | |
FailureOr< AffineValueMap > | simplifyConstrainedMinMaxOp (Operation *op, FlatAffineValueConstraints constraints) |
Try to simplify the given affine.min or affine.max op to an affine map with a single result and operands, taking into account the specified constraint set. More... | |
bool | isTopLevelValue (Value value) |
A utility function to check if a value is defined at the top level of an op with trait AffineScope or is a region argument for such an op. More... | |
bool | isTopLevelValue (Value value, Region *region) |
A utility function to check if a value is defined at the top level of region or is an argument of region. More... | |
Region * | getAffineScope (Operation *op) |
Returns the closest region enclosing op that is held by an operation with trait AffineScope; nullptr if there is no such region. More... | |
bool | isValidDim (Value value) |
Returns true if the given Value can be used as a dimension id in the region of the closest surrounding op that has the trait AffineScope. More... | |
bool | isValidDim (Value value, Region *region) |
Returns true if the given Value can be used as a dimension id in region, i.e., for all its uses in region. More... | |
bool | isValidSymbol (Value value) |
Returns true if the given value can be used as a symbol in the region of the closest surrounding op that has the trait AffineScope. More... | |
bool | isValidSymbol (Value value, Region *region) |
Returns true if the given Value can be used as a symbol for region, i.e., for all its uses in region. More... | |
ParseResult | parseDimAndSymbolList (OpAsmParser &parser, SmallVectorImpl< Value > &operands, unsigned &numDims) |
Parses dimension and symbol list. More... | |
void | canonicalizeMapAndOperands (AffineMap *map, SmallVectorImpl< Value > *operands) |
Modifies both map and operands in-place so as to: More... | |
void | canonicalizeSetAndOperands (IntegerSet *set, SmallVectorImpl< Value > *operands) |
Canonicalizes an integer set the same way canonicalizeMapAndOperands does for affine maps. More... | |
AffineApplyOp | makeComposedAffineApply (OpBuilder &b, Location loc, AffineMap map, ArrayRef< OpFoldResult > operands) |
Returns a composed AffineApplyOp by composing map and operands with other AffineApplyOps supplying those operands. More... | |
AffineApplyOp | makeComposedAffineApply (OpBuilder &b, Location loc, AffineExpr e, ArrayRef< OpFoldResult > operands) |
OpFoldResult | makeComposedFoldedAffineApply (OpBuilder &b, Location loc, AffineMap map, ArrayRef< OpFoldResult > operands) |
Constructs an AffineApplyOp that applies map to operands after composing the map with the maps of any other AffineApplyOp supplying the operands, then immediately attempts to fold it. More... | |
OpFoldResult | makeComposedFoldedAffineApply (OpBuilder &b, Location loc, AffineExpr expr, ArrayRef< OpFoldResult > operands) |
Variant of makeComposedFoldedAffineApply that applies to an expression. More... | |
SmallVector< OpFoldResult > | makeComposedFoldedMultiResultAffineApply (OpBuilder &b, Location loc, AffineMap map, ArrayRef< OpFoldResult > operands) |
Variant of makeComposedFoldedAffineApply suitable for multi-result maps. More... | |
AffineMinOp | makeComposedAffineMin (OpBuilder &b, Location loc, AffineMap map, ArrayRef< OpFoldResult > operands) |
Returns an AffineMinOp obtained by composing map and operands with AffineApplyOps supplying those operands. More... | |
OpFoldResult | makeComposedFoldedAffineMin (OpBuilder &b, Location loc, AffineMap map, ArrayRef< OpFoldResult > operands) |
Constructs an AffineMinOp that computes a minimum across the results of applying map to operands, then immediately attempts to fold it. More... | |
OpFoldResult | makeComposedFoldedAffineMax (OpBuilder &b, Location loc, AffineMap map, ArrayRef< OpFoldResult > operands) |
Constructs an AffineMaxOp that computes a maximum across the results of applying map to operands, then immediately attempts to fold it. More... | |
void | fullyComposeAffineMapAndOperands (AffineMap *map, SmallVectorImpl< Value > *operands) |
Given an affine map map and its input operands, this method composes into map, maps of AffineApplyOps whose results are the values in operands, iteratively until no more of operands are the result of an AffineApplyOp. More... | |
bool | isAffineForInductionVar (Value val) |
Returns true if the provided value is the induction variable of an AffineForOp. More... | |
bool | isAffineParallelInductionVar (Value val) |
Returns true if val is the induction variable of an AffineParallelOp. More... | |
bool | isAffineInductionVar (Value val) |
Returns true if the provided value is the induction variable of an AffineForOp or AffineParallelOp. More... | |
AffineForOp | getForInductionVarOwner (Value val) |
Returns the loop parent of an induction variable. More... | |
AffineParallelOp | getAffineParallelInductionVarOwner (Value val) |
Returns the AffineParallelOp whose induction variables include the provided value. More... | |
void | extractForInductionVars (ArrayRef< AffineForOp > forInsts, SmallVectorImpl< Value > *ivs) |
Extracts the induction variables from a list of AffineForOps and places them in the output argument ivs. More... | |
void | extractInductionVars (ArrayRef< Operation * > affineOps, SmallVectorImpl< Value > &ivs) |
Extracts the induction variables from a list of either AffineForOp or AffineParallelOp and places them in the output argument ivs. More... | |
void | buildAffineLoopNest (OpBuilder &builder, Location loc, ArrayRef< int64_t > lbs, ArrayRef< int64_t > ubs, ArrayRef< int64_t > steps, function_ref< void(OpBuilder &, Location, ValueRange)> bodyBuilderFn=nullptr) |
Builds a perfect nest of affine.for loops, i.e., each loop except the innermost one contains only another loop and a terminator. More... | |
void | buildAffineLoopNest (OpBuilder &builder, Location loc, ValueRange lbs, ValueRange ubs, ArrayRef< int64_t > steps, function_ref< void(OpBuilder &, Location, ValueRange)> bodyBuilderFn=nullptr) |
void | registerValueBoundsOpInterfaceExternalModels (DialectRegistry &registry) |
FailureOr< int64_t > | fullyComposeAndComputeConstantDelta (Value value1, Value value2) |
Compute a constant delta of the given two values. More... | |
FusionResult | canFuseLoops (AffineForOp srcForOp, AffineForOp dstForOp, unsigned dstLoopDepth, ComputationSliceState *srcSlice, FusionStrategy fusionStrategy=FusionStrategy::Generic) |
Checks the feasibility of fusing the loop nest rooted at 'srcForOp' into the loop nest rooted at 'dstForOp' at 'dstLoopDepth'. More... | |
void | fuseLoops (AffineForOp srcForOp, AffineForOp dstForOp, const ComputationSliceState &srcSlice, bool isInnermostSiblingInsertionFusion=false) |
Fuses 'srcForOp' into 'dstForOp' with destination loop block insertion point and source slice loop bounds specified in 'srcSlice'. More... | |
bool | getLoopNestStats (AffineForOp forOp, LoopNestStats *stats) |
Collect loop nest statistics (eg. More... | |
int64_t | getComputeCost (AffineForOp forOp, LoopNestStats &stats) |
Computes the total cost of the loop nest rooted at 'forOp' using 'stats'. More... | |
bool | getFusionComputeCost (AffineForOp srcForOp, LoopNestStats &srcStats, AffineForOp dstForOp, LoopNestStats &dstStats, const ComputationSliceState &slice, int64_t *computeCost) |
Computes and returns in 'computeCost', the total compute cost of fusing the 'slice' of the loop nest rooted at 'srcForOp' into 'dstForOp'. More... | |
void | gatherProducerConsumerMemrefs (ArrayRef< Operation * > srcOps, ArrayRef< Operation * > dstOps, DenseSet< Value > &producerConsumerMemrefs) |
Returns in 'producerConsumerMemrefs' the memrefs involved in a producer-consumer dependence between write ops in 'srcOps' and read ops in 'dstOps'. More... | |
LogicalResult | loopUnrollFull (AffineForOp forOp) |
Unrolls this for operation completely if the trip count is known to be constant. More... | |
LogicalResult | loopUnrollByFactor (AffineForOp forOp, uint64_t unrollFactor, function_ref< void(unsigned, Operation *, OpBuilder)> annotateFn=nullptr, bool cleanUpUnroll=false) |
Unrolls this for operation by the specified unroll factor. More... | |
LogicalResult | loopUnrollUpToFactor (AffineForOp forOp, uint64_t unrollFactor) |
Unrolls this loop by the specified unroll factor or its trip count, whichever is lower. More... | |
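The unrolling entries above describe a main loop whose body is replicated, plus handling for leftover iterations. The following plain C++ sketch shows that shape on an ordinary summation loop. It is an illustration under assumptions, not the MLIR transformation: `effectiveUnrollFactor` and `sumUnrolledBy4` are hypothetical names, and the fixed factor of 4 is chosen only for readability.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Models "the specified unroll factor or its trip count, whichever is lower".
uint64_t effectiveUnrollFactor(uint64_t requested, uint64_t tripCount) {
  assert(requested > 0 && "unroll factor must be positive");
  return std::min(requested, tripCount);
}

// Sums `data` with the loop body replicated four times per iteration of the
// main loop, followed by a cleanup loop for the remaining iterations.
int64_t sumUnrolledBy4(const std::vector<int64_t> &data) {
  int64_t sum = 0;
  std::size_t n = data.size();
  std::size_t mainEnd = n - n % 4; // last index covered by the unrolled loop
  std::size_t i = 0;
  for (; i < mainEnd; i += 4) {
    sum += data[i];
    sum += data[i + 1];
    sum += data[i + 2];
    sum += data[i + 3];
  }
  for (; i < n; ++i) // cleanup loop: at most 3 iterations
    sum += data[i];
  return sum;
}
```

When the trip count is a multiple of the factor, the cleanup loop runs zero times, which is why a known divisor of the trip count is a convenient unroll factor.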
bool LLVM_ATTRIBUTE_UNUSED | isPerfectlyNested (ArrayRef< AffineForOp > loops) |
Returns true if loops is a perfectly nested loop nest, where loops appear in it from outermost to innermost. More... | |
void | getPerfectlyNestedLoops (SmallVectorImpl< AffineForOp > &nestedLoops, AffineForOp root) |
Get perfectly nested sequence of loops starting at root of loop nest (the first op being another AffineFor, and the second op - a terminator). More... | |
LogicalResult | loopUnrollJamByFactor (AffineForOp forOp, uint64_t unrollJamFactor) |
Unrolls and jams this loop by the specified factor. More... | |
LogicalResult | loopUnrollJamUpToFactor (AffineForOp forOp, uint64_t unrollJamFactor) |
Unrolls and jams this loop by the specified factor or by the trip count (if constant), whichever is lower. More... | |
LogicalResult | promoteIfSingleIteration (AffineForOp forOp) |
Promotes the loop body of an AffineForOp to its containing block if the loop was known to have a single iteration. More... | |
void | promoteSingleIterationLoops (func::FuncOp f) |
Promotes all single iteration AffineForOp's in the Function, i.e., moves their body into the containing Block. More... | |
LogicalResult | affineForOpBodySkew (AffineForOp forOp, ArrayRef< uint64_t > shifts, bool unrollPrologueEpilogue=false) |
Skew the operations in an affine.for's body with the specified operation-wise shifts. More... | |
void | getTileableBands (func::FuncOp f, std::vector< SmallVector< AffineForOp, 6 >> *bands) |
Identify valid and profitable bands of loops to tile. More... | |
LogicalResult | tilePerfectlyNested (MutableArrayRef< AffineForOp > input, ArrayRef< unsigned > tileSizes, SmallVectorImpl< AffineForOp > *tiledNest=nullptr) |
Tiles the specified band of perfectly nested loops creating tile-space loops and intra-tile loops. More... | |
LogicalResult | tilePerfectlyNestedParametric (MutableArrayRef< AffineForOp > input, ArrayRef< Value > tileSizes, SmallVectorImpl< AffineForOp > *tiledNest=nullptr) |
Tiles the specified band of perfectly nested loops creating tile-space loops and intra-tile loops, using SSA values as tiling parameters. More... | |
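The split into tile-space loops and intra-tile loops mentioned in the two tiling entries above can be pictured with a 2-D loop nest in plain C++. This is a minimal sketch with constant tile sizes; `tiledIterationOrder` is a hypothetical helper that records the visiting order and is not part of MLIR.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Visits every point of an n x m iteration space in tiled order and returns
// the sequence of (i, j) points visited.
std::vector<std::pair<int, int>> tiledIterationOrder(int n, int m, int tileI,
                                                     int tileJ) {
  assert(tileI > 0 && tileJ > 0 && "tile sizes must be positive");
  std::vector<std::pair<int, int>> order;
  // Tile-space loops: step from one tile origin to the next.
  for (int ii = 0; ii < n; ii += tileI)
    for (int jj = 0; jj < m; jj += tileJ)
      // Intra-tile loops: the min() clamps partial tiles at the boundary.
      for (int i = ii; i < std::min(ii + tileI, n); ++i)
        for (int j = jj; j < std::min(jj + tileJ, m); ++j)
          order.emplace_back(i, j);
  return order;
}
```

Every point is still visited exactly once; only the order changes, so that points in the same tile are visited close together.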
void | interchangeLoops (AffineForOp forOpA, AffineForOp forOpB) |
Performs loop interchange on 'forOpA' and 'forOpB'. More... | |
bool | isValidLoopInterchangePermutation (ArrayRef< AffineForOp > loops, ArrayRef< unsigned > loopPermMap) |
Checks if the loop interchange permutation 'loopPermMap', of the perfectly nested sequence of loops in 'loops', would violate dependences (loop 'i' in 'loops' is mapped to location 'j = 'loopPermMap[i]' in the interchange). More... | |
unsigned | permuteLoops (ArrayRef< AffineForOp > inputNest, ArrayRef< unsigned > permMap) |
Performs a loop permutation on a perfectly nested loop nest inputNest (where the contained loops appear from outer to inner) as specified by the permutation permMap: loop 'i' in inputNest is mapped to location 'loopPermMap[i]', where positions 0, 1, ... More... | |
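The mapping convention stated above (loop 'i' moves to position permMap[i], with positions counted from the outermost loop) is easy to get backwards, so a small model may help. The sketch below uses loop names as strings; `applyLoopPermutation` is a hypothetical helper and does not check dependences the way isValidLoopInterchangePermutation is described to.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Places nest[i] at position permMap[i] of the result. Both vectors list
// loops from outermost to innermost. permMap is assumed to be a permutation
// of 0..n-1.
std::vector<std::string>
applyLoopPermutation(const std::vector<std::string> &nest,
                     const std::vector<unsigned> &permMap) {
  assert(nest.size() == permMap.size() && "one target position per loop");
  std::vector<std::string> result(nest.size());
  for (std::size_t i = 0; i < nest.size(); ++i) {
    assert(permMap[i] < nest.size() && "position out of range");
    result[permMap[i]] = nest[i];
  }
  return result;
}
```

With the nest (i, j, k) and the map {2, 0, 1}, loop i moves to the innermost position and the resulting order is (j, k, i).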
AffineForOp | sinkSequentialLoops (AffineForOp forOp) |
SmallVector< SmallVector< AffineForOp, 8 >, 8 > | tile (ArrayRef< AffineForOp > forOps, ArrayRef< uint64_t > sizes, ArrayRef< AffineForOp > targets) |
Performs tiling of imperfectly nested loops (with interchange) by strip-mining the forOps by sizes and sinking them, in their order of occurrence in forOps, under each of the targets. More... | |
SmallVector< AffineForOp, 8 > | tile (ArrayRef< AffineForOp > forOps, ArrayRef< uint64_t > sizes, AffineForOp target) |
Performs tiling (with interchange) by strip-mining the forOps by sizes and sinking them, in their order of occurrence in forOps, under target. More... | |
LogicalResult | affineDataCopyGenerate (Block::iterator begin, Block::iterator end, const AffineCopyOptions &copyOptions, std::optional< Value > filterMemRef, DenseSet< Operation * > &copyNests) |
Performs explicit copying for the contiguous sequence of operations in the block iterator range [‘begin’, ‘end’), where ‘end’ can't be past the terminator of the block (since additional operations are potentially inserted right before end . More... | |
LogicalResult | affineDataCopyGenerate (AffineForOp forOp, const AffineCopyOptions &copyOptions, std::optional< Value > filterMemRef, DenseSet< Operation * > &copyNests) |
A convenience version of affineDataCopyGenerate for all ops in the body of an AffineForOp. More... | |
LogicalResult | generateCopyForMemRegion (const MemRefRegion &memrefRegion, Operation *analyzedOp, const AffineCopyOptions &copyOptions, CopyGenerateResult &result) |
generateCopyForMemRegion is similar to affineDataCopyGenerate, but works with a single memref region. More... | |
LogicalResult | coalesceLoops (MutableArrayRef< AffineForOp > loops) |
Replace a perfect nest of "for" loops with a single linearized loop. More... | |
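Replacing a perfect nest with a single linearized loop relies on being able to recover the original indices from the linear one. One common way to do that, sketched below in plain C++ for a two-deep nest, is a division and a remainder by the inner trip count. The names `DivMod`, `linearize`, and `delinearize` are hypothetical, and the sketch assumes both loops are already normalized to start at 0 with step 1.

```cpp
#include <cassert>
#include <cstdint>

// Quotient and remainder of one division, in the spirit of a div/mod pair.
struct DivMod {
  int64_t quotient;
  int64_t remainder;
};

// Maps the pair (outer, inner) to a single linear index.
int64_t linearize(int64_t outer, int64_t inner, int64_t innerTripCount) {
  assert(innerTripCount > 0 && inner >= 0 && inner < innerTripCount);
  return outer * innerTripCount + inner;
}

// Recovers (outer, inner) from a linear index: outer is the quotient and
// inner is the remainder of dividing by the inner trip count.
DivMod delinearize(int64_t linear, int64_t innerTripCount) {
  assert(innerTripCount > 0 && linear >= 0);
  return {linear / innerTripCount, linear % innerTripCount};
}
```

A nest with trip counts 4 and 3 then becomes one loop of 12 iterations whose body calls `delinearize(k, 3)` to obtain the original pair of indices.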
void | mapLoopToProcessorIds (scf::ForOp forOp, ArrayRef< Value > processorId, ArrayRef< Value > numProcessors) |
Maps forOp for execution on a parallel grid of virtual processorIds of size given by numProcessors. More... | |
void | gatherLoops (func::FuncOp func, std::vector< SmallVector< AffineForOp, 2 >> &depthToLoops) |
Gathers all AffineForOps in 'func.func' grouped by loop depth. More... | |
AffineForOp | createCanonicalizedAffineForOp (OpBuilder b, Location loc, ValueRange lbOperands, AffineMap lbMap, ValueRange ubOperands, AffineMap ubMap, int64_t step=1) |
Creates an AffineForOp while ensuring that the lower and upper bounds are canonicalized, i.e., unused and duplicate operands are removed, any constant operands propagated/folded in, and duplicate bound maps dropped. More... | |
LogicalResult | separateFullTiles (MutableArrayRef< AffineForOp > nest, SmallVectorImpl< AffineForOp > *fullTileNest=nullptr) |
Separates full tiles from partial tiles for a perfect nest nest by generating a conditional guard that selects between the full tile version and the partial tile version using an AffineIfOp. More... | |
LogicalResult | coalescePerfectlyNestedAffineLoops (AffineForOp op) |
Walk an affine.for to find a band to coalesce. More... | |
int64_t | numEnclosingInvariantLoops (OpOperand &operand) |
Count the number of loops surrounding operand such that operand could be hoisted above. More... | |
std::unique_ptr< OperationPass< func::FuncOp > > | createSimplifyAffineStructuresPass () |
Creates a simplification pass for affine structures (maps and sets). More... | |
std::unique_ptr< OperationPass< func::FuncOp > > | createAffineLoopInvariantCodeMotionPass () |
Creates a loop invariant code motion pass that hoists loop invariant operations out of affine loops. More... | |
std::unique_ptr< OperationPass< func::FuncOp > > | createAffineParallelizePass () |
Creates a pass to convert all parallel affine.for's into 1-d affine.parallel ops. More... | |
std::unique_ptr< OperationPass< func::FuncOp > > | createAffineLoopNormalizePass (bool promoteSingleIter=false) |
Apply normalization transformations to affine loop-like ops. More... | |
std::unique_ptr< OperationPass< func::FuncOp > > | createAffineDataCopyGenerationPass (unsigned slowMemorySpace, unsigned fastMemorySpace, unsigned tagMemorySpace=0, int minDmaTransferSize=1024, uint64_t fastMemCapacityBytes=std::numeric_limits< uint64_t >::max()) |
Performs packing (or explicit copying) of accessed memref regions into buffers in the specified faster memory space through either pointwise copies or DMA operations. More... | |
std::unique_ptr< OperationPass< func::FuncOp > > | createAffineDataCopyGenerationPass () |
Overload relying on pass options for initialization. More... | |
std::unique_ptr< OperationPass< func::FuncOp > > | createAffineScalarReplacementPass () |
Creates a pass to replace affine memref accesses by scalars using store to load forwarding and redundant load elimination; consequently also eliminate dead allocs. More... | |
std::unique_ptr< OperationPass< func::FuncOp > > | createLoopCoalescingPass () |
Creates a pass that transforms perfectly nested loops with independent bounds into a single loop. More... | |
std::unique_ptr< Pass > | createLoopFusionPass (unsigned fastMemorySpace=0, uint64_t localBufSizeThreshold=0, bool maximalFusion=false, enum FusionMode fusionMode=FusionMode::Greedy) |
Creates a loop fusion pass which fuses affine loop nests at the top-level of the operation the pass is created on according to the type of fusion specified in fusionMode. More... | |
std::unique_ptr< OperationPass< func::FuncOp > > | createLoopTilingPass (uint64_t cacheSizeBytes) |
Creates a pass to perform tiling on loop nests. More... | |
std::unique_ptr< OperationPass< func::FuncOp > > | createLoopTilingPass () |
Overload relying on pass options for initialization. More... | |
std::unique_ptr< OperationPass< func::FuncOp > > | createLoopUnrollPass (int unrollFactor=-1, bool unrollUpToFactor=false, bool unrollFull=false, const std::function< unsigned(AffineForOp)> &getUnrollFactor=nullptr) |
Creates a loop unrolling pass with the provided parameters. More... | |
std::unique_ptr< OperationPass< func::FuncOp > > | createLoopUnrollAndJamPass (int unrollJamFactor=-1) |
Creates a loop unroll jam pass to unroll jam by the specified factor. More... | |
std::unique_ptr< OperationPass< func::FuncOp > > | createPipelineDataTransferPass () |
Creates a pass to pipeline explicit movement of data across levels of the memory hierarchy. More... | |
std::unique_ptr< Pass > | createAffineExpandIndexOpsPass () |
Creates a pass to expand affine index operations into more fundamental operations (not necessarily restricted to Affine dialect). More... | |
std::unique_ptr< Pass > | createAffineExpandIndexOpsAsAffinePass () |
Creates a pass to expand affine index operations into affine.apply operations. More... | |
void | registerTransformDialectExtension (DialectRegistry &registry) |
void | populateAffineExpandIndexOpsPatterns (RewritePatternSet &patterns) |
Populate patterns that expand affine index operations into more fundamental operations (not necessarily restricted to Affine dialect). More... | |
void | populateAffineExpandIndexOpsAsAffinePatterns (RewritePatternSet &patterns) |
Populate patterns that expand affine index operations into their equivalent affine.apply representations. More... | |
void | reorderOperandsByHoistability (RewriterBase &rewriter, AffineApplyOp op) |
Helper function to rewrite op's affine map and reorder its operands such that they are in increasing order of hoistability (i.e. More... | |
FailureOr< AffineApplyOp > | decompose (RewriterBase &rewriter, AffineApplyOp op) |
Split an "affine.apply" operation into smaller ops. More... | |
FailureOr< OpFoldResult > | reifyValueBound (OpBuilder &b, Location loc, presburger::BoundType type, const ValueBoundsConstraintSet::Variable &var, ValueBoundsConstraintSet::StopConditionFn stopCondition, bool closedUB=false) |
Reify a bound for the given variable in terms of SSA values for which stopCondition is met. More... | |
FailureOr< OpFoldResult > | reifyIndexValueBound (OpBuilder &b, Location loc, presburger::BoundType type, Value value, ValueBoundsConstraintSet::StopConditionFn stopCondition=nullptr, bool closedUB=false) |
Reify a bound for the given index-typed value in terms of SSA values for which stopCondition is met. More... | |
FailureOr< OpFoldResult > | reifyShapedValueDimBound (OpBuilder &b, Location loc, presburger::BoundType type, Value value, int64_t dim, ValueBoundsConstraintSet::StopConditionFn stopCondition=nullptr, bool closedUB=false) |
Reify a bound for the specified dimension of the given shaped value in terms of SSA values for which stopCondition is met. More... | |
OpFoldResult | materializeComputedBound (OpBuilder &b, Location loc, AffineMap boundMap, ArrayRef< std::pair< Value, std::optional< int64_t >>> mapOperands) |
Materialize an already computed bound with Affine dialect ops. More... | |
LogicalResult | affineParallelize (AffineForOp forOp, ArrayRef< LoopReduction > parallelReductions={}, AffineParallelOp *resOp=nullptr) |
Replaces a parallel affine.for op with a 1-d affine.parallel op. More... | |
LogicalResult | hoistAffineIfOp (AffineIfOp ifOp, bool *folded=nullptr) |
Hoists out affine.if/else to as high as possible, i.e., past all invariant affine.fors/parallel's. More... | |
void | affineScalarReplace (func::FuncOp f, DominanceInfo &domInfo, PostDominanceInfo &postDomInfo, AliasAnalysis &analysis) |
Replace affine store and load accesses by scalars by forwarding stores to loads and eliminate invariant affine loads; consequently, eliminate dead allocs. More... | |
void | vectorizeAffineLoops (Operation *parentOp, llvm::DenseSet< Operation *, DenseMapInfo< Operation * >> &loops, ArrayRef< int64_t > vectorSizes, ArrayRef< int64_t > fastestVaryingPattern, const ReductionLoopMap &reductionLoops=ReductionLoopMap()) |
Vectorizes affine loops in 'loops' using the n-D vectorization factors in 'vectorSizes'. More... | |
LogicalResult | vectorizeAffineLoopNest (std::vector< SmallVector< AffineForOp, 2 >> &loops, const VectorizationStrategy &strategy) |
External utility to vectorize affine loops from a single loop nest using an n-D vectorization strategy (see doc in VectorizationStrategy definition). More... | |
void | normalizeAffineParallel (AffineParallelOp op) |
Normalize an affine.parallel op so that lower bounds are 0 and steps are 1. More... | |
LogicalResult | normalizeAffineFor (AffineForOp op, bool promoteSingleIter=false) |
Normalize an affine.for op. More... | |
AffineExpr | substWithMin (AffineExpr e, AffineExpr dim, AffineExpr min, AffineExpr max, bool positivePath=true) |
Traverse e and return an AffineExpr where all occurrences of dim have been replaced by either: More... | |
LogicalResult | replaceAllMemRefUsesWith (Value oldMemRef, Value newMemRef, ArrayRef< Value > extraIndices={}, AffineMap indexRemap=AffineMap(), ArrayRef< Value > extraOperands={}, ArrayRef< Value > symbolOperands={}, Operation *domOpFilter=nullptr, Operation *postDomOpFilter=nullptr, bool allowNonDereferencingOps=false, bool replaceInDeallocOp=false) |
Replaces all "dereferencing" uses of oldMemRef with newMemRef while optionally remapping the old memref's indices using the supplied affine map, indexRemap . More... | |
LogicalResult | replaceAllMemRefUsesWith (Value oldMemRef, Value newMemRef, Operation *op, ArrayRef< Value > extraIndices={}, AffineMap indexRemap=AffineMap(), ArrayRef< Value > extraOperands={}, ArrayRef< Value > symbolOperands={}, bool allowNonDereferencingOps=false) |
Performs the same replacement as the other version above but only for the dereferencing uses of oldMemRef in op , except in cases where 'allowNonDereferencingOps' is set to true where we replace the non-dereferencing uses as well. More... | |
LogicalResult | normalizeMemRef (memref::AllocOp *op) |
Rewrites the memref defined by this alloc op to have an identity layout map and updates all its indexing uses. More... | |
MemRefType | normalizeMemRefType (MemRefType memrefType) |
Normalizes memrefType so that the affine layout map of the memref is transformed to an identity map with a new shape being computed for the normalized memref type and returns it. More... | |
void | createAffineComputationSlice (Operation *opInst, SmallVectorImpl< AffineApplyOp > *sliceOps) |
Given an operation, inserts one or more single result affine apply operations, results of which are exclusively used by this operation. More... | |
Value | expandAffineExpr (OpBuilder &builder, Location loc, AffineExpr expr, ValueRange dimValues, ValueRange symbolValues) |
Emit code that computes the given affine expression using standard arithmetic operations applied to the provided dimension and symbol values. More... | |
std::optional< SmallVector< Value, 8 > > | expandAffineMap (OpBuilder &builder, Location loc, AffineMap affineMap, ValueRange operands) |
Create a sequence of operations that implement the affineMap applied to the given operands (as if it were an AffineApplyOp). More... | |
DivModValue | getDivMod (OpBuilder &b, Location loc, Value lhs, Value rhs) |
Create IR to calculate (div lhs, rhs) and (mod lhs, rhs). More... | |
FailureOr< SmallVector< Value > > | delinearizeIndex (OpBuilder &b, Location loc, Value linearIndex, ArrayRef< Value > basis, bool hasOuterBound=true) |
Generate the IR to delinearize linearIndex given the basis and return the multi-index. More... | |
FailureOr< SmallVector< Value > > | delinearizeIndex (OpBuilder &b, Location loc, Value linearIndex, ArrayRef< OpFoldResult > basis, bool hasOuterBound=true) |
OpFoldResult | linearizeIndex (ArrayRef< OpFoldResult > multiIndex, ArrayRef< OpFoldResult > basis, ImplicitLocOpBuilder &builder) |
OpFoldResult | linearizeIndex (OpBuilder &builder, Location loc, ArrayRef< OpFoldResult > multiIndex, ArrayRef< OpFoldResult > basis) |
template<typename EffectType , typename T > | |
bool | hasNoInterveningEffect (Operation *start, T memOp, llvm::function_ref< bool(Value, Value)> mayAlias) |
Ensure that all operations that could be executed after start (noninclusive) and prior to memOp (e.g. More... | |
LogicalResult | mergeOffsetsSizesAndStrides (OpBuilder &builder, Location loc, ArrayRef< OpFoldResult > producerOffsets, ArrayRef< OpFoldResult > producerSizes, ArrayRef< OpFoldResult > producerStrides, const llvm::SmallBitVector &droppedProducerDims, ArrayRef< OpFoldResult > consumerOffsets, ArrayRef< OpFoldResult > consumerSizes, ArrayRef< OpFoldResult > consumerStrides, SmallVector< OpFoldResult > &combinedOffsets, SmallVector< OpFoldResult > &combinedSizes, SmallVector< OpFoldResult > &combinedStrides) |
Fills the combinedOffsets , combinedSizes and combinedStrides to use when combining a producer slice into a consumer slice. More... | |
LogicalResult | mergeOffsetsSizesAndStrides (OpBuilder &builder, Location loc, OffsetSizeAndStrideOpInterface producer, OffsetSizeAndStrideOpInterface consumer, const llvm::SmallBitVector &droppedProducerDims, SmallVector< OpFoldResult > &combinedOffsets, SmallVector< OpFoldResult > &combinedSizes, SmallVector< OpFoldResult > &combinedStrides) |
Fills the combinedOffsets , combinedSizes and combinedStrides to use when combining a producer slice op into a consumer slice op. More... | |
void | resolveIndicesIntoOpWithOffsetsAndStrides (RewriterBase &rewriter, Location loc, ArrayRef< OpFoldResult > mixedSourceOffsets, ArrayRef< OpFoldResult > mixedSourceStrides, const llvm::SmallBitVector &rankReducedDims, ArrayRef< OpFoldResult > consumerIndices, SmallVectorImpl< Value > &resolvedIndices) |
Given the 'consumerIndices' of a load/store operation operating on an op with offsets and strides, return the combined indices. More... | |
void | resolveIndicesIntoOpWithOffsetsAndStrides (RewriterBase &rewriter, Location loc, ArrayRef< OpFoldResult > mixedSourceOffsets, ArrayRef< OpFoldResult > mixedSourceStrides, const llvm::SmallBitVector &rankReducedDims, ValueRange consumerIndices, SmallVectorImpl< Value > &resolvedIndices) |
void | resolveSizesIntoOpWithSizes (ArrayRef< OpFoldResult > sourceSizes, ArrayRef< OpFoldResult > destSizes, const llvm::SmallBitVector &rankReducedSourceDims, SmallVectorImpl< OpFoldResult > &resolvedSizes) |
Given sourceSizes , destSizes and information about which dimensions are dropped by the source: rankReducedSourceDims , compute the resolved sizes that correspond to dest_op(source_op). More... | |
using mlir::affine::FilterFunctionType = typedef std::function<bool(Operation &)> |
A NestedPattern is a nested operation walker that:
Nested patterns are meant to capture imperfectly nested loops while matching properties over the whole loop nest. For instance, in vectorization we are interested in capturing all the imperfectly nested loops of a certain type such that all the loads and stores have certain access patterns along the loops' induction variables. Such NestedMatches are first captured using the match function and are later processed to analyze properties and apply transformations in a non-greedy way.
The NestedMatches captured in the IR can grow large, especially after aggressive unrolling. As experience has shown, it is generally better to use a plain walk over operations to match flat patterns but the current implementation is competitive nonetheless.
Definition at line 91 of file NestedMatcher.h.
using mlir::affine::ReductionLoopMap = typedef DenseMap<Operation *, SmallVector<LoopReduction, 2> > |
using mlir::affine::VectorizableLoopFun = typedef std::function<bool(AffineForOp)> |
Definition at line 88 of file LoopAnalysis.h.
LogicalResult mlir::affine::affineDataCopyGenerate | ( | AffineForOp | forOp, |
const AffineCopyOptions & | copyOptions, | ||
std::optional< Value > | filterMemRef, | ||
DenseSet< Operation * > & | copyNests | ||
) |
A convenience version of affineDataCopyGenerate for all ops in the body of an AffineForOp.
Definition at line 2464 of file LoopUtils.cpp.
References affineDataCopyGenerate().
LogicalResult mlir::affine::affineDataCopyGenerate | ( | Block::iterator | begin, |
Block::iterator | end, | ||
const AffineCopyOptions & | copyOptions, | ||
std::optional< Value > | filterMemRef, | ||
DenseSet< Operation * > & | copyNests | ||
) |
Performs explicit copying for the contiguous sequence of operations in the block iterator range [‘begin’, ‘end’), where ‘end’ can't be past the terminator of the block (since additional operations are potentially inserted right before end).
copyOptions provides various parameters, and the output argument copyNests is the set of all copy nests inserted, each represented by its root affine.for. Since we generate alloc's and dealloc's for all fast buffers (before and after the range of operations resp. or at a hoisted position), all of the fast memory capacity is assumed to be available for processing this block range. When 'filterMemRef' is specified, copies are only generated for the provided MemRef. Returns success if the explicit copying succeeded for all memrefs on which affine load/stores were encountered. For memrefs whose element type has no size in bytes that can be computed (index type), their capacity is not accounted for and the fastMemCapacityBytes copy option would be non-functional in such cases.
Definition at line 2270 of file LoopUtils.cpp.
References mlir::detail::divideCeil(), mlir::Operation::emitError(), mlir::Operation::emitWarning(), mlir::affine::AffineCopyOptions::fastMemCapacityBytes, findHighestBlockForPlacement(), generateCopy(), getFullMemRefAsRegion(), mlir::Operation::getLoc(), getNestingDepth(), mlir::Block::getParentOp(), mlir::affine::AffineCopyOptions::slowMemorySpace, and mlir::Block::walk().
Referenced by affineDataCopyGenerate().
LogicalResult mlir::affine::affineForOpBodySkew | ( | AffineForOp | forOp, |
ArrayRef< uint64_t > | shifts, | ||
bool | unrollPrologueEpilogue = false |
||
) |
Skew the operations in an affine.for's body with the specified operation-wise shifts.
The shifts are with respect to the original execution order, and are multiplied by the loop 'step' before being applied. If unrollPrologueEpilogue is set, fully unroll the prologue and epilogue loops when possible.
Definition at line 229 of file LoopUtils.cpp.
References mlir::applyOpPatternsGreedily(), mlir::config, mlir::ExistingOps, generateShiftedLoop(), getConstantTripCount(), mlir::Builder::getShiftedAffineMap(), isOpwiseShiftValid(), loopUnrollFull(), and mlir::patterns.
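The effect of operation-wise shifts on execution order can be modeled without any MLIR types. The following is a minimal sketch, not the LoopUtils.cpp implementation: `skewedSchedule` and `OpInstance` are hypothetical names, iterations are counted in units of the loop step, and the ordering of operations within one time step is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// One executed instance of a body operation:
// (operation index, original iteration number).
using OpInstance = std::pair<std::size_t, int64_t>;

// Hypothetical helper, not part of MLIR: lists the instances executed by a
// loop with `tripCount` iterations after operation `op` has been delayed by
// `shifts[op]` iterations.
std::vector<OpInstance> skewedSchedule(int64_t tripCount,
                                       const std::vector<uint64_t> &shifts) {
  assert(tripCount >= 0 && "trip count must be non-negative");
  std::vector<OpInstance> schedule;
  if (shifts.empty())
    return schedule;
  const int64_t maxShift =
      static_cast<int64_t>(*std::max_element(shifts.begin(), shifts.end()));
  // When tripCount >= maxShift, time steps [0, maxShift) act as the prologue
  // and [tripCount, tripCount + maxShift) as the epilogue.
  for (int64_t t = 0; t < tripCount + maxShift; ++t) {
    // Assumption of this sketch: within a time step, operations are visited
    // in their original order.
    for (std::size_t op = 0; op < shifts.size(); ++op) {
      const int64_t iteration = t - static_cast<int64_t>(shifts[op]);
      if (iteration >= 0 && iteration < tripCount)
        schedule.emplace_back(op, iteration);
    }
  }
  return schedule;
}
```

With a trip count of 3 and shifts {0, 1}, the first time step runs only operation 0 and the last runs only operation 1, which corresponds to the prologue and epilogue that unrollPrologueEpilogue refers to.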
LogicalResult mlir::affine::affineParallelize | ( | AffineForOp | forOp, |
ArrayRef< LoopReduction > | parallelReductions = {} , |
||
AffineParallelOp * | resOp = nullptr |
||
) |
Replaces a parallel affine.for op with a 1-d affine.parallel op.
forOp's body is taken by the affine.parallel op and the former is erased. (mlir::isLoopParallel can be used to detect a parallel affine.for op.) The reductions specified in parallelReductions are also parallelized. Parallelization will fail in the presence of loop iteration arguments that are not listed in parallelReductions. resOp, if non-null, is set to the newly created affine.parallel op.
Definition at line 349 of file Utils.cpp.
References mlir::OpBuilder::create(), mlir::Value::getDefiningOp(), mlir::OpBuilder::getInsertionBlock(), mlir::OpBuilder::getInsertionPoint(), mlir::Operation::getOperand(), mlir::Block::getOperations(), mlir::Operation::getRegion(), mlir::Operation::getResult(), mlir::ValueRange::getTypes(), mlir::affine::LoopReduction::kind, mlir::Operation::setOperands(), mlir::Region::takeBody(), and mlir::affine::LoopReduction::value.
void mlir::affine::affineScalarReplace | ( | func::FuncOp | f, |
DominanceInfo & | domInfo, | ||
PostDominanceInfo & | postDomInfo, | ||
AliasAnalysis & | analysis | ||
) |
Replace affine store and load accesses by scalars by forwarding stores to loads and eliminate invariant affine loads; consequently, eliminate dead allocs.
Definition at line 1035 of file Utils.cpp.
References mlir::AliasAnalysis::alias(), mlir::Operation::erase(), findUnusedStore(), forwardStoreToLoad(), mlir::AliasResult::isNo(), loadCSE(), and mayAlias().
LogicalResult mlir::affine::boundCheckLoadOrStoreOp | ( | LoadOrStoreOpPointer | loadOrStoreOp, |
bool | emitError = true |
||
) |
Checks a load or store op for an out of bound access; returns failure if the access is out of bounds along any of the dimensions, success otherwise.
Emits a diagnostic error (with location information) if emitError is true.
void mlir::affine::buildAffineLoopNest | ( | OpBuilder & | builder, |
Location | loc, | ||
ArrayRef< int64_t > | lbs, | ||
ArrayRef< int64_t > | ubs, | ||
ArrayRef< int64_t > | steps, | ||
function_ref< void(OpBuilder &, Location, ValueRange)> | bodyBuilderFn = nullptr |
||
) |
Builds a perfect nest of affine.for loops, i.e., each loop except the innermost one contains only another loop and a terminator.
The loops iterate from "lbs" to "ubs" with "steps". The body of the innermost loop is populated by calling "bodyBuilderFn" and providing it with an OpBuilder, a Location and a list of loop induction variables.
Definition at line 2675 of file AffineOps.cpp.
References buildAffineLoopFromConstants(), and buildAffineLoopNestImpl().
Referenced by mlir::linalg::GenerateLoopNest< LoopTy >::doit().
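The iteration space of such a perfect nest can be illustrated in plain C++. This is only a sketch of the runtime behavior of the generated loops, assuming constant bounds with exclusive upper bounds as in affine.for; `visitLoopNest` and `BodyFn` are hypothetical names. Unlike bodyBuilderFn, which is called to build IR, the callback here runs once per iteration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for the loop body: it receives the current induction
// variable values, outermost first.
using BodyFn = std::function<void(const std::vector<int64_t> &)>;

// Hypothetical helper, not part of MLIR: visits the iteration space of a
// perfect nest whose loop d runs from lbs[d] (inclusive) to ubs[d]
// (exclusive) in increments of steps[d].
void visitLoopNest(const std::vector<int64_t> &lbs,
                   const std::vector<int64_t> &ubs,
                   const std::vector<int64_t> &steps, const BodyFn &body) {
  assert(lbs.size() == ubs.size() && lbs.size() == steps.size());
  std::vector<int64_t> ivs(lbs.size());
  // Each level contains only the next loop; the body runs at the innermost
  // level, which is what makes the nest "perfect".
  std::function<void(std::size_t)> visitLevel = [&](std::size_t depth) {
    if (depth == lbs.size()) {
      if (body)
        body(ivs);
      return;
    }
    assert(steps[depth] > 0 && "steps must be positive");
    for (int64_t iv = lbs[depth]; iv < ubs[depth]; iv += steps[depth]) {
      ivs[depth] = iv;
      visitLevel(depth + 1);
    }
  };
  visitLevel(0);
}
```

For example, lbs {0, 1}, ubs {2, 4} and steps {1, 2} visit four points, with the inner induction variable taking the values 1 and 3 for each outer value.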
void mlir::affine::buildAffineLoopNest | ( | OpBuilder & | builder, |
Location | loc, | ||
ValueRange | lbs, | ||
ValueRange | ubs, | ||
ArrayRef< int64_t > | steps, | ||
function_ref< void(OpBuilder &, Location, ValueRange)> | bodyBuilderFn = nullptr |
||
) |
Definition at line 2683 of file AffineOps.cpp.
References buildAffineLoopFromValues(), and buildAffineLoopNestImpl().
bool mlir::affine::buildSliceTripCountMap | ( | const ComputationSliceState & | slice, |
llvm::SmallDenseMap< Operation *, uint64_t, 8 > * | tripCountMap | ||
) |
Builds a map 'tripCountMap' from AffineForOp to constant trip count for the loop nest represented by the slice loop bounds in 'slice'.
Returns true on success, false otherwise (if a non-constant trip count was encountered).
Definition at line 1601 of file Utils.cpp.
References getConstantTripCount(), getConstDifference(), getForInductionVarOwner(), mlir::AffineMap::getNumResults(), mlir::affine::ComputationSliceState::ivs, mlir::affine::ComputationSliceState::lbs, and mlir::affine::ComputationSliceState::ubs.
Referenced by fuseLoops(), getComputationSliceState(), and getFusionComputeCost().
FusionResult mlir::affine::canFuseLoops | ( | AffineForOp | srcForOp, |
AffineForOp | dstForOp, | ||
unsigned | dstLoopDepth, | ||
ComputationSliceState * | srcSlice, | ||
FusionStrategy | fusionStrategy = FusionStrategy::Generic |
||
) |
Checks the feasibility of fusing the loop nest rooted at 'srcForOp' into the loop nest rooted at 'dstForOp' at 'dstLoopDepth'.
Returns FusionResult 'Success' if fusion of the src/dst loop nests is feasible (i.e. they are in the same block and dependences would not be violated). Otherwise returns a FusionResult explaining why fusion is not feasible. NOTE: This function is not feature complete and should only be used in testing.
Definition at line 249 of file LoopFusionUtils.cpp.
References computeSliceUnion(), mlir::affine::FusionResult::FailBlockDependence, mlir::affine::FusionResult::FailFusionDependence, mlir::affine::FusionResult::FailIncorrectSlice, mlir::affine::FusionResult::FailPrecondition, gatherLoadsAndStores(), mlir::affine::FusionStrategy::Generic, mlir::affine::SliceComputationResult::GenericFailure, getFusedLoopNestInsertionPoint(), getMaxLoopDepth(), getNumCommonSurroundingLoops(), mlir::affine::FusionStrategy::getSiblingFusionMemRef(), mlir::affine::FusionStrategy::getStrategy(), mlir::affine::SliceComputationResult::IncorrectSliceFailure, mlir::affine::FusionStrategy::ProducerConsumer, mlir::affine::FusionStrategy::Sibling, mlir::affine::FusionResult::Success, and mlir::affine::SliceComputationResult::value.
void mlir::affine::canonicalizeMapAndOperands | ( | AffineMap * | map, |
SmallVectorImpl< Value > * | operands | ||
) |
Modifies both map and operands in-place so as to:
Definition at line 1435 of file AffineOps.cpp.
Referenced by mlir::affine::FlatAffineValueConstraints::addBound(), augmentMapAndBounds(), canonicalizeLoopBounds(), composeAffineMapAndOperands(), composeMultiResultAffineMap(), mlir::affine::AffineValueMap::composeSimplifyAndCanonicalize(), createCanonicalizedAffineForOp(), mlir::affine::AffineValueMap::difference(), fuseLoops(), generatePointWiseCopy(), mlir::affine::MemRefAccess::getAccessMap(), getCleanupLoopLowerBound(), makeCanonicalAffineApplies(), materializeComputedBound(), reorderOperandsByHoistability(), replaceAllMemRefUsesWith(), replaceDimOrSym(), and simplifyConstrainedMinMaxOp().
void mlir::affine::canonicalizeSetAndOperands | ( | IntegerSet * | set, |
SmallVectorImpl< Value > * | operands | ||
) |
Canonicalizes an integer set the same way canonicalizeMapAndOperands does for affine maps.
Definition at line 1440 of file AffineOps.cpp.
Referenced by mlir::affine::FlatAffineValueConstraints::addAffineIfOpDomain(), and createSeparationCondition().
DependenceResult mlir::affine::checkMemrefAccessDependence | ( | const MemRefAccess & | srcAccess, |
const MemRefAccess & | dstAccess, | ||
unsigned | loopDepth, | ||
FlatAffineValueConstraints * | dependenceConstraints = nullptr , |
||
SmallVector< DependenceComponent, 2 > * | dependenceComponents = nullptr , |
||
bool | allowRAR = false |
||
) |
Definition at line 608 of file AffineAnalysis.cpp.
References addOrderingConstraints(), computeDirectionVector(), mlir::presburger::IntegerRelation::convertVarKind(), mlir::Operation::dump(), mlir::presburger::IntegerRelation::dump(), mlir::affine::DependenceResult::Failure, mlir::affine::MemRefAccess::getAccessRelation(), getAffineScope(), getCommonBlockInAffineScope(), mlir::presburger::IntegerRelation::getDomainSet(), getNumCommonLoops(), mlir::presburger::IntegerRelation::getNumDomainVars(), mlir::affine::DependenceResult::HasDependence, mlir::presburger::IntegerRelation::inverse(), mlir::presburger::IntegerRelation::isEmpty(), mlir::affine::MemRefAccess::memref, mlir::presburger::IntegerRelation::mergeAndCompose(), mlir::affine::DependenceResult::NoDependence, mlir::affine::MemRefAccess::opInst, and srcAppearsBeforeDstInAncestralBlock().
Referenced by computeSliceUnion(), getDependenceComponents(), getMaxLoopDepth(), isLoopMemoryParallel(), isTilingValid(), mayHaveEffect(), and mustReachAtInnermost().
LogicalResult mlir::affine::coalesceLoops | ( | MutableArrayRef< AffineForOp > | loops | ) |
Replace a perfect nest of "for" loops with a single linearized loop.
Assumes loops contains a list of perfectly nested loops outermost to innermost that are normalized (step one and lower bound of zero) and with bounds and steps independent of any loop induction variable involved in the nest. Coalescing affine.for loops is not always possible, i.e., the result may not be representable using affine.for.
Definition at line 1611 of file LoopUtils.cpp.
References mlir::OpBuilder::create(), mlir::AffineExpr::floorDiv(), mlir::AffineMap::get(), mlir::Builder::getAffineDimExpr(), mlir::Builder::getAffineSymbolExpr(), mlir::Builder::getContext(), mlir::affine::AffineBound::getMap(), mlir::affine::AffineBound::getOperands(), mlir::AffineMap::getResults(), mlir::replaceAllUsesInRegionWith(), and mlir::OpBuilder::setInsertionPointToStart().
Referenced by coalescePerfectlyNestedAffineLoops().
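The idea behind coalescing a normalized nest can be sketched with plain integers: a single loop runs over the product of the upper bounds, and the original induction variables are recovered by division and remainder. This is an illustration under the stated assumptions (constant upper bounds, lower bound 0, step 1), not the code in LoopUtils.cpp; `coalescedIterationSpace` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of MLIR: enumerates the original induction
// variables visited by a single linearized loop that replaces a normalized
// perfect nest with upper bounds `ubs` (outermost first).
std::vector<std::vector<int64_t>>
coalescedIterationSpace(const std::vector<int64_t> &ubs) {
  int64_t tripCount = 1;
  for (int64_t ub : ubs) {
    assert(ub >= 0 && "normalized loops have non-negative upper bounds");
    tripCount *= ub;
  }
  std::vector<std::vector<int64_t>> visited;
  for (int64_t linear = 0; linear < tripCount; ++linear) {
    std::vector<int64_t> ivs(ubs.size());
    int64_t rest = linear;
    // Peel off the innermost IV first: it is the remainder modulo its bound,
    // and the quotient carries the outer IVs.
    for (std::size_t d = ubs.size(); d-- > 0;) {
      ivs[d] = rest % ubs[d];
      rest /= ubs[d];
    }
    visited.push_back(ivs);
  }
  return visited;
}
```

For upper bounds {2, 3}, the linearized loop has six iterations and visits the points in the same order as the original nest, ending at (1, 2).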
LogicalResult mlir::affine::coalescePerfectlyNestedAffineLoops | ( | AffineForOp | op | ) |
Walk an affine.for to find a band to coalesce.
Definition at line 2739 of file LoopUtils.cpp.
References mlir::areValuesDefinedAbove(), coalesceLoops(), and getPerfectlyNestedLoops().
SliceComputationResult mlir::affine::computeSliceUnion | ( | ArrayRef< Operation * > | opsA, |
ArrayRef< Operation * > | opsB, | ||
unsigned | loopDepth, | ||
unsigned | numCommonLoops, | ||
bool | isBackwardSlice, | ||
ComputationSliceState * | sliceUnion | ||
) |
Computes in 'sliceUnion' the union of all slice bounds computed at 'loopDepth' between all dependent pairs of ops in 'opsA' and 'opsB', and then verifies if it is valid.
The parameter 'numCommonLoops' is the number of loops common to the operations in 'opsA' and 'opsB'. If 'isBackwardSlice' is true, computes slice bounds for loop nest surrounding ops in 'opsA', as a function of IVs and symbols of loop nest surrounding ops in 'opsB' at 'loopDepth'. If 'isBackwardSlice' is false, computes slice bounds for loop nest surrounding ops in 'opsB', as a function of IVs and symbols of loop nest surrounding ops in 'opsA' at 'loopDepth'. Returns 'SliceComputationResult::Success' if union was computed correctly, an appropriate 'failure' otherwise.
Definition at line 1417 of file Utils.cpp.
References addMissingLoopIVBounds(), mlir::FlatLinearValueConstraints::areVarsAlignedWithOther(), checkMemrefAccessDependence(), mlir::affine::ComputationSliceState::clearBounds(), mlir::affine::FlatAffineValueConstraints::convertLoopIVSymbolsToDims(), mlir::affine::DependenceResult::Failure, mlir::affine::SliceComputationResult::GenericFailure, mlir::affine::ComputationSliceState::getAsConstraints(), getComputationSliceState(), getContext(), getInnermostCommonLoopDepth(), getNestingDepth(), mlir::presburger::IntegerRelation::getNumDimAndSymbolVars(), mlir::presburger::IntegerRelation::getNumDimVars(), mlir::presburger::IntegerRelation::getNumLocalVars(), mlir::FlatLinearConstraints::getSliceBounds(), mlir::FlatLinearValueConstraints::getValue(), mlir::FlatLinearValueConstraints::getValues(), mlir::affine::SliceComputationResult::IncorrectSliceFailure, mlir::affine::ComputationSliceState::insertPoint, mlir::affine::ComputationSliceState::isSliceValid(), mlir::affine::ComputationSliceState::ivs, mlir::affine::ComputationSliceState::lbOperands, mlir::affine::ComputationSliceState::lbs, mlir::affine::MemRefAccess::memref, mlir::FlatLinearValueConstraints::mergeAndAlignVarsWithOther(), mlir::affine::DependenceResult::NoDependence, mlir::affine::MemRefAccess::opInst, mlir::affine::SliceComputationResult::Success, mlir::affine::ComputationSliceState::ubOperands, mlir::affine::ComputationSliceState::ubs, mlir::FlatLinearValueConstraints::unionBoundingBox(), and mlir::affine::DependenceResult::value.
Referenced by canFuseLoops().
void mlir::affine::createAffineComputationSlice | ( | Operation * | opInst, |
SmallVectorImpl< AffineApplyOp > * | sliceOps | ||
) |
Given an operation, inserts one or more single result affine apply operations, results of which are exclusively used by this operation.
The operands of these newly created affine apply ops are guaranteed to be loop iterators or terminal symbols of a function.
Before
  affine.for i = 0 to #map(N)
    idx = affine.apply (d0) -> (d0 mod 2) (i)
    "send"(idx, A, ...)
    "compute"(idx)
After
  affine.for i = 0 to #map(N)
    idx = affine.apply (d0) -> (d0 mod 2) (i)
    "send"(idx, A, ...)
    idx_ = affine.apply (d0) -> (d0 mod 2) (i)
    "compute"(idx_)
This allows applying different transformations on send and compute (e.g., different shifts/delays).
Fills sliceOps with the list of affine.apply operations created. sliceOps remains empty either if none of opInst's operands were the result of an affine.apply (so there was no affine computation slice to create), or if none of the affine.apply ops supplying operands to opInst had any uses besides opInst.
Definition at line 1482 of file Utils.cpp.
References mlir::OpBuilder::create(), fullyComposeAffineMapAndOperands(), mlir::AffineMap::get(), mlir::Operation::getLoc(), mlir::Builder::getMultiDimIdentityMap(), mlir::Operation::getNumOperands(), mlir::Operation::getOperands(), getReachableAffineApplyOps(), and mlir::Operation::setOperand().
std::unique_ptr< OperationPass< func::FuncOp > > mlir::affine::createAffineDataCopyGenerationPass | ( | ) |
Overload relying on pass options for initialization.
Definition at line 99 of file AffineDataCopyGeneration.cpp.
std::unique_ptr< OperationPass< func::FuncOp > > mlir::affine::createAffineDataCopyGenerationPass | ( | unsigned | slowMemorySpace, |
unsigned | fastMemorySpace, | ||
unsigned | tagMemorySpace = 0 , |
||
int | minDmaTransferSize = 1024 , |
||
uint64_t | fastMemCapacityBytes = std::numeric_limits<uint64_t>::max() |
||
) |
Performs packing (or explicit copying) of accessed memref regions into buffers in the specified faster memory space through either pointwise copies or DMA operations.
Generates copies for memref's living in 'slowMemorySpace' into newly created buffers in 'fastMemorySpace', and replaces memory operations to the former by the latter.
Only load ops are handled for now.
Definition at line 91 of file AffineDataCopyGeneration.cpp.
std::unique_ptr< Pass > mlir::affine::createAffineExpandIndexOpsAsAffinePass | ( | ) |
Creates a pass to expand affine index operations into affine.apply operations.
Definition at line 95 of file AffineExpandIndexOpsAsAffine.cpp.
std::unique_ptr< Pass > mlir::affine::createAffineExpandIndexOpsPass | ( | ) |
Creates a pass to expand affine index operations into more fundamental operations (not necessarily restricted to Affine dialect).
Definition at line 214 of file AffineExpandIndexOps.cpp.
std::unique_ptr< OperationPass< func::FuncOp > > mlir::affine::createAffineLoopInvariantCodeMotionPass | ( | ) |
Creates a loop invariant code motion pass that hoists loop invariant operations out of affine loops.
Definition at line 207 of file AffineLoopInvariantCodeMotion.cpp.
std::unique_ptr< OperationPass< func::FuncOp > > mlir::affine::createAffineLoopNormalizePass | ( | bool | promoteSingleIter = false | ) |
Apply normalization transformations to affine loop-like ops.
If promoteSingleIter is true, single iteration loops are promoted (i.e., the loop is replaced by its loop body).
Definition at line 53 of file AffineLoopNormalize.cpp.
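For a loop with constant bounds, normalization to a lower bound of 0 and a step of 1 amounts to the following arithmetic. This is a sketch only: `ConstantLoop`, `normalizeLoop`, `originalIv` and `isSingleIteration` are hypothetical names, and real affine loops may have bounds given by affine maps rather than constants.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of a loop with constant bounds: iterates
// iv = lb, lb + step, ... while iv < ub.
struct ConstantLoop {
  int64_t lb;
  int64_t ub;
  int64_t step;
};

// Hypothetical helper: returns the normalized loop (lower bound 0, step 1)
// with the same number of iterations as `loop`.
ConstantLoop normalizeLoop(const ConstantLoop &loop) {
  assert(loop.step > 0 && "step must be positive");
  const int64_t span = loop.ub > loop.lb ? loop.ub - loop.lb : 0;
  const int64_t tripCount = (span + loop.step - 1) / loop.step; // ceildiv
  return ConstantLoop{0, tripCount, 1};
}

// Recovers the original induction variable from the normalized one; uses of
// the old IV in the body would be rewritten to this expression.
int64_t originalIv(const ConstantLoop &loop, int64_t normalizedIv) {
  return loop.lb + normalizedIv * loop.step;
}

// A loop that runs exactly once is a candidate for promotion, i.e. for being
// replaced by its body.
bool isSingleIteration(const ConstantLoop &loop) {
  return normalizeLoop(loop).ub == 1;
}
```

A loop from 3 to 11 with step 4 normalizes to a loop from 0 to 2, and normalized iteration 1 corresponds to the original induction variable value 7.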
std::unique_ptr< OperationPass< func::FuncOp > > mlir::affine::createAffineParallelizePass | ( | ) |
Creates a pass to convert all parallel affine.for's into 1-d affine.parallel ops.
Definition at line 95 of file AffineParallelize.cpp.
std::unique_ptr< OperationPass< func::FuncOp > > mlir::affine::createAffineScalarReplacementPass | ( | ) |
Creates a pass to replace affine memref accesses by scalars using store to load forwarding and redundant load elimination; consequently also eliminate dead allocs.
Definition at line 44 of file AffineScalarReplacement.cpp.
AffineForOp mlir::affine::createCanonicalizedAffineForOp | ( | OpBuilder | b, |
Location | loc, | ||
ValueRange | lbOperands, | ||
AffineMap | lbMap, | ||
ValueRange | ubOperands, | ||
AffineMap | ubMap, | ||
int64_t | step = 1 |
||
) |
Creates an AffineForOp while ensuring that the lower and upper bounds are canonicalized, i.e., unused and duplicate operands are removed, any constant operands propagated/folded in, and duplicate bound maps dropped.
Definition at line 2528 of file LoopUtils.cpp.
References canonicalizeMapAndOperands(), mlir::OpBuilder::create(), fullyComposeAffineMapAndOperands(), and mlir::removeDuplicateExprs().
Referenced by createFullTiles(), and generatePointWiseCopy().
std::unique_ptr< OperationPass< func::FuncOp > > mlir::affine::createLoopCoalescingPass | ( | ) |
Creates a pass that transforms perfectly nested loops with independent bounds into a single loop.
Definition at line 52 of file LoopCoalescing.cpp.
std::unique_ptr< Pass > mlir::affine::createLoopFusionPass | ( | unsigned | fastMemorySpace = 0 , |
uint64_t | localBufSizeThreshold = 0 , |
||
bool | maximalFusion = false , |
||
enum FusionMode | fusionMode = FusionMode::Greedy |
||
) |
Creates a loop fusion pass which fuses affine loop nests at the top-level of the operation the pass is created on according to the type of fusion specified in fusionMode.
Buffers of size less than or equal to localBufSizeThreshold are promoted to memory space fastMemorySpace.
Definition at line 1408 of file LoopFusion.cpp.
std::unique_ptr< OperationPass< func::FuncOp > > mlir::affine::createLoopTilingPass | ( | ) |
Overload relying on pass options for initialization.
Definition at line 72 of file LoopTiling.cpp.
std::unique_ptr< OperationPass< func::FuncOp > > mlir::affine::createLoopTilingPass | ( | uint64_t | cacheSizeBytes | ) |
Creates a pass to perform tiling on loop nests.
Creates a pass to perform loop tiling on all suitable loop nests of a Function.
Definition at line 68 of file LoopTiling.cpp.
std::unique_ptr< OperationPass< func::FuncOp > > mlir::affine::createLoopUnrollAndJamPass | ( | int | unrollJamFactor = -1 | ) |
Creates a loop unroll jam pass to unroll jam by the specified factor.
A factor of -1 lets the pass use the default factor or the one on the command line if provided.
Definition at line 79 of file LoopUnrollAndJam.cpp.
std::unique_ptr< OperationPass< func::FuncOp > > mlir::affine::createLoopUnrollPass | ( | int | unrollFactor = -1 , |
bool | unrollUpToFactor = false , |
||
bool | unrollFull = false , |
||
const std::function< unsigned(AffineForOp)> & | getUnrollFactor = nullptr |
||
) |
Creates a loop unrolling pass with the provided parameters.
'getUnrollFactor' is a callback through which clients supply a function that computes an unroll factor; the callback takes precedence over unroll factors supplied through other means. If -1 is passed as the unrollFactor and no callback is provided, the factor passed on the command line (if any) or else the default unroll factor (LoopUnroll::kDefaultUnrollFactor) is used.
Definition at line 148 of file LoopUnroll.cpp.
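The precedence among the sources of an unroll factor described above can be written down as a small selection function. This is a sketch of the documented rule only, not the pass implementation: `LoopHandle`, `selectUnrollFactor` and `kAssumedDefaultUnrollFactor` (including its value) are hypothetical, and the unrollUpToFactor and unrollFull parameters are ignored.

```cpp
#include <cassert>
#include <functional>
#include <optional>

// Placeholder for the pass default; the value is an assumption of this sketch.
constexpr unsigned kAssumedDefaultUnrollFactor = 4;

// Hypothetical stand-in for an AffineForOp handle.
struct LoopHandle {
  int id;
};

// Picks the unroll factor for `loop` using the documented precedence:
// callback, then explicit factor, then command line, then default.
unsigned selectUnrollFactor(
    const LoopHandle &loop,
    const std::function<unsigned(const LoopHandle &)> &getUnrollFactor,
    int unrollFactor, std::optional<unsigned> commandLineFactor) {
  assert(unrollFactor >= -1 && "-1 means 'not specified'");
  // 1. A client callback takes precedence over everything else.
  if (getUnrollFactor)
    return getUnrollFactor(loop);
  // 2. An explicitly passed factor comes next.
  if (unrollFactor != -1)
    return static_cast<unsigned>(unrollFactor);
  // 3. Otherwise fall back to the command line, then to the default.
  return commandLineFactor.value_or(kAssumedDefaultUnrollFactor);
}
```

Modeling the command-line value as a `std::optional` keeps the "if at all" case explicit: an absent value falls through to the default.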
std::unique_ptr< OperationPass< func::FuncOp > > mlir::affine::createPipelineDataTransferPass | ( | ) |
Creates a pass to pipeline explicit movement of data across levels of the memory hierarchy.
Definition at line 56 of file PipelineDataTransfer.cpp.
std::unique_ptr< OperationPass< func::FuncOp > > mlir::affine::createSimplifyAffineStructuresPass | ( | ) |
Creates a simplification pass for affine structures (maps and sets).
In addition, this pass also normalizes memrefs to have the trivial (identity) layout map.
Definition at line 85 of file SimplifyAffineStructures.cpp.
FailureOr< AffineApplyOp > mlir::affine::decompose | ( | RewriterBase & | rewriter, |
AffineApplyOp | op | ||
) |
Split an "affine.apply" operation into smaller ops.
This reassociates a large AffineApplyOp into an ordered list of smaller AffineApplyOps. This can be used right before lowering affine ops to arith to exhibit more opportunities for CSE and LICM. Returns the sink AffineApplyOp on success, or failure if op does not decompose into smaller AffineApplyOps. Note that this can be undone by canonicalization, which tries to maximally compose chains of AffineApplyOps.
Definition at line 96 of file DecomposeAffineOps.cpp.
|
inline |
Definition at line 92 of file NestedMatcher.h.
FailureOr< SmallVector< Value > > mlir::affine::delinearizeIndex | ( | OpBuilder & | b, |
Location | loc, | ||
Value | linearIndex, | ||
ArrayRef< OpFoldResult > | basis, | ||
bool | hasOuterBound = true |
||
) |
Definition at line 1976 of file Utils.cpp.
References composedAffineMultiply(), getDivMod(), mlir::Builder::getIndexAttr(), mlir::getValueOrCreateConstantIndexOp(), mlir::affine::DivModValue::quotient, and mlir::affine::DivModValue::remainder.
FailureOr< SmallVector< Value > > mlir::affine::delinearizeIndex | ( | OpBuilder & | b, |
Location | loc, | ||
Value | linearIndex, | ||
ArrayRef< Value > | basis, | ||
bool | hasOuterBound = true |
||
) |
Generate the IR to delinearize linearIndex given the basis and return the multi-index.
hasOuterBound indicates whether basis has an entry giving the size of the first multi-index result: if it is true, the function returns basis.size() values; otherwise, it returns basis.size() + 1.
Definition at line 1946 of file Utils.cpp.
References composedAffineMultiply(), getDivMod(), mlir::Builder::getIndexAttr(), mlir::getValueOrCreateConstantIndexOp(), mlir::affine::DivModValue::quotient, and mlir::affine::DivModValue::remainder.
Referenced by mlir::linalg::unrollIndex().
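The arithmetic behind delinearization can be illustrated on plain integers. The sketch below is an assumption-laden model, not the MLIR implementation: the real function emits IR through getDivMod(), whereas the hypothetical helpers divMod and delinearize here compute values directly. It does show how hasOuterBound changes the number of results.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Integer stand-in for the quotient/remainder pair that getDivMod() produces.
struct DivMod {
  int64_t quotient;
  int64_t remainder;
};

// Floor division with the matching non-negative remainder.
DivMod divMod(int64_t lhs, int64_t rhs) {
  assert(rhs > 0 && "this sketch only handles positive divisors");
  int64_t q = lhs / rhs;
  int64_t r = lhs % rhs;
  if (r < 0) { // C++ truncates toward zero; adjust to floor semantics.
    --q;
    r += rhs;
  }
  return {q, r};
}

// Hypothetical helper: splits linearIndex into a multi-index over `basis`.
std::vector<int64_t> delinearize(int64_t linearIndex,
                                 std::vector<int64_t> basis,
                                 bool hasOuterBound = true) {
  // The outer bound is not needed to compute the results, so drop it.
  if (hasOuterBound) {
    assert(!basis.empty() && "an outer bound requires a non-empty basis");
    basis.erase(basis.begin());
  }
  // strides[i] is the product of basis[i], basis[i + 1], ...
  std::vector<int64_t> strides(basis.size());
  int64_t product = 1;
  for (std::size_t i = basis.size(); i-- > 0;) {
    product *= basis[i];
    strides[i] = product;
  }
  // Peel off one result per stride; the final residual is the last result.
  std::vector<int64_t> multiIndex;
  int64_t residual = linearIndex;
  for (int64_t stride : strides) {
    DivMod dm = divMod(residual, stride);
    multiIndex.push_back(dm.quotient);
    residual = dm.remainder;
  }
  multiIndex.push_back(residual);
  return multiIndex;
}
```

With basis (2, 3, 4) and hasOuterBound = true, the helper returns three values; with basis (3, 4) and hasOuterBound = false, it also returns three values, one more than the basis size.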
mlir::Value mlir::affine::expandAffineExpr | ( | OpBuilder & | builder, |
Location | loc, | ||
AffineExpr | expr, | ||
ValueRange | dimValues, | ||
ValueRange | symbolValues | ||
) |
Emit code that computes the given affine expression using standard arithmetic operations applied to the provided dimension and symbol values.
Definition at line 216 of file Utils.cpp.
Referenced by convertAffineApply().
std::optional< SmallVector< Value, 8 > > mlir::affine::expandAffineMap | ( | OpBuilder & | builder, |
Location | loc, | ||
AffineMap | affineMap, | ||
ValueRange | operands | ||
) |
Create a sequence of operations that implement the affineMap applied to the given operands (as if it were an AffineApplyOp).
Definition at line 226 of file Utils.cpp.
References mlir::AffineMap::getNumDims(), and mlir::AffineMap::getResults().
Referenced by lowerAffineMapMax(), and lowerAffineMapMin().
void mlir::affine::extractForInductionVars | ( | ArrayRef< AffineForOp > | forInsts, |
SmallVectorImpl< Value > * | ivs | ||
) |
Extracts the induction variables from a list of AffineForOps and places them in the output argument ivs.
Definition at line 2589 of file AffineOps.cpp.
Referenced by getFullMemRefAsRegion(), tilePerfectlyNested(), and tilePerfectlyNestedParametric().
void mlir::affine::extractInductionVars | ( | ArrayRef< Operation * > | affineOps, |
SmallVectorImpl< Value > & | ivs | ||
) |
Extracts the induction variables from a list of either AffineForOp or AffineParallelOp and places them in the output argument ivs.
Definition at line 2596 of file AffineOps.cpp.
Referenced by getIndexSet().
void mlir::affine::fullyComposeAffineMapAndOperands | ( | AffineMap * | map, |
SmallVectorImpl< Value > * | operands | ||
) |
Given an affine map map and its input operands, this method composes into map the maps of AffineApplyOps whose results are the values in operands, iteratively until no more of operands are the result of an AffineApplyOp.
When this function returns, map becomes the composed affine map, and each Value in operands is guaranteed to be either a loop IV or a terminal symbol, i.e., a symbol defined at the top level or a block/function argument.
Definition at line 1134 of file AffineOps.cpp.
References composeAffineMapAndOperands(), and mlir::Value::getDefiningOp().
Referenced by mlir::affine::FlatAffineValueConstraints::addBound(), composeMultiResultAffineMap(), mlir::affine::AffineValueMap::composeSimplifyAndCanonicalize(), createAffineComputationSlice(), createCanonicalizedAffineForOp(), mlir::affine::AffineValueMap::difference(), fullyComposeAndComputeConstantDelta(), generateCopy(), generatePointWiseCopy(), mlir::affine::MemRefAccess::getAccessMap(), getCleanupLoopLowerBound(), peelForLoop(), and replaceAllMemRefUsesWith().
FailureOr< int64_t > mlir::affine::fullyComposeAndComputeConstantDelta | ( | Value | value1, |
Value | value2 | ||
) |
Compute a constant delta of the given two values.
Return "failure" if we cannot determine a constant delta. value1/value2 must be index-typed.
This function is similar to ValueBoundsConstraintSet::computeConstantDistance. To work around limitations in FlatLinearConstraints, this function fully composes value1 and value2 (if they are the result of affine.apply ops) before populating the constraint set. The folding/composing logic can see opportunities for simplifications that the constraint set implementation cannot see.
Definition at line 107 of file ValueBoundsOpInterfaceImpl.cpp.
References mlir::ValueBoundsConstraintSet::computeConstantBound(), mlir::presburger::EQ, fullyComposeAffineMapAndOperands(), mlir::AffineMap::get(), mlir::Builder::getAffineDimExpr(), mlir::Value::getContext(), mlir::Value::getType(), and mlir::Type::isIndex().
Referenced by mlir::vector::isDisjointTransferIndices().
void mlir::affine::fuseLoops | ( | AffineForOp | srcForOp, |
AffineForOp | dstForOp, | ||
const ComputationSliceState & | srcSlice, | ||
bool | isInnermostSiblingInsertionFusion = false |
||
) |
Fuses 'srcForOp' into 'dstForOp' with destination loop block insertion point and source slice loop bounds specified in 'srcSlice'.
isInnermostSiblingInsertionFusion enables cleanup of 'srcForOp' that is a single-iteration reduction loop being sibling-fused into a 'dstForOp'.
Definition at line 426 of file LoopFusionUtils.cpp.
References buildSliceTripCountMap(), canonicalizeMapAndOperands(), mlir::OpBuilder::clone(), getForInductionVarOwner(), getSliceIterationCount(), mlir::affine::ComputationSliceState::insertPoint, isLoopParallelAndContainsReduction(), mlir::affine::ComputationSliceState::ivs, mlir::affine::ComputationSliceState::lbOperands, mlir::affine::ComputationSliceState::lbs, mlir::IRMapping::lookupOrNull(), promoteIfSingleIteration(), promoteSingleIterReductionLoop(), mlir::affine::ComputationSliceState::ubOperands, and mlir::affine::ComputationSliceState::ubs.
void mlir::affine::gatherLoops | ( | func::FuncOp | func, |
std::vector< SmallVector< AffineForOp, 2 >> & | depthToLoops | ||
) |
Gathers all AffineForOps in 'func.func' grouped by loop depth.
Definition at line 2516 of file LoopUtils.cpp.
References gatherLoopsInBlock().
void mlir::affine::gatherProducerConsumerMemrefs | ( | ArrayRef< Operation * > | srcOps, |
ArrayRef< Operation * > | dstOps, | ||
DenseSet< Value > & | producerConsumerMemrefs | ||
) |
Returns in 'producerConsumerMemrefs' the memrefs involved in a producer-consumer dependence between write ops in 'srcOps' and read ops in 'dstOps'.
Definition at line 643 of file LoopFusionUtils.cpp.
Referenced by gatherProducerConsumerMemrefs(), and getMaxLoopDepth().
LogicalResult mlir::affine::generateCopyForMemRegion | ( | const MemRefRegion & | memrefRegion, |
Operation * | analyzedOp, | ||
const AffineCopyOptions & | copyOptions, | ||
CopyGenerateResult & | result | ||
) |
generateCopyForMemRegion is similar to affineDataCopyGenerate, but works with a single memref region.
memrefRegion is supposed to contain analysis information within analyzedOp. The generated prologue and epilogue always surround analyzedOp.
Note that analyzedOp is a single op for API convenience, and the [begin, end) version can be added as needed.
Also note that certain options in copyOptions aren't looked at anymore, like slowMemorySpace.
Definition at line 2472 of file LoopUtils.cpp.
References mlir::affine::CopyGenerateResult::alloc, mlir::affine::CopyGenerateResult::copyNest, generateCopy(), mlir::Operation::getBlock(), mlir::affine::MemRefRegion::memref, and mlir::affine::CopyGenerateResult::sizeInBytes.
void mlir::affine::getAffineForIVs | ( | Operation & | op, |
SmallVectorImpl< AffineForOp > * | loops | ||
) |
Populates 'loops' with IVs of the affine.for ops surrounding 'op' ordered from the outermost 'affine.for' operation to the innermost one while not traversing outside of the surrounding affine scope.
Definition at line 633 of file Utils.cpp.
References mlir::Operation::getParentOp().
Referenced by findHighestBlockForPlacement(), getComputationSliceState(), getFullMemRefAsRegion(), getFusionComputeCost(), getInnermostCommonLoopDepth(), getLastDependentOpInRange(), mlir::affine::MemRefDependenceGraph::init(), insertBackwardComputationSlice(), isFusionProfitable(), and isOpLoopInvariant().
void mlir::affine::getAffineIVs | ( | Operation & | op, |
SmallVectorImpl< Value > & | ivs | ||
) |
Populates 'ivs' with IVs of the surrounding affine.for and affine.parallel ops ordered from the outermost one to the innermost while not traversing outside of the surrounding affine scope.
Definition at line 1876 of file Utils.cpp.
References mlir::Operation::getParentOp().
Referenced by mlir::affine::MemRefRegion::compute(), and getNumCommonSurroundingLoops().
AffineParallelOp mlir::affine::getAffineParallelInductionVarOwner | ( | Value | val | ) |
Returns the AffineParallelOp whose induction variables include the provided value, or nullptr if the value is not such an induction variable.
Definition at line 2576 of file AffineOps.cpp.
References mlir::Operation::getParentOp().
Referenced by mlir::affine::FlatAffineValueConstraints::addInductionVarOrTerminalSymbol(), mlir::affine::MemRefRegion::compute(), and isAffineParallelInductionVar().
Returns the closest region enclosing op that is held by an operation with trait AffineScope; nullptr if there is no such region.
Definition at line 263 of file AffineOps.cpp.
References mlir::Operation::getParentRegion().
Referenced by checkMemrefAccessDependence(), isValidDim(), isValidSymbol(), mayHaveEffect(), mustReachAtInnermost(), verifyDimAndSymbolIdentifiers(), mlir::affine::AffineDmaStartOp::verifyInvariantsImpl(), mlir::affine::AffineDmaWaitOp::verifyInvariantsImpl(), and verifyMemoryOpIndexing().
void mlir::affine::getComputationSliceState | ( | Operation * | depSourceOp, |
Operation * | depSinkOp, | ||
FlatAffineValueConstraints * | dependenceConstraints, | ||
unsigned | loopDepth, | ||
bool | isBackwardSlice, | ||
ComputationSliceState * | sliceState | ||
) |
Computes the computation slice loop bounds for one loop nest as affine maps of the other loop nest's IVs and symbols, using 'dependenceConstraints' computed between 'depSourceAccess' and 'depSinkAccess'.
If 'isBackwardSlice' is true, a backwards slice is computed in which the slice bounds of loop nest surrounding 'depSourceAccess' are computed in terms of loop IVs and symbols of the loop nest surrounding 'depSinkAccess' at 'loopDepth'. If 'isBackwardSlice' is false, a forward slice is computed in which the slice bounds of loop nest surrounding 'depSinkAccess' are computed in terms of loop IVs and symbols of the loop nest surrounding 'depSourceAccess' at 'loopDepth'. The slice loop bounds and associated operands are returned in 'sliceState'.
Definition at line 1653 of file Utils.cpp.
References buildSliceTripCountMap(), getAffineForIVs(), mlir::Operation::getContext(), mlir::presburger::IntegerRelation::getNumDimAndSymbolVars(), getSequentialLoops(), mlir::FlatLinearConstraints::getSliceBounds(), getSliceIterationCount(), mlir::FlatLinearValueConstraints::getValue(), mlir::FlatLinearValueConstraints::getValues(), mlir::affine::ComputationSliceState::insertPoint, isLoopParallelAndContainsReduction(), mlir::affine::ComputationSliceState::isMaximal(), mlir::affine::ComputationSliceState::ivs, kSliceFusionBarrierAttrName, mlir::affine::ComputationSliceState::lbOperands, mlir::affine::ComputationSliceState::lbs, mlir::FlatLinearValueConstraints::projectOut(), mlir::affine::ComputationSliceState::ubOperands, and mlir::affine::ComputationSliceState::ubs.
Referenced by computeSliceUnion().
int64_t mlir::affine::getComputeCost | ( | AffineForOp | forOp, |
LoopNestStats & | stats | ||
) |
Computes the total cost of the loop nest rooted at 'forOp' using 'stats'.
Currently, the total cost is computed by counting the total operation instance count (i.e. total number of operations in the loop body * loop trip count) for the entire loop nest.
Definition at line 564 of file LoopFusionUtils.cpp.
References getComputeCostHelper().
Referenced by isFusionProfitable().
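The cost model described above (operation count in the body multiplied by the trip count, accumulated over the whole nest) can be illustrated with a toy data structure. This is a simplified sketch under stated assumptions: ToyLoopNest and computeCost are hypothetical names, and the real function reads its inputs from LoopNestStats, whose exact accounting may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one affine.for and the statistics kept about it.
struct ToyLoopNest {
  uint64_t tripCount;                // constant trip count of this loop
  uint64_t opCount;                  // non-loop operations directly in the body
  std::vector<ToyLoopNest> children; // loops nested directly inside this one
};

// Total operation instances executed by the nest rooted at `loop`:
// every body operation and every nested loop runs `tripCount` times.
uint64_t computeCost(const ToyLoopNest &loop) {
  uint64_t bodyCost = loop.opCount;
  for (const ToyLoopNest &child : loop.children)
    bodyCost += computeCost(child);
  return loop.tripCount * bodyCost;
}
```

For an outer loop of 4 iterations holding 2 operations and an inner loop of 8 iterations holding 3 operations, the model yields 4 * (2 + 8 * 3) operation instances.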
std::optional< uint64_t > mlir::affine::getConstantTripCount | ( | AffineForOp | forOp | ) |
Returns the trip count of the loop if it's a constant, std::nullopt otherwise.
This method uses affine expression analysis (in turn using getTripCount) and is able to determine constant trip count in non-trivial cases.
Definition at line 91 of file LoopAnalysis.cpp.
References mlir::AffineMap::getResults(), getTripCountMapAndOperands(), and min().
Referenced by adjustToDivisorsOfTripCounts(), affineForOpBodySkew(), buildSliceTripCountMap(), constructTiledIndexSetHyperRect(), getLoopNestStats(), loopUnrollByFactor(), loopUnrollFull(), loopUnrollJamByFactor(), loopUnrollJamUpToFactor(), loopUnrollUpToFactor(), promoteIfSingleIteration(), and promoteSingleIterReductionLoop().
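As a rough illustration of what a constant trip count means, the sketch below works on plain integers instead of affine maps. It is a toy model under assumptions: constantTripCount is a hypothetical helper, a bound that is not a compile-time constant is modelled as std::nullopt, and several upper bounds are combined by taking the minimum. The real function analyses affine expressions and can therefore succeed in cases this model cannot express.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model: the loop runs from lowerBound up to (but excluding)
// the smallest of upperBounds, advancing by step.
std::optional<uint64_t>
constantTripCount(std::optional<int64_t> lowerBound,
                  const std::vector<std::optional<int64_t>> &upperBounds,
                  int64_t step) {
  assert(step > 0 && "this sketch assumes a positive step");
  if (!lowerBound || upperBounds.empty())
    return std::nullopt;
  std::optional<uint64_t> tripCount;
  for (const std::optional<int64_t> &upperBound : upperBounds) {
    // A single non-constant bound makes the trip count unknown.
    if (!upperBound)
      return std::nullopt;
    int64_t span = *upperBound - *lowerBound;
    // ceil(span / step), clamped at zero for empty iteration spaces.
    uint64_t count =
        span <= 0 ? 0 : static_cast<uint64_t>((span + step - 1) / step);
    tripCount = tripCount ? std::min(*tripCount, count) : count;
  }
  return tripCount;
}
```

A loop from 0 to 10 with step 3 executes for the values 0, 3, 6, and 9, so the model reports four iterations.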
void mlir::affine::getDependenceComponents | ( | AffineForOp | forOp, |
unsigned | maxLoopDepth, | ||
std::vector< SmallVector< DependenceComponent, 2 >> * | depCompsVec | ||
) |
Returns in 'depCompsVec', dependence components for dependences between all load and store ops in loop nest rooted at 'forOp', at loop depths in range [1, maxLoopDepth].
Definition at line 690 of file AffineAnalysis.cpp.
References checkMemrefAccessDependence(), and hasDependence().
Referenced by isValidLoopInterchangePermutation(), and sinkSequentialLoops().
DivModValue mlir::affine::getDivMod | ( | OpBuilder & | b, |
Location | loc, | ||
Value | lhs, | ||
Value | rhs | ||
) |
Create IR to calculate (div lhs, rhs) and (mod lhs, rhs).
Definition at line 1922 of file Utils.cpp.
References mlir::bindDims(), mlir::AffineExpr::floorDiv(), mlir::Builder::getContext(), makeComposedAffineApply(), mlir::affine::DivModValue::quotient, and mlir::affine::DivModValue::remainder.
Referenced by delinearizeIndex().
void mlir::affine::getEnclosingAffineOps | ( | Operation & | op, |
SmallVectorImpl< Operation * > * | ops | ||
) |
Populates 'ops' with affine operations enclosing op ordered from outermost to innermost while stopping at the boundary of the affine scope.
affine.for, affine.if, or affine.parallel ops comprise such surrounding affine ops. ops is guaranteed by design to have a successive chain of affine parent ops.
Definition at line 647 of file Utils.cpp.
References mlir::Operation::getParentOp(), and mlir::Operation::hasTrait().
Referenced by getOpIndexSet().
AffineForOp mlir::affine::getForInductionVarOwner | ( | Value | val | ) |
Returns the loop parent of an induction variable.
If the provided value is not an induction variable, then return nullptr.
Definition at line 2565 of file AffineOps.cpp.
Referenced by mlir::affine::FlatAffineValueConstraints::addDomainFromSliceMaps(), mlir::affine::FlatAffineValueConstraints::addInductionVarOrTerminalSymbol(), addMissingLoopIVBounds(), mlir::affine::FlatAffineValueConstraints::addSliceBounds(), buildPackingLoopNestImpl(), buildSliceTripCountMap(), mlir::affine::MemRefRegion::compute(), mlir::affine::FlatAffineValueConstraints::convertLoopIVSymbolsToDims(), fuseLoops(), mlir::affine::ComputationSliceState::getAsConstraints(), getLargestKnownDivisor(), getLowerBound(), getNumCommonLoops(), mlir::affine::ComputationSliceState::getSourceAsConstraints(), getUpperBound(), isAffineForInductionVar(), mlir::affine::ComputationSliceState::isMaximal(), isNonNegativeBoundedBy(), isUniformDefinition(), and replaceByPackingResult().
bool mlir::affine::getFusionComputeCost | ( | AffineForOp | srcForOp, |
LoopNestStats & | srcStats, | ||
AffineForOp | dstForOp, | ||
LoopNestStats & | dstStats, | ||
const ComputationSliceState & | slice, | ||
int64_t * | computeCost | ||
) |
Computes and returns in 'computeCost', the total compute cost of fusing the 'slice' of the loop nest rooted at 'srcForOp' into 'dstForOp'.
Currently, the total cost is computed by counting the total operation instance count (i.e. total number of operations in the loop body * loop trip count) for the entire loop nest. Returns true on success, false otherwise (e.g. non-constant trip counts).
Definition at line 575 of file LoopFusionUtils.cpp.
References buildSliceTripCountMap(), getAffineForIVs(), getComputeCostHelper(), getSliceIterationCount(), mlir::Operation::getUsers(), and mlir::affine::ComputationSliceState::insertPoint.
Referenced by isFusionProfitable().
LogicalResult mlir::affine::getIndexSet | ( | MutableArrayRef< Operation * > | ops, |
FlatAffineValueConstraints * | domain | ||
) |
Builds a system of constraints with dimensional variables corresponding to the loop IVs of the forOps and AffineIfOp's operands appearing in that order.
Bounds of the loop are used to add appropriate inequalities. Constraints from the index sets of AffineIfOp are also added. Any symbols found in the bound operands are added as symbols in the system. Returns failure for the yet unimplemented cases. ops accepts both AffineForOp and AffineIfOp.
Definition at line 242 of file AffineAnalysis.cpp.
References mlir::affine::FlatAffineValueConstraints::addAffineForOpDomain(), mlir::affine::FlatAffineValueConstraints::addAffineIfOpDomain(), mlir::affine::FlatAffineValueConstraints::addAffineParallelOpDomain(), and extractInductionVars().
Referenced by checkIfHyperRectangular(), createFullTiles(), createSeparationCondition(), and getOpIndexSet().
unsigned mlir::affine::getInnermostCommonLoopDepth | ( | ArrayRef< Operation * > | ops, |
SmallVectorImpl< AffineForOp > * | surroundingLoops = nullptr |
||
) |
Returns the innermost common loop depth for the set of operations in 'ops'.
Definition at line 1385 of file Utils.cpp.
References getAffineForIVs(), max(), and min().
Referenced by computeSliceUnion(), and getMaxLoopDepth().
std::optional< uint64_t > mlir::affine::getIntOrFloatMemRefSizeInBytes | ( | MemRefType | memRefType | ) |
Returns the size of a memref with element type int or float in bytes if it's statically shaped, std::nullopt otherwise.
If the element of the memref has vector type, takes into account size of the vector as well.
Definition at line 1246 of file Utils.cpp.
References getMemRefIntOrFloatEltSizeInBytes().
Referenced by generateCopy().
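The size computation can be modelled on plain integers as follows. This is a sketch under assumptions rather than the MLIR code: memRefSizeInBytes and kDynamicDim are hypothetical names, a dynamic dimension is represented by a negative extent, and the element size is taken to be the total bit width (scalar width times vector element count) rounded up to whole bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical marker for a dimension whose extent is not known statically.
constexpr int64_t kDynamicDim = -1;

// Returns the byte size of a statically shaped buffer, or std::nullopt if
// any dimension is dynamic. vectorElements is 1 for scalar element types.
std::optional<uint64_t> memRefSizeInBytes(const std::vector<int64_t> &shape,
                                          uint64_t elementBitWidth,
                                          uint64_t vectorElements = 1) {
  assert(elementBitWidth > 0 && vectorElements > 0 &&
         "element type must have a non-zero size");
  // Round the element's total bit width up to whole bytes.
  uint64_t elementBytes = (elementBitWidth * vectorElements + 7) / 8;
  uint64_t numElements = 1;
  for (int64_t dim : shape) {
    if (dim < 0)
      return std::nullopt; // dynamic shape: size is not a constant
    numElements *= static_cast<uint64_t>(dim);
  }
  return numElements * elementBytes;
}
```

A 4x8 buffer of 32-bit elements occupies 128 bytes in this model; if each element is instead a vector of four 32-bit values, the same shape occupies 512 bytes.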
Given an induction variable iv of type AffineForOp and indices of type IndexType, returns the set of indices that are independent of iv.
Prerequisites (inherited from isAccessInvariant above):
iv and indices of the proper type;
indices;
Emits a note if it encounters a chain of affine.apply and treats those cases conservatively.
Definition at line 182 of file LoopAnalysis.cpp.
References isAccessIndexInvariant().
Referenced by makePermutationMap().
uint64_t mlir::affine::getLargestDivisorOfTripCount | ( | AffineForOp | forOp | ) |
Returns the greatest known integral divisor of the trip count.
Affine expression analysis is used (indirectly through getTripCount), and this method is thus able to determine non-trivial divisors.
Definition at line 117 of file LoopAnalysis.cpp.
References mlir::AffineMap::getNumResults(), mlir::AffineMap::getResults(), getTripCountMapAndOperands(), and max().
Referenced by constructTiledIndexSetHyperRect(), loopUnrollByFactor(), and loopUnrollJamByFactor().
bool mlir::affine::getLoopNestStats | ( | AffineForOp | forOpRoot, |
LoopNestStats * | stats | ||
) |
Collects loop nest statistics (e.g., loop trip count and operation count) in 'stats' for the loop nest rooted at 'forOpRoot'. Returns true on success, false otherwise.
Definition at line 475 of file LoopFusionUtils.cpp.
References mlir::WalkResult::advance(), getConstantTripCount(), mlir::WalkResult::interrupt(), mlir::affine::LoopNestStats::loopMap, mlir::affine::LoopNestStats::opCountMap, and mlir::affine::LoopNestStats::tripCountMap.
Referenced by isFusionProfitable().
std::optional< int64_t > mlir::affine::getMemoryFootprintBytes | ( | AffineForOp | forOp, |
int | memorySpace = -1 |
||
) |
Gets the memory footprint of all data touched in the specified memory space in bytes; if the memory space is unspecified, considers all memory spaces.
Definition at line 1953 of file Utils.cpp.
References getMemoryFootprintBytes().
Referenced by isFusionProfitable().
std::optional< int64_t > mlir::affine::getMemRefIntOrFloatEltSizeInBytes | ( | MemRefType | memRefType | ) |
Returns the memref's element type's size in bytes where the elemental type is an int or float or a vector of such types.
Definition at line 1196 of file Utils.cpp.
References mlir::detail::divideCeil().
Referenced by createPrivateMemRef(), getIntOrFloatMemRefSizeInBytes(), and mlir::affine::MemRefRegion::getRegionSize().
unsigned mlir::affine::getNestingDepth | ( | Operation * | op | ) |
Returns the nesting depth of this operation, i.e., the number of loops surrounding this operation.
Definition at line 1848 of file Utils.cpp.
References mlir::Operation::getParentOp().
Referenced by affineDataCopyGenerate(), computeSliceUnion(), getMemoryFootprintBytes(), and isLoopMemoryParallel().
Returns the number of surrounding loops common to both A and B.
Returns the number of surrounding loops common to 'loopsA' and 'loopsB', where each lists loops from outer-most to inner-most in loop nest.
Definition at line 1893 of file Utils.cpp.
References getAffineIVs(), and min().
Referenced by canFuseLoops(), getMaxLoopDepth(), hasNoInterveningEffect(), mayHaveEffect(), and mustReachAtInnermost().
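Counting common surrounding loops amounts to measuring the shared prefix of two loop lists, each ordered from outermost to innermost. The sketch below illustrates that idea only; numCommonSurroundingLoops here is a hypothetical stand-alone helper in which loops are identified by integer ids rather than by IR values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: each vector lists the ids of the loops surrounding an
// operation, ordered from outermost to innermost.
unsigned numCommonSurroundingLoops(const std::vector<int> &loopsA,
                                   const std::vector<int> &loopsB) {
  std::size_t minDepth = std::min(loopsA.size(), loopsB.size());
  unsigned numCommon = 0;
  for (std::size_t i = 0; i < minDepth; ++i) {
    // Stop at the first depth where the two nests diverge.
    if (loopsA[i] != loopsB[i])
      break;
    ++numCommon;
  }
  return numCommon;
}
```

Two operations nested in loops (1, 2, 3) and (1, 2, 4) share the two outer loops, so the helper returns 2.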
void mlir::affine::getPerfectlyNestedLoops | ( | SmallVectorImpl< AffineForOp > & | nestedLoops, |
AffineForOp | root | ||
) |
Gets the perfectly nested sequence of loops starting at the root of the loop nest.
A loop is perfectly nested iff the first op in the loop's body is another AffineForOp and the second op is a terminator.
Definition at line 856 of file LoopUtils.cpp.
References mlir::Block::begin(), mlir::Block::end(), mlir::Block::front(), and max().
Referenced by coalescePerfectlyNestedAffineLoops(), getTileableBands(), and sinkSequentialLoops().
void mlir::affine::getReachableAffineApplyOps | ( | ArrayRef< Value > | operands, |
SmallVectorImpl< Operation * > & | affineApplyOps | ||
) |
Returns in affineApplyOps the sequence of those AffineApplyOp Operations that are reachable via a search starting from operands and ending at those operands that are not the result of an AffineApplyOp.
Definition at line 192 of file AffineAnalysis.cpp.
Referenced by createAffineComputationSlice().
LogicalResult mlir::affine::getRelationFromMap | ( | AffineMap & | map, |
presburger::IntegerRelation & | rel | ||
) |
Builds a relation from the given AffineMap/AffineValueMap map, containing all pairs of the form operands -> result that satisfy map.
rel is set to the relation built. For example, given the AffineMap:
(d0, d1)[s0] -> (d0 + s0, d0 - s0)
the resulting relation formed is:
(d0, d1) -> (r1, r2)
[d0  d1  r1  r2  s0  const]
  1   0  -1   0   1     0  = 0
  0   1   0  -1  -1     0  = 0
For AffineValueMap, the domain and symbols have Value set corresponding to the Value in map. Returns failure if the AffineMap could not be flattened (i.e., semi-affine is not yet handled).
Definition at line 492 of file AffineStructures.cpp.
References mlir::presburger::IntegerRelation::addEquality(), mlir::FlatLinearValueConstraints::appendDimVar(), mlir::getFlattenedAffineExprs(), mlir::presburger::IntegerRelation::getNumCols(), mlir::presburger::IntegerRelation::getNumDimAndSymbolVars(), mlir::AffineMap::getNumDims(), mlir::presburger::IntegerRelation::getNumDimVars(), mlir::AffineMap::getNumResults(), and mlir::FlatLinearValueConstraints::setValue().
Referenced by mlir::affine::MemRefAccess::getAccessRelation(), and getRelationFromMap().
LogicalResult mlir::affine::getRelationFromMap | ( | const AffineValueMap & | map, |
presburger::IntegerRelation & | rel | ||
) |
Definition at line 532 of file AffineStructures.cpp.
References mlir::affine::AffineValueMap::getAffineMap(), mlir::AffineMap::getNumDims(), mlir::presburger::IntegerRelation::getNumDimVars(), mlir::AffineMap::getNumResults(), mlir::presburger::IntegerRelation::getNumSymbolVars(), mlir::affine::AffineValueMap::getOperand(), getRelationFromMap(), and mlir::presburger::IntegerRelation::setId().
void mlir::affine::getSequentialLoops | ( | AffineForOp | forOp, |
llvm::SmallDenseSet< Value, 8 > * | sequentialLoops | ||
) |
Returns in 'sequentialLoops' all sequential loops in loop nest rooted at 'forOp'.
Definition at line 1971 of file Utils.cpp.
References isLoopParallel().
Referenced by getComputationSliceState().
uint64_t mlir::affine::getSliceIterationCount | ( | const llvm::SmallDenseMap< Operation *, uint64_t, 8 > & | sliceTripCountMap | ) |
Return the number of iterations for the sliceTripCountMap provided.
Definition at line 1639 of file Utils.cpp.
Referenced by fuseLoops(), getComputationSliceState(), and getFusionComputeCost().
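One plausible reading of the iteration count of a slice is the product of the per-loop trip counts stored in the map. The sketch below illustrates that reading only and is an assumption, not a statement about the implementation: sliceIterationCount is a hypothetical helper, and loops are keyed by name instead of by Operation pointer.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical model: maps each slice loop (by name) to its trip count and
// multiplies the counts together to obtain the total number of iterations.
uint64_t
sliceIterationCount(const std::map<std::string, uint64_t> &sliceTripCountMap) {
  uint64_t iterationCount = 1;
  for (const auto &entry : sliceTripCountMap) {
    // Guard against overflow of the running product.
    assert((entry.second == 0 ||
            iterationCount <= UINT64_MAX / entry.second) &&
           "iteration count overflows 64 bits");
    iterationCount *= entry.second;
  }
  return iterationCount;
}
```

A slice with two loops of 4 and 8 iterations runs its body 32 times under this model; an empty map yields 1.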
void mlir::affine::getSupportedReductions | ( | AffineForOp | forOp, |
SmallVectorImpl< LoopReduction > & | supportedReductions | ||
) |
Populate supportedReductions with descriptors of the supported reductions.
Definition at line 85 of file AffineAnalysis.cpp.
References getSupportedReduction().
Referenced by isLoopParallel(), and loopUnrollJamByFactor().
void mlir::affine::getTileableBands | ( | func::FuncOp | f, |
std::vector< SmallVector< AffineForOp, 6 >> * | bands | ||
) |
Identify valid and profitable bands of loops to tile.
This is currently just a temporary placeholder to test the mechanics of tiled code generation. Returns all maximal outermost perfect loop nests to tile.
Definition at line 873 of file LoopUtils.cpp.
References getPerfectlyNestedLoops().
void mlir::affine::getTripCountMapAndOperands | ( | AffineForOp | forOp, |
AffineMap * | tripCountMap, | ||
SmallVectorImpl< Value > * | tripCountOperands | ||
) |
Returns the trip count of the loop as an affine map with its corresponding operands if the latter is expressible as an affine expression, and nullptr otherwise.
This method always succeeds as long as the lower bound is not a multi-result map. The trip count expression is simplified before returning. This method only utilizes map composition to construct lower and upper bounds before computing the trip count expressions.
Definition at line 41 of file LoopAnalysis.cpp.
References mlir::AffineExpr::ceilDiv(), mlir::affine::AffineValueMap::difference(), mlir::AffineMap::get(), mlir::affine::AffineValueMap::getAffineMap(), mlir::AffineMap::getConstantMap(), mlir::affine::AffineValueMap::getNumResults(), mlir::affine::AffineValueMap::getOperands(), mlir::affine::AffineValueMap::getResult(), and mlir::affine::AffineValueMap::setResult().
Referenced by getCleanupLoopLowerBound(), getConstantTripCount(), and getLargestDivisorOfTripCount().
|
inline |
Utility function that returns true if the provided DependenceResult corresponds to a dependence result.
Definition at line 179 of file AffineAnalysis.h.
References mlir::affine::DependenceResult::HasDependence, and mlir::affine::DependenceResult::value.
Referenced by getDependenceComponents(), getMaxLoopDepth(), isTilingValid(), and mustReachAtInnermost().
bool mlir::affine::hasNoInterveningEffect | ( | Operation * | start, |
T | memOp, | ||
llvm::function_ref< bool(Value, Value)> | mayAlias | ||
) |
Ensure that all operations that could be executed after start (noninclusive) and prior to memOp (e.g. on a control flow/op path between the operations) do not have the potential memory effect EffectType on memOp. memOp is an operation that reads or writes to a memref. For example, if EffectType is MemoryEffects::Write, this method will check if there is no write to the memory between start and memOp that would change the read within memOp.
Definition at line 682 of file Utils.cpp.
References mlir::Block::end(), mlir::Operation::getBlock(), getNumCommonSurroundingLoops(), mlir::Operation::getParentOp(), mlir::Operation::getParentRegion(), mlir::Region::getParentRegion(), mlir::Block::getSuccessors(), mlir::Block::getTerminator(), mlir::Operation::isAncestor(), mlir::Region::isAncestor(), mayAlias(), and mayHaveEffect().
LogicalResult mlir::affine::hoistAffineIfOp | ( | AffineIfOp | ifOp, |
bool * | folded = nullptr |
||
) |
Hoists out affine.if/else to as high as possible, i.e., past all invariant affine.fors/parallel's.
Returns success if any hoisting happened; folded is set to true if the op was folded or erased. This hoisting could lead to significant code expansion in some cases.
Definition at line 411 of file Utils.cpp.
References mlir::applyOpPatternsGreedily(), mlir::applyPatternsGreedily(), mlir::config, mlir::ExistingOps, getOutermostInvariantForOp(), and mlir::patterns.
AffineForOp mlir::affine::insertBackwardComputationSlice | ( | Operation * | srcOpInst, |
Operation * | dstOpInst, | ||
unsigned | dstLoopDepth, | ||
ComputationSliceState * | sliceState | ||
) |
Creates a clone of the computation contained in the loop nest surrounding 'srcOpInst', slices the iteration space of src loop based on slice bounds in 'sliceState', and inserts the computation slice at the beginning of the operation block of the loop at 'dstLoopDepth' in the loop nest surrounding 'dstOpInst'.
Creates a computation slice of the loop nest surrounding 'srcOpInst', updates the slice loop bounds with any non-null bound maps specified in 'sliceState', and inserts this slice into the loop nest surrounding 'dstOpInst' at loop depth 'dstLoopDepth'.
Returns the top-level loop of the computation slice on success, returns nullptr otherwise.
Definition at line 1767 of file Utils.cpp.
References mlir::OpBuilder::clone(), mlir::Operation::emitError(), findInstPosition(), getAffineForIVs(), getInstAtPosition(), mlir::affine::ComputationSliceState::lbOperands, mlir::affine::ComputationSliceState::lbs, mlir::affine::ComputationSliceState::ubOperands, and mlir::affine::ComputationSliceState::ubs.
void mlir::affine::interchangeLoops | ( | AffineForOp | forOpA, |
AffineForOp | forOpB | ||
) |
Performs loop interchange on 'forOpA' and 'forOpB', where 'forOpB' is nested within 'forOpA' as the only non-terminator operation in its block.
Requires that 'forOpA' and 'forOpB' are part of a perfectly nested sequence of loops.
Definition at line 1294 of file LoopUtils.cpp.
bool mlir::affine::isAffineForInductionVar | ( | Value | val | ) |
Returns true if the provided value is the induction variable of an AffineForOp.
Definition at line 2553 of file AffineOps.cpp.
References getForInductionVarOwner().
Referenced by addMissingLoopIVBounds(), getNumCommonLoops(), isAccessIndexInvariant(), and isAffineInductionVar().
bool mlir::affine::isAffineInductionVar | ( | Value | val | ) |
Returns true if the provided value is the induction variable of an AffineForOp or AffineParallelOp.
Definition at line 2561 of file AffineOps.cpp.
References isAffineForInductionVar(), and isAffineParallelInductionVar().
Referenced by mlir::affine::FlatAffineValueConstraints::addInductionVarOrTerminalSymbol(), and mlir::affine::MemRefRegion::compute().
bool mlir::affine::isAffineParallelInductionVar | ( | Value | val | ) |
Returns true if val is the induction variable of an AffineParallelOp.
Definition at line 2557 of file AffineOps.cpp.
References getAffineParallelInductionVarOwner().
Referenced by getNumCommonLoops(), and isAffineInductionVar().
bool mlir::affine::isContiguousAccess | ( | Value | iv, |
LoadOrStoreOp | memoryOp, | ||
int * | memRefDim | ||
) |
Given:
1. iv of type AffineForOp;
2. memoryOp of type const LoadOp& or const StoreOp&;
determines whether memoryOp has a contiguous access along iv. Contiguous is defined as either invariant or varying only along a unique MemRef dim. Upon success, the unique MemRef dim is written in memRefDim (or -1 to convey the memRef access is invariant along iv).
Prerequisites:
1. memRefDim != nullptr;
2. iv of the proper type;
3. memoryOp has no layout map or at most an identity layout map.
Currently only supports no layout map or identity layout map in the memref. Returns false if the memref has a non-identity layoutMap. This behavior is conservative.
Definition at line 195 of file LoopAnalysis.cpp.
References isAccessIndexInvariant().
Referenced by isVectorizableLoopBody().
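The contiguity rule described for isContiguousAccess() can be sketched outside MLIR. The following is a minimal standalone model, not the MLIR API: the name isContiguousAccessModel and the variesWithIv encoding are hypothetical, and numbering dimensions by their index position is an assumption of this sketch.

```cpp
#include <cassert>
#include <vector>

// Standalone model, not the MLIR API. variesWithIv[d] is true when the index
// used for memref dimension d changes with the loop induction variable.
// Returns true if the access is invariant or varies along exactly one
// dimension; *memRefDim receives that dimension, or -1 if invariant.
bool isContiguousAccessModel(const std::vector<bool> &variesWithIv,
                             int *memRefDim) {
  assert(memRefDim != nullptr && "memRefDim must be non-null");
  int uniqueVaryingDim = -1;
  for (int dim = 0, e = static_cast<int>(variesWithIv.size()); dim < e; ++dim) {
    if (!variesWithIv[dim])
      continue;
    if (uniqueVaryingDim != -1)
      return false; // varies along a second dimension: not contiguous
    uniqueVaryingDim = dim;
  }
  *memRefDim = uniqueVaryingDim;
  return true;
}
```

For an access whose indices vary only in dimension 1, the model reports dimension 1; an access that varies in two dimensions is rejected.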
bool mlir::affine::isInvariantAccess | ( | LoadOrStoreOp | memOp, |
AffineForOp | forOp | ||
) |
Checks whether an affine read or write operation is independent of forOp's IV, i.e., whether the memory access is invariant on forOp.
Definition at line 168 of file LoopAnalysis.cpp.
References mlir::affine::AffineValueMap::composeSimplifyAndCanonicalize(), and mlir::affine::AffineValueMap::getOperands().
bool mlir::affine::isLoopMemoryParallel | ( | AffineForOp | forOp | ) |
Returns true if ‘forOp’ doesn't have memory dependences preventing parallelization.
Memrefs that are allocated inside forOp do not impact its dependences and parallelism. This function does not check iter_args (for values other than memref types) and should be used only as a building block for complete parallelism-checking functions.
Definition at line 139 of file AffineAnalysis.cpp.
References mlir::WalkResult::advance(), checkMemrefAccessDependence(), getNestingDepth(), mlir::WalkResult::interrupt(), isLocallyDefined(), mlir::isMemoryEffectFree(), mlir::affine::DependenceResult::NoDependence, and mlir::affine::DependenceResult::value.
Referenced by isLoopParallel().
bool mlir::affine::isLoopParallel | ( | AffineForOp | forOp, |
SmallVectorImpl< LoopReduction > * | parallelReductions = nullptr | ||
) |
Returns true if ‘forOp’ is a parallel loop.
If parallelReductions is provided, populates it with descriptors of the parallelizable reductions and treats them as not preventing parallelization.
Definition at line 101 of file AffineAnalysis.cpp.
References getSupportedReductions(), and isLoopMemoryParallel().
Referenced by getSequentialLoops(), and isLoopParallelAndContainsReduction().
bool mlir::affine::isLoopParallelAndContainsReduction | ( | AffineForOp | forOp | ) |
Returns whether a loop is parallel and contains a reduction loop.
Definition at line 1962 of file Utils.cpp.
References isLoopParallel().
Referenced by fuseLoops(), and getComputationSliceState().
bool mlir::affine::isOpwiseShiftValid | ( | AffineForOp | forOp, |
ArrayRef< uint64_t > | shifts | ||
) |
Checks whether SSA dominance would be violated if a for op's body operations are shifted by the specified shifts.
This method checks if a 'def' and all its uses have the same shift factor.
Definition at line 360 of file LoopAnalysis.cpp.
References mlir::detail::enumerate(), and mlir::Value::getUsers().
Referenced by affineForOpBodySkew().
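The rule stated for isOpwiseShiftValid(), that a 'def' and all its uses must share the same shift factor, can be modeled on plain data. This is a hedged sketch, not the MLIR implementation: isOpwiseShiftValidModel and the users adjacency list are hypothetical stand-ins for the operations in the loop body and their def-use edges.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Standalone model, not the MLIR API. users[i] lists the body positions of
// the operations that consume a result of the operation at position i.
bool isOpwiseShiftValidModel(
    const std::vector<std::vector<std::size_t>> &users,
    const std::vector<uint64_t> &shifts) {
  assert(users.size() == shifts.size() && "one shift per body operation");
  for (std::size_t def = 0; def < users.size(); ++def) {
    for (std::size_t user : users[def]) {
      assert(user < shifts.size() && "user must be a body operation");
      if (shifts[def] != shifts[user])
        return false; // a def and one of its uses would be shifted apart
    }
  }
  return true;
}
```

A chain op0 -> op1 -> op2 is accepted with shifts {1, 1, 1} and rejected with shifts {0, 0, 1}, because op1 and its user op2 would land in different iterations.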
bool LLVM_ATTRIBUTE_UNUSED mlir::affine::isPerfectlyNested | ( | ArrayRef< AffineForOp > | loops | ) |
Returns true if loops is a perfectly nested loop nest, where loops appear in it from outermost to innermost.
Definition at line 1365 of file LoopUtils.cpp.
Referenced by performPreTilingChecks(), and permuteLoops().
bool mlir::affine::isTilingValid | ( | ArrayRef< AffineForOp > | loops | ) |
Checks whether hyper-rectangular loop tiling of the nest represented by loops is valid.
The validity condition is from Irigoin and Triolet, which states that two tiles cannot depend on each other. We simplify this condition to just checking whether there is any negative dependence direction, since we have the prior knowledge that the tiling results will be hyper-rectangles, which are scheduled in the lexicographically increasing order on the vector of loop indices. This function returns false when any dependence component is negative along any of loops.
Definition at line 397 of file LoopAnalysis.cpp.
References checkMemrefAccessDependence(), mlir::Operation::dump(), hasDependence(), and mlir::affine::MemRefAccess::opInst.
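The simplified condition used by isTilingValid() (no negative dependence component along any loop of the band) can be illustrated with constant dependence distances. This is only a sketch under that assumption: the real analysis works on dependence components with lower and upper bounds, and isTilingValidModel is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Standalone model, not the MLIR API. Each inner vector holds constant
// dependence distances, one per loop in the band, outermost first.
bool isTilingValidModel(const std::vector<std::vector<int64_t>> &distances) {
  for (const std::vector<int64_t> &dep : distances) {
    assert(dep.size() == distances.front().size() &&
           "every dependence needs one distance per loop");
    for (int64_t component : dep)
      if (component < 0)
        return false; // a negative component can make tiles depend on each other
  }
  return true;
}
```

A dependence with distance (1, -1) is lexicographically positive and therefore legal in the original loop order, yet it makes hyper-rectangular tiling invalid in this model.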
bool mlir::affine::isTopLevelValue | ( | Value | value | ) |
A utility function to check if a value is defined at the top level of an op with trait AffineScope or is a region argument for such an op.
A value of index type defined at the top level is always a valid symbol for all its uses. If the value is defined in an unlinked region, conservatively assume it is not top-level.
Definition at line 248 of file AffineOps.cpp.
References mlir::Value::getDefiningOp(), mlir::Operation::getParentOp(), and mlir::Operation::hasTrait().
Referenced by mlir::affine::FlatAffineValueConstraints::addInductionVarOrTerminalSymbol(), isDimOpValidSymbol(), isValidDim(), isValidSymbol(), and remainsLegalAfterInline().
A utility function to check if a value is defined at the top level of region or is an argument of region.
A value of index type defined at the top level of an AffineScope region is always a valid symbol for all uses in that region.
Definition at line 48 of file AffineOps.cpp.
References mlir::Value::getDefiningOp(), and mlir::Operation::getParentRegion().
bool mlir::affine::isValidDim | ( | Value | value | ) |
Returns true if the given Value can be used as a dimension id in the region of the closest surrounding op that has the trait AffineScope.
Definition at line 278 of file AffineOps.cpp.
References getAffineScope(), mlir::Value::getDefiningOp(), mlir::Value::getType(), and mlir::Type::isIndex().
Referenced by isValidAffineIndexOperand(), remainsLegalAfterInline(), replaceAllMemRefUsesWith(), transformMemRefLoadWithReducedRank(), and verifyDimAndSymbolIdentifiers().
Returns true if the given Value can be used as a dimension id in region, i.e., for all its uses in region.
Definition at line 298 of file AffineOps.cpp.
References mlir::Value::getDefiningOp(), mlir::Operation::getParentOp(), mlir::Value::getType(), mlir::Type::isIndex(), isTopLevelValue(), and isValidSymbol().
bool mlir::affine::isValidLoopInterchangePermutation | ( | ArrayRef< AffineForOp > | loops, |
ArrayRef< unsigned > | loopPermMap | ||
) |
Checks if the loop interchange permutation 'loopPermMap' of the perfectly nested sequence of loops in 'loops' would violate dependences (loop 'i' in 'loops' is mapped to location 'j = loopPermMap[i]' in the interchange).
Definition at line 1351 of file LoopUtils.cpp.
References checkLoopInterchangeDependences(), and getDependenceComponents().
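A common way to reason about the check performed by isValidLoopInterchangePermutation() is the classic lexicographic test on dependence distance vectors. The sketch below assumes constant distances and uses the hypothetical name isValidInterchangeModel; it is an illustration of the idea, not the MLIR code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Standalone model, not the MLIR API. Checks that every constant dependence
// distance vector stays lexicographically non-negative after loop i is moved
// to position permMap[i].
bool isValidInterchangeModel(
    const std::vector<std::vector<int64_t>> &distances,
    const std::vector<unsigned> &permMap) {
  for (const std::vector<int64_t> &dep : distances) {
    assert(dep.size() == permMap.size() && "one distance per loop");
    std::vector<int64_t> permuted(dep.size());
    for (std::size_t i = 0; i < dep.size(); ++i)
      permuted[permMap[i]] = dep[i];
    for (int64_t component : permuted) {
      if (component > 0)
        break; // carried by an outer loop: dependence preserved
      if (component < 0)
        return false; // direction reversed: dependence violated
    }
  }
  return true;
}
```

With a single dependence (1, -1), the identity permutation {0, 1} is accepted, while swapping the two loops with {1, 0} yields (-1, 1) and is rejected.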
bool mlir::affine::isValidSymbol | ( | Value | value | ) |
Returns true if the given value can be used as a symbol in the region of the closest surrounding op that has the trait AffineScope.
Definition at line 392 of file AffineOps.cpp.
References getAffineScope(), mlir::Value::getDefiningOp(), mlir::Value::getType(), mlir::Type::isIndex(), and isTopLevelValue().
Referenced by canonicalizePromotedSymbols(), mlir::affine::MemRefRegion::compute(), mlir::affine::ComputationSliceState::getAsConstraints(), isMemRefSizeValidSymbol(), isValidAffineIndexOperand(), isValidDim(), isValidSymbol(), remainsLegalAfterInline(), replaceAllMemRefUsesWith(), transformMemRefLoadWithReducedRank(), and verifyDimAndSymbolIdentifiers().
Returns true if the given Value can be used as a symbol for region, i.e., for all its uses in region.
A value can be used as a symbol for region iff it meets one of the following conditions:
*) It is a constant.
*) It is the result of an affine apply operation with symbol arguments.
*) It is a result of the dim op on a memref whose corresponding size is a valid symbol.
*) It is defined at the top level of 'region' or is its argument.
*) It dominates region's parent op.
If region is null, conservatively assume the symbol definition scope does not exist and only accept the values that would be symbols regardless of the surrounding region structure, i.e. the first three cases above.
Definition at line 421 of file AffineOps.cpp.
References mlir::Value::getDefiningOp(), mlir::Region::getParentOp(), mlir::Operation::getParentRegion(), mlir::Value::getType(), mlir::Operation::hasTrait(), isDimOpValidSymbol(), mlir::Type::isIndex(), isTopLevelValue(), isValidSymbol(), mlir::m_Constant(), and mlir::matchPattern().
bool mlir::affine::isVectorizableLoopBody | ( | AffineForOp | loop, |
int * | memRefDim, | ||
NestedPattern & | vectorTransferMatcher | ||
) |
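Checks whether the loop is structurally vectorizable and that all the LoadOp and StoreOp matched have access indexing functions that are either:
1. invariant along the loop induction variable created by 'loop';
2. varying along at most one memory dimension. If such a unique dimension is found, it is written into memRefDim.
Definition at line 324 of file LoopAnalysis.cpp.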
References isContiguousAccess(), and isVectorizableLoopBodyWithOpCond().
bool mlir::affine::isVectorizableLoopBody | ( | AffineForOp | loop, |
NestedPattern & | vectorTransferMatcher | ||
) |
Checks whether the loop is structurally vectorizable.
Definition at line 350 of file LoopAnalysis.cpp.
References isVectorizableLoopBodyWithOpCond().
Referenced by isVectorizableLoopPtrFactory().
OpFoldResult mlir::affine::linearizeIndex | ( | ArrayRef< OpFoldResult > | multiIndex, |
ArrayRef< OpFoldResult > | basis, | ||
ImplicitLocOpBuilder & | builder | ||
) |
Definition at line 2006 of file Utils.cpp.
References mlir::ImplicitLocOpBuilder::getLoc().
Referenced by mlir::mesh::createProcessLinearIndex(), mlir::spirv::getOpenCLElementPtr(), and mlir::spirv::getVulkanElementPtr().
OpFoldResult mlir::affine::linearizeIndex | ( | OpBuilder & | builder, |
Location | loc, | ||
ArrayRef< OpFoldResult > | multiIndex, | ||
ArrayRef< OpFoldResult > | basis | ||
) |
Definition at line 2012 of file Utils.cpp.
References mlir::computeLinearIndex(), mlir::computeStrides(), mlir::getAffineConstantExpr(), mlir::getAffineSymbolExpr(), mlir::Builder::getContext(), mlir::Builder::getIndexAttr(), and makeComposedFoldedAffineApply().
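The arithmetic behind linearizeIndex() can be sketched on plain integers. This model assumes the usual row-major convention (the stride of dimension i is the product of the basis entries to its right); linearizeIndexModel is a hypothetical name and the code does not use the MLIR API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Standalone model on plain integers, not the MLIR API. Assumes row-major
// strides: stride(i) = basis[i + 1] * basis[i + 2] * ... * basis[n - 1].
int64_t linearizeIndexModel(const std::vector<int64_t> &multiIndex,
                            const std::vector<int64_t> &basis) {
  assert(multiIndex.size() == basis.size() && "one basis entry per index");
  int64_t linear = 0;
  for (std::size_t i = 0; i < multiIndex.size(); ++i)
    linear = linear * basis[i] + multiIndex[i]; // Horner form of sum(index * stride)
  return linear;
}
```

For multiIndex (1, 2, 3) and basis (4, 5, 6) the strides are (30, 6, 1), so the result is 1*30 + 2*6 + 3 = 45.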
LogicalResult mlir::affine::loopUnrollByFactor | ( | AffineForOp | forOp, |
uint64_t | unrollFactor, | ||
function_ref< void(unsigned, Operation *, OpBuilder)> | annotateFn = nullptr, | ||
bool | cleanUpUnroll = false | ||
) |
Unrolls this for operation by the specified unroll factor.
Returns failure if the loop cannot be unrolled either due to restrictions or due to invalid unroll factors, and success if the loop is successfully unrolled. Requires positive loop bounds and step. If specified, annotates the Ops in each unrolled iteration by applying annotateFn. When cleanUpUnroll is true, we can ensure the cleanup loop is unrolled regardless of the unroll factor.
Definition at line 1008 of file LoopUtils.cpp.
References generateCleanupLoopForUnroll(), generateUnrolledLoop(), getConstantTripCount(), getLargestDivisorOfTripCount(), loopUnrollFull(), and promoteIfSingleIteration().
Referenced by loopUnrollFull(), and loopUnrollUpToFactor().
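As a rough picture of what unrolling by a factor produces, the sketch below hand-writes the result of unrolling a summation loop by 4: a main loop whose body is replicated four times, followed by a cleanup loop for the remaining iterations. The function sumUnrolledBy4 is hypothetical and only mirrors the shape of the transformation; loopUnrollByFactor() itself rewrites affine.for operations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hand-unrolled equivalent of: for (i = 0; i < n; ++i) sum += data[i];
int64_t sumUnrolledBy4(const std::vector<int64_t> &data) {
  const std::size_t n = data.size();
  const std::size_t mainEnd = n - n % 4; // largest multiple of 4 not above n
  int64_t sum = 0;
  // Main loop: step multiplied by the unroll factor, body replicated 4 times.
  for (std::size_t i = 0; i < mainEnd; i += 4) {
    sum += data[i];
    sum += data[i + 1];
    sum += data[i + 2];
    sum += data[i + 3];
  }
  // Cleanup loop: the n % 4 iterations the main loop did not cover.
  for (std::size_t i = mainEnd; i < n; ++i)
    sum += data[i];
  return sum;
}
```

With ten elements the main loop runs twice (covering eight elements) and the cleanup loop handles the last two.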
LogicalResult mlir::affine::loopUnrollFull | ( | AffineForOp | forOp | ) |
Unrolls this for operation completely if the trip count is known to be constant.
Returns failure otherwise.
Definition at line 885 of file LoopUtils.cpp.
References getConstantTripCount(), loopUnrollByFactor(), and promoteIfSingleIteration().
Referenced by affineForOpBodySkew(), and loopUnrollByFactor().
LogicalResult mlir::affine::loopUnrollJamByFactor | ( | AffineForOp | forOp, |
uint64_t | unrollJamFactor | ||
) |
Unrolls and jams this loop by the specified factor.
forOp can be a loop with iteration arguments performing supported reductions and its inner loops can have iteration arguments. Returns success if the loop is successfully unroll-jammed.
Definition at line 1098 of file LoopUtils.cpp.
References areInnerBoundsInvariant(), mlir::OpBuilder::clone(), mlir::OpBuilder::create(), generateCleanupLoopForUnroll(), mlir::AffineMap::get(), mlir::Builder::getAffineDimExpr(), getConstantTripCount(), mlir::Value::getDefiningOp(), getLargestDivisorOfTripCount(), mlir::arith::getReductionOp(), getSupportedReductions(), promoteIfSingleIteration(), mlir::OpBuilder::setInsertionPointAfter(), mlir::JamBlockGatherer< OpTy >::subBlocks, mlir::Value::use_empty(), and mlir::JamBlockGatherer< OpTy >::walk().
Referenced by loopUnrollJamUpToFactor().
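Unroll-and-jam differs from plain unrolling in that the replicated copies of the outer-loop body are fused ("jammed") into a single inner loop. The sketch below hand-writes that shape for a factor of 2 on a matrix sum, with one accumulator per copy to mimic a reduction carried through iteration arguments. The function sumUnrollJamBy2 is hypothetical and assumes a rectangular matrix; it is not produced by, and does not call, loopUnrollJamByFactor().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hand-written unroll-and-jam by 2 of:
//   for (i) for (j) sum += m[i][j];
int64_t sumUnrollJamBy2(const std::vector<std::vector<int64_t>> &m) {
  const std::size_t rows = m.size();
  const std::size_t cols = rows ? m[0].size() : 0;
  for (const std::vector<int64_t> &row : m)
    assert(row.size() == cols && "sketch assumes a rectangular matrix");
  const std::size_t mainEnd = rows - rows % 2;
  int64_t sum0 = 0, sum1 = 0; // one accumulator per unrolled copy
  for (std::size_t i = 0; i < mainEnd; i += 2) {
    // Jammed inner loop: both copies of the outer body share one j loop.
    for (std::size_t j = 0; j < cols; ++j) {
      sum0 += m[i][j];
      sum1 += m[i + 1][j];
    }
  }
  // Cleanup for a trailing row when the row count is odd.
  for (std::size_t i = mainEnd; i < rows; ++i)
    for (std::size_t j = 0; j < cols; ++j)
      sum0 += m[i][j];
  return sum0 + sum1;
}
```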
LogicalResult mlir::affine::loopUnrollJamUpToFactor | ( | AffineForOp | forOp, |
uint64_t | unrollJamFactor | ||
) |
Unrolls and jams this loop by the specified factor or by the trip count (if constant), whichever is lower.
Definition at line 1075 of file LoopUtils.cpp.
References getConstantTripCount(), and loopUnrollJamByFactor().
LogicalResult mlir::affine::loopUnrollUpToFactor | ( | AffineForOp | forOp, |
uint64_t | unrollFactor | ||
) |
Unrolls this loop by the specified unroll factor or by the trip count (if constant), whichever is lower.
Definition at line 900 of file LoopUtils.cpp.
References getConstantTripCount(), and loopUnrollByFactor().
AffineApplyOp mlir::affine::makeComposedAffineApply | ( | OpBuilder & | b, |
Location | loc, | ||
AffineExpr | e, | ||
ArrayRef< OpFoldResult > | operands | ||
) |
Definition at line 1154 of file AffineOps.cpp.
References mlir::Builder::getContext(), mlir::AffineMap::inferFromExprList(), and makeComposedAffineApply().
AffineApplyOp mlir::affine::makeComposedAffineApply | ( | OpBuilder & | b, |
Location | loc, | ||
AffineMap | map, | ||
ArrayRef< OpFoldResult > | operands | ||
) |
Returns a composed AffineApplyOp by composing map and operands with other AffineApplyOps supplying those operands.
The operands of the resulting AffineApplyOp do not change the length of AffineApplyOp chains.
Definition at line 1144 of file AffineOps.cpp.
References composeAffineMapAndOperands(), mlir::OpBuilder::create(), and mlir::foldAttributesIntoMap().
Referenced by commonLinearIdBuilderFn(), mlir::linalg::computeContinuousTileSizes(), mlir::linalg::computeMultiTileSizes(), mlir::gpu::WarpDistributionPattern::delinearizeLaneId(), getCollapsedOutputDimFromInputShape(), mlir::linalg::getConvolvedIndex(), getDivMod(), getExpandedOutputDimFromInputShape(), getXferIndices(), invertSliceIndexing(), makeComposedAffineApply(), makeComposedFoldedAffineApply(), mlir::memref::multiBuffer(), and mlir::linalg::updateBoundsForCyclicDistribution().
AffineMinOp mlir::affine::makeComposedAffineMin | ( | OpBuilder & | b, |
Location | loc, | ||
AffineMap | map, | ||
ArrayRef< OpFoldResult > | operands | ||
) |
Returns an AffineMinOp obtained by composing map and operands with AffineApplyOps supplying those operands.
Definition at line 1261 of file AffineOps.cpp.
OpFoldResult mlir::affine::makeComposedFoldedAffineApply | ( | OpBuilder & | b, |
Location | loc, | ||
AffineExpr | expr, | ||
ArrayRef< OpFoldResult > | operands | ||
) |
Variant of makeComposedFoldedAffineApply that applies to an expression.
Definition at line 1230 of file AffineOps.cpp.
References mlir::Builder::getContext(), mlir::AffineMap::inferFromExprList(), and makeComposedFoldedAffineApply().
OpFoldResult mlir::affine::makeComposedFoldedAffineApply | ( | OpBuilder & | b, |
Location | loc, | ||
AffineMap | map, | ||
ArrayRef< OpFoldResult > | operands | ||
) |
Constructs an AffineApplyOp that applies map to operands after composing the map with the maps of any other AffineApplyOp supplying the operands, then immediately attempts to fold it.
If folding results in a constant value, no ops are actually created. The map must be a single-result affine map.
Definition at line 1194 of file AffineOps.cpp.
References mlir::Builder::getContext(), mlir::OpBuilder::getInsertionBlock(), mlir::OpBuilder::getInsertionPoint(), mlir::OpBuilder::getListener(), mlir::AffineMap::getNumResults(), mlir::m_Constant(), makeComposedAffineApply(), mlir::matchPattern(), and mlir::OpBuilder::setInsertionPoint().
Referenced by mlir::affine::AffineBuilder::add(), mlir::tensor::bubbleUpPadSlice(), HopperBuilder::buildBarrierArriveTx(), buildLinearId(), HopperBuilder::buildTmaAsyncLoad(), calculateExpandedAccessIndices(), mlir::affine::AffineBuilder::ceil(), common3DIdBuilderFn(), commonLinearIdBuilderFn(), composedAffineMultiply(), mlir::linalg::computeContinuousTileSizes(), mlir::linalg::computeTileSizes(), createInBoundsCond(), mlir::tensor::createPadHighOp(), denormalizeInductionVariableForIndexType(), denormalizeIndVar(), emitNormalizedLoopBoundsForIndexType(), mlir::affine::AffineBuilder::floor(), getCollapsedIndices(), getCompressedMaskOp(), getFlatOffsetAndStrides(), getIndicesForLoadOrStore(), mlir::memref::getLinearizedMemRefOffsetAndSize(), getOffsetForBitwidth(), getProductOfIndexes(), getTileOffsetAndSizes(), getUserTileSizesAndNumThreads(), linearizeIndex(), mlir::linalg::lowerPack(), makeComposedFoldedAffineApply(), mergeOffsetsSizesAndStrides(), mlir::affine::AffineBuilder::mul(), normalizeUpperBounds(), mlir::linalg::offsetIndices(), resolveIndicesIntoOpWithOffsetsAndStrides(), resolveSourceIndicesCollapseShape(), resolveSourceIndicesExpandShape(), CopyBuilder::rewrite(), mlir::linalg::splitOp(), and mlir::affine::AffineBuilder::sub().
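The compose-then-fold behavior described for makeComposedFoldedAffineApply() can be modeled with linear expressions over named values. In this hedged sketch, LinearExpr and composeAndFold are hypothetical: producers plays the role of operands that are themselves defined by an earlier apply or constant, only one level of composition is performed, and an expression with no remaining variables stands for the case where folding yields a constant and no op is created.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>

// constant + sum(coeffs[name] * name); a model, not an MLIR AffineMap.
struct LinearExpr {
  int64_t constant = 0;
  std::map<std::string, int64_t> coeffs;
  bool isConstant() const { return coeffs.empty(); }
};

// Substitutes the definition of every operand found in `producers` into
// `expr`, then drops terms whose coefficient became zero.
LinearExpr composeAndFold(const LinearExpr &expr,
                          const std::map<std::string, LinearExpr> &producers) {
  LinearExpr result;
  result.constant = expr.constant;
  for (const auto &term : expr.coeffs) {
    auto producer = producers.find(term.first);
    if (producer == producers.end()) {
      result.coeffs[term.first] += term.second; // opaque operand: keep as is
      continue;
    }
    result.constant += term.second * producer->second.constant;
    for (const auto &inner : producer->second.coeffs)
      result.coeffs[inner.first] += term.second * inner.second;
  }
  for (auto it = result.coeffs.begin(); it != result.coeffs.end();)
    it = (it->second == 0) ? result.coeffs.erase(it) : std::next(it);
  return result;
}
```

If x is the constant 5, then 4*x + 1 composes and folds to the constant 21 with no variables left, which corresponds to returning an attribute rather than creating an affine.apply.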
OpFoldResult mlir::affine::makeComposedFoldedAffineMax | ( | OpBuilder & | b, |
Location | loc, | ||
AffineMap | map, | ||
ArrayRef< OpFoldResult > | operands | ||
) |
Constructs an AffineMaxOp that computes a maximum across the results of applying map to operands, then immediately attempts to fold it.
If folding results in a constant value, no ops are actually created.
Definition at line 1307 of file AffineOps.cpp.
Referenced by mlir::tensor::bubbleUpPadSlice(), buildMax(), getTileOffsetAndSizes(), and mlir::affine::AffineBuilder::max().
OpFoldResult mlir::affine::makeComposedFoldedAffineMin | ( | OpBuilder & | b, |
Location | loc, | ||
AffineMap | map, | ||
ArrayRef< OpFoldResult > | operands | ||
) |
Constructs an AffineMinOp that computes a minimum across the results of applying map to operands, then immediately attempts to fold it.
If folding results in a constant value, no ops are actually created.
Definition at line 1300 of file AffineOps.cpp.
Referenced by mlir::tensor::bubbleUpPadSlice(), buildMin(), getBoundedTileSize(), getTileOffsetAndSizes(), mlir::affine::AffineBuilder::min(), and mlir::linalg::splitOp().
SmallVector< OpFoldResult > mlir::affine::makeComposedFoldedMultiResultAffineApply | ( | OpBuilder & | b, |
Location | loc, | ||
AffineMap | map, | ||
ArrayRef< OpFoldResult > | operands | ||
) |
Variant of makeComposedFoldedAffineApply suitable for multi-result maps.
Note that this may create as many affine.apply operations as the map has results given that affine.apply must be single-result.
Definition at line 1241 of file AffineOps.cpp.
References mlir::AffineMap::getNumResults().
Referenced by mlir::linalg::computeMultiTileSizes(), getGenericOpLoopRange(), mlir::linalg::makeTiledLoopRanges(), and tileLinalgOpImpl().
void mlir::affine::mapLoopToProcessorIds | ( | scf::ForOp | forOp, |
ArrayRef< Value > | processorId, | ||
ArrayRef< Value > | numProcessors | ||
) |
Maps forOp for execution on a parallel grid of virtual processorIds of size given by numProcessors.
This is achieved by embedding the SSA values corresponding to processorIds and numProcessors into the bounds and step of the forOp. No check is performed on the legality of the rewrite; it is the caller's responsibility to ensure legality.
Requires that processorIds and numProcessors have the same size and that for each idx, processorIds[idx] takes, at runtime, all values between 0 and numProcessors[idx] - 1. This corresponds to traditional use cases for:
Example: Assuming a 2-d grid with processorIds = [blockIdx.x, threadIdx.x] and numProcessors = [gridDim.x, blockDim.x], the loop:
is rewritten into a version resembling the following pseudo-IR:
Definition at line 1725 of file LoopUtils.cpp.
References mlir::bindSymbols(), mlir::OpBuilder::create(), and mlir::AffineMap::get().
Referenced by mlir::linalg::GenerateLoopNest< LoopTy >::doit(), insertCopyLoops(), and mlir::linalg::tileReductionUsingForall().
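The bound and step rewrite performed by mapLoopToProcessorIds() can be modeled with integers in place of SSA values. The sketch assumes the processor ids are linearized outermost first and that the lower bound is offset by the linear id times the original step while the step is scaled by the total processor count; LoopBounds and mapLoopToProcessorIdsModel are hypothetical names, not MLIR API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct LoopBounds {
  int64_t lb, ub, step;
};

// Standalone integer model of the rewrite, not the MLIR API.
LoopBounds mapLoopToProcessorIdsModel(LoopBounds loop,
                                      const std::vector<int64_t> &processorIds,
                                      const std::vector<int64_t> &numProcessors) {
  assert(processorIds.size() == numProcessors.size() &&
         "one grid size per processor id");
  int64_t linearId = 0;
  int64_t totalProcessors = 1;
  for (std::size_t i = 0; i < processorIds.size(); ++i) {
    assert(0 <= processorIds[i] && processorIds[i] < numProcessors[i]);
    linearId = linearId * numProcessors[i] + processorIds[i];
    totalProcessors *= numProcessors[i];
  }
  // Each processor starts at its own offset and strides over the whole grid.
  return {loop.lb + loop.step * linearId, loop.ub, loop.step * totalProcessors};
}
```

For processorIds = [blockIdx.x, threadIdx.x] = [2, 3] and numProcessors = [gridDim.x, blockDim.x] = [4, 8], the linear id is 2*8 + 3 = 19, so a loop from 0 to 100 with step 1 becomes a loop from 19 to 100 with step 32.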
OpFoldResult mlir::affine::materializeComputedBound | ( | OpBuilder & | b, |
Location | loc, | ||
AffineMap | boundMap, | ||
ArrayRef< std::pair< Value, std::optional< int64_t >>> | mapOperands | ||
) |
Materialize an already computed bound with Affine dialect ops.
* ValueBoundsOpInterface::computeBound computes bounds but does not create IR. It is dialect independent.
* materializeComputedBound materializes computed bounds with Affine dialect ops.
* reifyIndexValueBound/reifyShapedValueDimBound are a combination of the two functions mentioned above.
Definition at line 34 of file ReifyValueBounds.cpp.
References canonicalizeMapAndOperands(), mlir::OpBuilder::create(), mlir::Builder::getIndexAttr(), mlir::AffineMap::getNumDims(), mlir::AffineMap::getResult(), mlir::AffineMap::getSingleConstantResult(), mlir::Value::getType(), mlir::Type::isIndex(), and mlir::AffineMap::isSingleConstant().
Referenced by makeIndependent(), and reifyValueBound().
LogicalResult mlir::affine::mergeOffsetsSizesAndStrides | ( | OpBuilder & | builder, |
Location | loc, | ||
ArrayRef< OpFoldResult > | producerOffsets, | ||
ArrayRef< OpFoldResult > | producerSizes, | ||
ArrayRef< OpFoldResult > | producerStrides, | ||
const llvm::SmallBitVector & | droppedProducerDims, | ||
ArrayRef< OpFoldResult > | consumerOffsets, | ||
ArrayRef< OpFoldResult > | consumerSizes, | ||
ArrayRef< OpFoldResult > | consumerStrides, | ||
SmallVector< OpFoldResult > & | combinedOffsets, | ||
SmallVector< OpFoldResult > & | combinedSizes, | ||
SmallVector< OpFoldResult > & | combinedStrides | ||
) |
Fills the combinedOffsets, combinedSizes and combinedStrides to use when combining a producer slice into a consumer slice.
This function performs the following computation:
Definition at line 17 of file ViewLikeInterfaceUtils.cpp.
References mlir::bindSymbols(), mlir::Builder::getContext(), and makeComposedFoldedAffineApply().
Referenced by mergeOffsetsSizesAndStrides().
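Combining a producer slice with a consumer slice amounts to composing two strided views. The sketch below derives the combined values from that composition: element k of the consumer sits at consumerOffset + k * consumerStride in the producer's result, which is producerOffset + (consumerOffset + k * consumerStride) * producerStride in the original source. The formulas, the Slice struct and mergeSlicesModel are assumptions of this sketch on plain integers rather than a transcription of mergeOffsetsSizesAndStrides(); dropped producer dimensions are assumed to keep the producer's own values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Slice {
  std::vector<int64_t> offsets, sizes, strides;
};

// Standalone integer model, not the MLIR API. droppedProducerDims[i] is true
// when producer dimension i was rank-reduced away and has no consumer
// counterpart.
Slice mergeSlicesModel(const Slice &producer,
                       const std::vector<bool> &droppedProducerDims,
                       const Slice &consumer) {
  assert(droppedProducerDims.size() == producer.offsets.size());
  Slice combined;
  std::size_t consumerPos = 0;
  for (std::size_t i = 0; i < producer.offsets.size(); ++i) {
    if (droppedProducerDims[i]) {
      combined.offsets.push_back(producer.offsets[i]);
      combined.sizes.push_back(producer.sizes[i]);
      combined.strides.push_back(producer.strides[i]);
      continue;
    }
    assert(consumerPos < consumer.offsets.size());
    combined.offsets.push_back(producer.offsets[i] +
                               consumer.offsets[consumerPos] *
                                   producer.strides[i]);
    combined.sizes.push_back(consumer.sizes[consumerPos]);
    combined.strides.push_back(producer.strides[i] *
                               consumer.strides[consumerPos]);
    ++consumerPos;
  }
  return combined;
}
```

With a producer dimension (offset 2, stride 3) and a consumer dimension (offset 4, size 5, stride 2), the combined slice has offset 2 + 4*3 = 14, size 5 and stride 6.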
LogicalResult mlir::affine::mergeOffsetsSizesAndStrides | ( | OpBuilder & | builder, |
Location | loc, | ||
OffsetSizeAndStrideOpInterface | producer, | ||
OffsetSizeAndStrideOpInterface | consumer, | ||
const llvm::SmallBitVector & | droppedProducerDims, | ||
SmallVector< OpFoldResult > & | combinedOffsets, | ||
SmallVector< OpFoldResult > & | combinedSizes, | ||
SmallVector< OpFoldResult > & | combinedStrides | ||
) |
Fills the combinedOffsets, combinedSizes and combinedStrides to use when combining a producer slice op into a consumer slice op.
Definition at line 62 of file ViewLikeInterfaceUtils.cpp.
References mergeOffsetsSizesAndStrides().
|
inline |
Returns true if the provided DependenceResult corresponds to the absence of a dependence.
Definition at line 185 of file AffineAnalysis.h.
References mlir::affine::DependenceResult::NoDependence, and mlir::affine::DependenceResult::value.
Referenced by mayHaveEffect().
LogicalResult mlir::affine::normalizeAffineFor | ( | AffineForOp | op, |
bool | promoteSingleIter = false | ||
) |
Normalize an affine.for op.
An affine.for op is normalized by converting the lower bound to zero and loop step to one. The upper bound is set to the trip count of the loop. Original loops must have a lower bound with only a single result. There is no such restriction on upper bounds. Returns success if the loop has been normalized (or is already in the normal form). If promoteSingleIter is true, the loop is simply promoted if it has a single iteration.
Definition at line 556 of file Utils.cpp.
References mlir::affine::AffineValueMap::canonicalize(), mlir::AffineExpr::ceilDiv(), mlir::AffineMap::compose(), mlir::OpBuilder::create(), mlir::affine::AffineValueMap::difference(), mlir::AffineMap::get(), mlir::Builder::getAffineDimExpr(), mlir::affine::AffineValueMap::getAffineMap(), mlir::Builder::getConstantAffineMap(), mlir::AffineMap::getNumDims(), mlir::affine::AffineValueMap::getNumResults(), mlir::AffineMap::getNumSymbols(), mlir::affine::AffineValueMap::getOperands(), promoteIfSingleIteration(), and mlir::OpBuilder::setInsertionPointToStart().
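For constant bounds, the normalization performed by normalizeAffineFor() reduces to simple arithmetic: the new loop runs from 0 to the trip count with step 1, and the original induction variable is recovered as lb + iv * step. The sketch below is a standalone integer model with hypothetical names (Loop, normalizeLoopModel, originalIv); the real transformation operates on affine maps and operands.

```cpp
#include <cassert>
#include <cstdint>

struct Loop {
  int64_t lb, ub, step;
};

// Returns the normalized loop: lower bound 0, step 1, upper bound equal to
// the trip count ceil((ub - lb) / step), or 0 for an empty loop.
Loop normalizeLoopModel(const Loop &loop) {
  assert(loop.step > 0 && "sketch assumes a positive step");
  const int64_t span = loop.ub - loop.lb;
  const int64_t tripCount = span > 0 ? (span + loop.step - 1) / loop.step : 0;
  return {0, tripCount, 1};
}

// Maps an induction value of the normalized loop back to the original one.
int64_t originalIv(const Loop &loop, int64_t normalizedIv) {
  return loop.lb + normalizedIv * loop.step;
}
```

A loop from 3 to 20 with step 4 visits 3, 7, 11, 15 and 19, so it normalizes to a loop from 0 to 5, and normalized iteration 4 maps back to 19.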
void mlir::affine::normalizeAffineParallel | ( | AffineParallelOp | op | ) |
Normalize an affine.parallel op so that lower bounds are 0 and steps are 1.
As currently implemented, this transformation cannot fail and will return early if the op is already in a normalized form.
Definition at line 492 of file Utils.cpp.
References mlir::OpBuilder::atBlockBegin(), mlir::AffineExpr::ceilDiv(), mlir::affine::AffineValueMap::difference(), mlir::AffineMap::get(), mlir::affine::AffineValueMap::getNumDims(), mlir::AffineMap::getNumDims(), mlir::affine::AffineValueMap::getNumSymbols(), mlir::AffineMap::getNumSymbols(), mlir::affine::AffineValueMap::getOperands(), mlir::affine::AffineValueMap::getResult(), mlir::AffineMap::getResult(), mlir::AffineMap::getResults(), and mlir::Value::replaceAllUsesExcept().
LogicalResult mlir::affine::normalizeMemRef | ( | memref::AllocOp * | op | ) |
Rewrites the memref defined by this alloc op to have an identity layout map and updates all its indexing uses.
Returns failure if any of its uses escape (while leaving the IR in a valid state).
Definition at line 1786 of file Utils.cpp.
References mlir::OpBuilder::create(), createNewDynamicSizes(), mlir::Operation::erase(), getTileSizePos(), mlir::Value::getType(), mlir::Value::getUsers(), normalizeMemRefType(), replaceAllMemRefUsesWith(), and mlir::Value::replaceAllUsesWith().
MemRefType mlir::affine::normalizeMemRefType | ( | MemRefType | memrefType | ) |
Normalizes memrefType so that the affine layout map of the memref is transformed to an identity map with a new shape being computed for the normalized memref type and returns it.
The old memref type is simply returned if the normalization failed.
Definition at line 1843 of file Utils.cpp.
References mlir::affine::FlatAffineValueConstraints::addBound(), mlir::FlatLinearConstraints::composeMatchingMap(), mlir::get(), mlir::presburger::IntegerRelation::getConstantBound64(), mlir::AffineMap::getMultiDimIdentityMap(), mlir::presburger::IntegerRelation::getNumLocalVars(), mlir::AffineMap::getNumResults(), mlir::AffineMap::getNumSymbols(), mlir::presburger::IntegerRelation::getNumVars(), getTileSizePos(), isNormalizedMemRefDynamicDim(), mlir::FlatLinearValueConstraints::projectOut(), mlir::MemRefType::Builder::setLayout(), and mlir::MemRefType::Builder::setShape().
Referenced by normalizeMemRef().
int64_t mlir::affine::numEnclosingInvariantLoops | ( | OpOperand & | operand | ) |
Count the number of loops surrounding operand such that operand could be hoisted above.
Stop counting at the first loop over which the operand cannot be hoisted. This counts any LoopLikeOpInterface, not just affine.for.
Definition at line 2787 of file LoopUtils.cpp.
References mlir::IROperand< DerivedT, IRValueT >::get(), mlir::detail::IROperandBase::getOwner(), and mlir::Operation::getParentOfType().
ParseResult mlir::affine::parseDimAndSymbolList | ( | OpAsmParser & | parser, |
SmallVectorImpl< Value > & | operands, | ||
unsigned & | numDims | ||
) |
Parses dimension and symbol list and returns true if parsing failed.
numDims is set to the number of dimensions in the list parsed.
Definition at line 481 of file AffineOps.cpp.
References mlir::AsmParser::getBuilder(), mlir::Builder::getIndexType(), mlir::AsmParser::OptionalSquare, mlir::AsmParser::Paren, mlir::OpAsmParser::parseOperandList(), and mlir::OpAsmParser::resolveOperands().
Referenced by parseBound().
unsigned mlir::affine::permuteLoops | ( | ArrayRef< AffineForOp > | inputNest, |
ArrayRef< unsigned > | permMap | ||
) |
Performs a loop permutation on a perfectly nested loop nest inputNest (where the contained loops appear from outer to inner) as specified by the permutation permMap: loop 'i' in inputNest is mapped to location 'permMap[i]', where positions 0, 1, ... are from the outermost position to inner. Returns the position in inputNest of the AffineForOp that becomes the new outermost loop of this nest. This method always succeeds and asserts on invalid input or specifications.
Definition at line 1387 of file LoopUtils.cpp.
References mlir::Block::begin(), mlir::Block::end(), mlir::detail::enumerate(), mlir::Block::getOperations(), and isPerfectlyNested().
Referenced by sinkSequentialLoops().
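The mapping convention used by permuteLoops() (loop i of the input nest moves to position permMap[i], and the return value is the input position of the loop that ends up outermost) can be checked with a small model. The sketch uses loop names instead of AffineForOps; permuteLoopsModel and its output parameter are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Standalone model, not the MLIR API. Fills permutedNest with the loops in
// their new outermost-to-innermost order and returns the position in
// inputNest of the loop that becomes outermost.
unsigned permuteLoopsModel(const std::vector<std::string> &inputNest,
                           const std::vector<unsigned> &permMap,
                           std::vector<std::string> &permutedNest) {
  assert(inputNest.size() == permMap.size() && "one target position per loop");
  permutedNest.assign(inputNest.size(), std::string());
  unsigned newOutermost = 0;
  for (unsigned i = 0; i < permMap.size(); ++i) {
    assert(permMap[i] < inputNest.size() && "target position out of range");
    permutedNest[permMap[i]] = inputNest[i];
    if (permMap[i] == 0)
      newOutermost = i;
  }
  return newOutermost;
}
```

For the nest (i, j, k) and permMap (2, 0, 1), loop i moves innermost, j becomes outermost and k sits in the middle, so the new order is (j, k, i) and the function returns 1.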
void mlir::affine::populateAffineExpandIndexOpsAsAffinePatterns | ( | RewritePatternSet & | patterns | ) |
Populate patterns that expand affine index operations into their equivalent affine.apply representations.
Definition at line 89 of file AffineExpandIndexOpsAsAffine.cpp.
References mlir::patterns.
void mlir::affine::populateAffineExpandIndexOpsPatterns | ( | RewritePatternSet & | patterns | ) |
Populate patterns that expand affine index operations into more fundamental operations (not necessarily restricted to Affine dialect).
Definition at line 208 of file AffineExpandIndexOps.cpp.
References mlir::patterns.
LogicalResult mlir::affine::promoteIfSingleIteration | ( | AffineForOp | forOp | ) |
Promotes the loop body of an AffineForOp to its containing block if the loop was known to have a single iteration.
Definition at line 118 of file LoopUtils.cpp.
References mlir::OpBuilder::create(), getConstantTripCount(), mlir::Builder::getDimIdentityMap(), mlir::Operation::replaceAllUsesWith(), and replaceIterArgsAndYieldResults().
Referenced by fuseLoops(), generateCleanupLoopForUnroll(), generateShiftedLoop(), loopUnrollByFactor(), loopUnrollFull(), loopUnrollJamByFactor(), and normalizeAffineFor().
void mlir::affine::promoteSingleIterationLoops | ( | func::FuncOp | f | ) |
Promotes all single-iteration AffineForOps in the function, i.e., moves their bodies into the containing block.
void mlir::affine::registerTransformDialectExtension | ( | DialectRegistry & | registry | ) |
Definition at line 178 of file AffineTransformOps.cpp.
References mlir::DialectRegistry::addExtensions().
Referenced by mlir::registerAllExtensions().
void mlir::affine::registerValueBoundsOpInterfaceExternalModels | ( | DialectRegistry & | registry | ) |
Definition at line 97 of file ValueBoundsOpInterfaceImpl.cpp.
References mlir::DialectRegistry::addExtension().
Referenced by mlir::linalg::registerAllDialectInterfaceImplementations(), and mlir::registerAllDialects().
FailureOr< OpFoldResult > mlir::affine::reifyIndexValueBound | ( | OpBuilder & | b, |
Location | loc, | ||
presburger::BoundType | type, | ||
Value | value, | ||
ValueBoundsConstraintSet::StopConditionFn | stopCondition = nullptr , |
||
bool | closedUB = false |
||
) |
Reify a bound for the given index-typed value in terms of SSA values for which stopCondition
is met.
If no stop condition is specified, reify in terms of the operands of the owner op.
By default, lower/equal bounds are closed and upper bounds are open. If closedUB
is set to "true", upper bounds are also closed.
Example:
  %0 = arith.addi %a, %b : index
  %1 = arith.addi %0, %c : index
If stopCondition evaluates to "true" for %0 and %c, "%0 + %c" is an EQ bound for %1.
If stopCondition evaluates to "true" for %a, %b and %c, "%a + %b + %c" is an EQ bound for %1.
Definition at line 100 of file ReifyValueBounds.cpp.
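The role of the stop condition in the example for reifyIndexValueBound can be mimicked with a toy model. This sketch is an assumption-laden illustration, not the ValueBoundsConstraintSet algorithm: values are either leaves or additions, the bound is rendered as a string, and `ToyValue`, `StopFn`, `expandTerm` and `reifyEqBound` are hypothetical names.

```cpp
#include <functional>
#include <memory>
#include <optional>
#include <string>
#include <utility>

// Toy SSA value: either a leaf or the sum of two other values.
struct ToyValue {
  std::string name;
  std::shared_ptr<ToyValue> lhs, rhs; // both non-null only for an addition
};
using ToyValuePtr = std::shared_ptr<ToyValue>;
using StopFn = std::function<bool(const ToyValue &)>;

ToyValuePtr leaf(const std::string &name) {
  return std::make_shared<ToyValue>(ToyValue{name, nullptr, nullptr});
}

ToyValuePtr add(const std::string &name, ToyValuePtr lhs, ToyValuePtr rhs) {
  return std::make_shared<ToyValue>(
      ToyValue{name, std::move(lhs), std::move(rhs)});
}

// Expands a term until it is accepted by `stop`; fails on a rejected leaf.
std::optional<std::string> expandTerm(const ToyValue &v, const StopFn &stop) {
  if (stop(v))
    return v.name;
  if (!v.lhs || !v.rhs)
    return std::nullopt;
  std::optional<std::string> l = expandTerm(*v.lhs, stop);
  std::optional<std::string> r = expandTerm(*v.rhs, stop);
  if (!l || !r)
    return std::nullopt;
  return *l + " + " + *r;
}

// EQ bound for `v`, expressed only in values accepted by `stop`.
std::optional<std::string> reifyEqBound(const ToyValue &v, const StopFn &stop) {
  if (!v.lhs || !v.rhs)
    return std::nullopt;
  std::optional<std::string> l = expandTerm(*v.lhs, stop);
  std::optional<std::string> r = expandTerm(*v.rhs, stop);
  if (!l || !r)
    return std::nullopt;
  return *l + " + " + *r;
}
```

Stopping at %0 and %c yields a bound in terms of those two values, while stopping only at the leaves expands %0 into its operands as well.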
References reifyValueBound().
FailureOr< OpFoldResult > mlir::affine::reifyShapedValueDimBound | ( | OpBuilder & | b, |
Location | loc, | ||
presburger::BoundType | type, | ||
Value | value, | ||
int64_t | dim, | ||
ValueBoundsConstraintSet::StopConditionFn | stopCondition = nullptr , |
||
bool | closedUB = false |
||
) |
Reify a bound for the specified dimension of the given shaped value in terms of SSA values for which stopCondition
is met.
If no stop condition is specified, reify in terms of the operands of the owner op.
By default, lower/equal bounds are closed and upper bounds are open. If closedUB
is set to "true", upper bounds are also closed.
Definition at line 82 of file ReifyValueBounds.cpp.
References reifyValueBound().
FailureOr< OpFoldResult > mlir::affine::reifyValueBound | ( | OpBuilder & | b, |
Location | loc, | ||
presburger::BoundType | type, | ||
const ValueBoundsConstraintSet::Variable & | var, | ||
ValueBoundsConstraintSet::StopConditionFn | stopCondition, | ||
bool | closedUB = false |
||
) |
Reify a bound for the given variable in terms of SSA values for which stopCondition
is met.
By default, lower/equal bounds are closed and upper bounds are open. If closedUB
is set to "true", upper bounds are also closed.
Definition at line 19 of file ReifyValueBounds.cpp.
References mlir::ValueBoundsConstraintSet::computeBound(), and materializeComputedBound().
Referenced by reifyIndexValueBound(), and reifyShapedValueDimBound().
void mlir::affine::reorderOperandsByHoistability | ( | RewriterBase & | rewriter, |
AffineApplyOp | op | ||
) |
Helper function to rewrite op
's affine map and reorder its operands such that they are in increasing order of hoistability (i.e., the least hoistable operands come first in the operand list).
Definition at line 42 of file DecomposeAffineOps.cpp.
References canonicalizeMapAndOperands(), mlir::RewriterBase::finalizeOpModification(), mlir::AffineMap::get(), mlir::getAffineSymbolExpr(), mlir::AffineExpr::getContext(), mlir::AffineMap::getNumDims(), mlir::AffineMap::getResult(), mlir::AffineMap::replaceDimsAndSymbols(), mlir::simplifyAffineExpr(), and mlir::RewriterBase::startOpModification().
LogicalResult mlir::affine::replaceAllMemRefUsesWith | ( | Value | oldMemRef, |
Value | newMemRef, | ||
ArrayRef< Value > | extraIndices = {} , |
||
AffineMap | indexRemap = AffineMap() , |
||
ArrayRef< Value > | extraOperands = {} , |
||
ArrayRef< Value > | symbolOperands = {} , |
||
Operation * | domOpFilter = nullptr , |
||
Operation * | postDomOpFilter = nullptr , |
||
bool | allowNonDereferencingOps = false , |
||
bool | replaceInDeallocOp = false |
||
) |
Replaces all "dereferencing" uses of oldMemRef
with newMemRef
while optionally remapping the old memref's indices using the supplied affine map, indexRemap
.
The new memref could be of a different shape or rank. extraIndices
provides any additional access indices to be added to the start.
indexRemap
remaps indices of the old memref access to a new set of indices that are used to index the memref. Additional input operands to indexRemap can be optionally provided in extraOperands
, and they occupy the start of its input list. indexRemap
's dimensional inputs are expected to correspond to memref's indices, and its symbolic inputs if any should be provided in symbolOperands
.
domOpFilter
, if non-null, restricts the replacement to only those operations that are dominated by the former; similarly, postDomOpFilter
restricts replacement to only those operations that are postdominated by it.
'allowNonDereferencingOps', if set, allows replacement of non-dereferencing uses of a memref without any requirement for access index rewrites as long as the user operation has the MemRefsNormalizable trait. The default value of this flag is false.
'replaceInDeallocOp', if set, lets DeallocOp, a non-dereferencing user, also be a candidate for replacement. The default value of this flag is false.
Returns success if the replacement is performed and failure if it is not possible. Replacement is not possible whenever a memref is used as an operand in a non-dereferencing context and 'allowNonDereferencingOps' is false, except for deallocs on the memref, which are left untouched. See comments at function definition for an example.
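How the index-related parameters of replaceAllMemRefUsesWith combine can be sketched with integers in place of SSA values. This is a simplified model under stated assumptions: `Indices`, `IndexRemap` and `remapAccess` are hypothetical names, the affine map is replaced by a callable, and an empty callable is treated as the identity on the old indices.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

using Indices = std::vector<int64_t>;

// Hypothetical stand-in for an AffineMap: takes the input list and the
// symbol list and produces the remapped access indices.
using IndexRemap =
    std::function<Indices(const Indices &inputs, const Indices &symbols)>;

// Builds the access indices for the new memref:
//   extraIndices ++ indexRemap(extraOperands ++ oldIndices, symbolOperands)
Indices remapAccess(const Indices &oldIndices, const Indices &extraIndices,
                    const IndexRemap &indexRemap,
                    const Indices &extraOperands,
                    const Indices &symbolOperands) {
  // Extra operands occupy the start of the remap's input list.
  Indices inputs = extraOperands;
  inputs.insert(inputs.end(), oldIndices.begin(), oldIndices.end());

  // An empty remap is modelled as the identity on the old indices.
  Indices remapped =
      indexRemap ? indexRemap(inputs, symbolOperands) : oldIndices;

  // Extra indices are added to the start of the new access.
  Indices result = extraIndices;
  result.insert(result.end(), remapped.begin(), remapped.end());
  return result;
}
```

For instance, with a remap that subtracts a symbolic base from the first index, the old access [10, 3] with base 8 and one extra leading index 1 becomes [1, 2, 3].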
Definition at line 1366 of file Utils.cpp.
References mlir::AffineMap::getNumInputs(), mlir::AffineMap::getNumResults(), mlir::AffineMap::getNumSymbols(), mlir::Operation::getParentOfType(), mlir::Value::getType(), and mlir::Value::getUsers().
Referenced by createPrivateMemRef(), doubleBuffer(), generateCopy(), and normalizeMemRef().
LogicalResult mlir::affine::replaceAllMemRefUsesWith | ( | Value | oldMemRef, |
Value | newMemRef, | ||
Operation * | op, | ||
ArrayRef< Value > | extraIndices = {} , |
||
AffineMap | indexRemap = AffineMap() , |
||
ArrayRef< Value > | extraOperands = {} , |
||
ArrayRef< Value > | symbolOperands = {} , |
||
bool | allowNonDereferencingOps = false |
||
) |
Performs the same replacement as the other version above but only for the dereferencing uses of oldMemRef
in op
, except in cases where 'allowNonDereferencingOps' is set to true where we replace the non-dereferencing uses as well.
Definition at line 1182 of file Utils.cpp.
References canonicalizeMapAndOperands(), mlir::OpBuilder::create(), mlir::detail::enumerate(), fullyComposeAffineMapAndOperands(), mlir::AffineMap::get(), mlir::Operation::getLoc(), mlir::Builder::getMultiDimIdentityMap(), mlir::AffineMap::getNumDims(), mlir::AffineMap::getNumInputs(), mlir::AffineMap::getNumResults(), mlir::AffineMap::getNumSymbols(), mlir::Operation::getOperands(), mlir::AffineMap::getResults(), mlir::Value::getType(), mlir::NamedAttribute::getValue(), isValidDim(), isValidSymbol(), mlir::Operation::operand_begin(), mlir::Operation::setOperand(), mlir::simplifyAffineMap(), and transformMemRefLoadWithReducedRank().
void mlir::affine::resolveIndicesIntoOpWithOffsetsAndStrides | ( | RewriterBase & | rewriter, |
Location | loc, | ||
ArrayRef< OpFoldResult > | mixedSourceOffsets, | ||
ArrayRef< OpFoldResult > | mixedSourceStrides, | ||
const llvm::SmallBitVector & | rankReducedDims, | ||
ArrayRef< OpFoldResult > | consumerIndices, | ||
SmallVectorImpl< Value > & | resolvedIndices | ||
) |
Given the 'consumerIndices' of a load/store operation operating on an op with offsets and strides, return the combined indices.
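For example, a memref.load that reads from the result of a memref.subview could be folded into a load from the source of the subview, with each index rewritten as the source offset plus the consumer index times the source stride.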
Definition at line 81 of file ViewLikeInterfaceUtils.cpp.
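The index arithmetic behind resolveIndicesIntoOpWithOffsetsAndStrides can be sketched with plain integers. This is an illustration under assumptions, not the MLIR code: `resolveIndices` is a hypothetical name, all offsets, strides and indices are constants, and a rank-reduced dimension is assumed to contribute a consumer index of zero.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Combines the offsets and strides of the source with the indices of the
// consumer: resolved[d] = offset[d] + index * stride[d].
// Dimensions flagged in rankReducedDims have no consumer index; they are
// modelled with index 0.
std::vector<int64_t> resolveIndices(const std::vector<int64_t> &sourceOffsets,
                                    const std::vector<int64_t> &sourceStrides,
                                    const std::vector<bool> &rankReducedDims,
                                    const std::vector<int64_t> &consumerIndices) {
  std::vector<int64_t> resolved;
  std::size_t next = 0; // next unused consumer index
  for (std::size_t d = 0; d < sourceOffsets.size(); ++d) {
    int64_t index = rankReducedDims[d] ? 0 : consumerIndices[next++];
    resolved.push_back(sourceOffsets[d] + index * sourceStrides[d]);
  }
  return resolved;
}
```

With offsets (2, 5), strides (3, 1) and consumer indices (1, 4), the combined indices are (5, 9).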
References mlir::bindSymbols(), mlir::AffineMap::get(), mlir::Builder::getContext(), mlir::Builder::getIndexAttr(), mlir::getValueOrCreateConstantIndexOp(), and makeComposedFoldedAffineApply().
Referenced by InsertSliceOfInsertSliceFolder< OpTy >::matchAndRewrite(), and resolveIndicesIntoOpWithOffsetsAndStrides().
|
inline |
Definition at line 82 of file ViewLikeInterfaceUtils.h.
References mlir::getAsOpFoldResult(), and resolveIndicesIntoOpWithOffsetsAndStrides().
void mlir::affine::resolveSizesIntoOpWithSizes | ( | ArrayRef< OpFoldResult > | sourceSizes, |
ArrayRef< OpFoldResult > | destSizes, | ||
const llvm::SmallBitVector & | rankReducedSourceDims, | ||
SmallVectorImpl< OpFoldResult > & | resolvedSizes | ||
) |
Given sourceSizes
, destSizes
and information about which dimensions are dropped by the source: rankReducedSourceDims
, compute the resolved sizes that correspond to dest_op(source_op).
In practice, this amounts to filtering by rankReducedSourceDims
and taking from sourceSizes
if a dimension is dropped, otherwise taking from destSizes
.
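The filtering rule for resolveSizesIntoOpWithSizes can be sketched with constant sizes. `resolveSizes` is a hypothetical name, and the sketch assumes that destSizes holds exactly one entry per dimension that is not dropped.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// For each source dimension: take the source size if the dimension is
// dropped, otherwise take the next size from destSizes.
std::vector<int64_t> resolveSizes(const std::vector<int64_t> &sourceSizes,
                                  const std::vector<int64_t> &destSizes,
                                  const std::vector<bool> &rankReducedSourceDims) {
  std::vector<int64_t> resolved;
  std::size_t destDim = 0; // next unused entry of destSizes
  for (std::size_t srcDim = 0; srcDim < sourceSizes.size(); ++srcDim) {
    if (rankReducedSourceDims[srcDim])
      resolved.push_back(sourceSizes[srcDim]);
    else
      resolved.push_back(destSizes[destDim++]);
  }
  return resolved;
}
```

With source sizes (1, 4, 8), destination sizes (16, 32) and the first source dimension dropped, the resolved sizes are (1, 16, 32).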
Definition at line 113 of file ViewLikeInterfaceUtils.cpp.
Referenced by InsertSliceOfInsertSliceFolder< OpTy >::matchAndRewrite().
LogicalResult mlir::affine::separateFullTiles | ( | MutableArrayRef< AffineForOp > | nest, |
SmallVectorImpl< AffineForOp > * | fullTileNest = nullptr |
||
) |
Separates full tiles from partial tiles for a perfect nest nest
by generating a conditional guard that selects between the full tile version and the partial tile version using an AffineIfOp.
The original loop nest is replaced by this guarded two version form.
  affine.if (cond)
    // full_tile
  else
    // partial tile
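The guarded two-version form can be sketched on a one-dimensional tiled loop. This is a simplified model rather than the transformation itself: `tiledSum` and `TileCounts` are hypothetical names, the guard is evaluated once per tile, and the tile size is assumed to be positive.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Counts how many tiles took each version of the loop body.
struct TileCounts {
  std::size_t full = 0;
  std::size_t partial = 0;
};

// Sums `data` tile by tile. Assumes tileSize > 0.
long tiledSum(const std::vector<int> &data, std::size_t tileSize,
              TileCounts &counts) {
  long sum = 0;
  const std::size_t n = data.size();
  for (std::size_t start = 0; start < n; start += tileSize) {
    if (start + tileSize <= n) {
      // Full tile version: constant trip count, no min() in the bound.
      for (std::size_t i = 0; i < tileSize; ++i)
        sum += data[start + i];
      ++counts.full;
    } else {
      // Partial tile version: the upper bound is clamped to the data size.
      const std::size_t end = std::min(start + tileSize, n);
      for (std::size_t i = start; i < end; ++i)
        sum += data[i];
      ++counts.partial;
    }
  }
  return sum;
}
```

The benefit of the split is that the full tile version has a fixed trip count, which tends to make it easier to unroll or vectorize, while the partial version keeps the general bounds.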
Definition at line 2685 of file LoopUtils.cpp.
References createFullTiles(), createSeparationCondition(), mlir::Block::end(), and mlir::Block::getOperations().
FailureOr< AffineValueMap > mlir::affine::simplifyConstrainedMinMaxOp | ( | Operation * | op, |
FlatAffineValueConstraints | constraints | ||
) |
Try to simplify the given affine.min or affine.max op to an affine map with a single result and operands, taking into account the specified constraint set.
Return failure if no simplified version could be found.
Definition at line 2066 of file Utils.cpp.
References mlir::affine::FlatAffineValueConstraints::addBound(), addConstToResults(), mlir::presburger::IntegerRelation::addInequality(), alignAndAddBound(), mlir::FlatLinearValueConstraints::appendDimVar(), canonicalizeMapAndOperands(), mlir::Builder::getAffineConstantExpr(), mlir::Builder::getAffineDimExpr(), mlir::Builder::getAffineSymbolExpr(), mlir::presburger::IntegerRelation::getConstantBound64(), mlir::getConstantIntValue(), mlir::Operation::getContext(), mlir::FlatLinearValueConstraints::getMaybeValues(), mlir::presburger::IntegerRelation::getNumCols(), mlir::presburger::IntegerRelation::getNumDimAndSymbolVars(), mlir::AffineMap::getNumDims(), mlir::AffineMap::getNumResults(), mlir::AffineMap::getNumSymbols(), mlir::Operation::getOperands(), mlir::FlatLinearConstraints::getSliceBounds(), mlir::AffineMap::getSubMap(), mlir::presburger::IntegerRelation::isEmpty(), mlir::AffineMap::replace(), mlir::AffineMap::shiftDims(), and unpackOptionalValues().
Referenced by canonicalizeMinMaxOp().
IntegerSet mlir::affine::simplifyIntegerSet | ( | IntegerSet | set | ) |
Simplify the integer set by simplifying the underlying affine expressions by flattening and some simple inference.
Also, drop any duplicate constraints. Returns the simplified integer set. This method runs in time linear in the number of constraints.
Definition at line 1980 of file Utils.cpp.
References mlir::FlatLinearConstraints::getAsIntegerSet(), mlir::IntegerSet::getContext(), mlir::IntegerSet::getEmptySet(), mlir::IntegerSet::getNumDims(), mlir::IntegerSet::getNumSymbols(), mlir::presburger::IntegerRelation::isEmpty(), and mlir::presburger::IntegerRelation::removeTrivialRedundancy().
AffineForOp mlir::affine::sinkSequentialLoops | ( | AffineForOp | forOp | ) |
Definition at line 1457 of file LoopUtils.cpp.
References checkLoopInterchangeDependences(), getDependenceComponents(), getPerfectlyNestedLoops(), mlir::affine::matcher::isParallelLoop(), mlir::affine::DependenceComponent::lb, permuteLoops(), and mlir::affine::DependenceComponent::ub.
Referenced by sinkSequentialLoops().
AffineExpr mlir::affine::substWithMin | ( | AffineExpr | e, |
AffineExpr | dim, | ||
AffineExpr | min, | ||
AffineExpr | max, | ||
bool | positivePath = true |
||
) |
Traverse e
and return an AffineExpr where all occurrences of dim
have been replaced by either:
min
if positivePath
is true when we reach an occurrence of dim
max
if positivePath
is false when we reach an occurrence of dim
positivePath
is negated each time we hit a multiplicative or divisive binary op with a constant negative coefficient.
Definition at line 465 of file Utils.cpp.
References mlir::Add, mlir::getAffineBinaryOpExpr(), max(), and min().
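The sign-tracking traversal of substWithMin can be sketched on a tiny expression tree. This is an assumption-based illustration: it evaluates the substituted expression to a number instead of building a new AffineExpr, it supports only constants, one dimension, addition and multiplication, and `Expr`, `constant`, `dim`, `add`, `mul` and `evalWithSubst` are hypothetical names.

```cpp
#include <cstdint>
#include <memory>
#include <utility>

struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

// Minimal expression tree: constants, a single dimension, Add and Mul.
struct Expr {
  enum Kind { Const, Dim, Add, Mul } kind;
  int64_t value;    // meaningful for Const only
  ExprPtr lhs, rhs; // meaningful for Add and Mul only
};

ExprPtr constant(int64_t v) {
  return std::make_shared<Expr>(Expr{Expr::Const, v, nullptr, nullptr});
}
ExprPtr dim() {
  return std::make_shared<Expr>(Expr{Expr::Dim, 0, nullptr, nullptr});
}
ExprPtr add(ExprPtr a, ExprPtr b) {
  return std::make_shared<Expr>(Expr{Expr::Add, 0, std::move(a), std::move(b)});
}
ExprPtr mul(ExprPtr a, ExprPtr b) {
  return std::make_shared<Expr>(Expr{Expr::Mul, 0, std::move(a), std::move(b)});
}

// Replaces the dimension by `min` on a positive path and by `max` on a
// negative path, then evaluates. The path flips at each multiplication by a
// negative constant.
int64_t evalWithSubst(const ExprPtr &e, int64_t min, int64_t max,
                      bool positivePath = true) {
  switch (e->kind) {
  case Expr::Const:
    return e->value;
  case Expr::Dim:
    return positivePath ? min : max;
  case Expr::Add:
    return evalWithSubst(e->lhs, min, max, positivePath) +
           evalWithSubst(e->rhs, min, max, positivePath);
  case Expr::Mul: {
    auto isNegativeConst = [](const ExprPtr &x) {
      return x->kind == Expr::Const && x->value < 0;
    };
    bool path = (isNegativeConst(e->lhs) || isNegativeConst(e->rhs))
                    ? !positivePath
                    : positivePath;
    return evalWithSubst(e->lhs, min, max, path) *
           evalWithSubst(e->rhs, min, max, path);
  }
  }
  return 0; // not reached for well-formed expressions
}
```

For 10 + (-2) * d with d between 1 and 4, the negative coefficient flips the path, so d is replaced by 4 and the result is the smallest value the expression can take on that range.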
SmallVector< AffineForOp, 8 > mlir::affine::tile | ( | ArrayRef< AffineForOp > | forOps, |
ArrayRef< uint64_t > | sizes, | ||
AffineForOp | target | ||
) |
Performs tiling (with interchange) by strip-mining the forOps
by sizes
and sinking them, in their order of occurrence in forOps
, under target
.
Returns the new AffineForOps, one per forOps
, nested immediately under target
.
Definition at line 1600 of file LoopUtils.cpp.
References tile().
SmallVector< SmallVector< AffineForOp, 8 >, 8 > mlir::affine::tile | ( | ArrayRef< AffineForOp > | forOps, |
ArrayRef< uint64_t > | sizes, | ||
ArrayRef< AffineForOp > | targets | ||
) |
Performs tiling of imperfectly nested loops (with interchange) by strip-mining the forOps
by sizes
and sinking them, in their order of occurrence in forOps
, under each of the targets
.
Returns the new AffineForOps, one for each (forOps
, targets
) pair, nested immediately under each of targets
.
Definition at line 1588 of file LoopUtils.cpp.
References stripmineSink().
Referenced by tile(), and mlir::tilePerfectlyNested().
LogicalResult mlir::affine::tilePerfectlyNested | ( | MutableArrayRef< AffineForOp > | input, |
ArrayRef< unsigned > | tileSizes, | ||
SmallVectorImpl< AffineForOp > * | tiledNest = nullptr |
||
) |
Tiles the specified band of perfectly nested loops creating tile-space loops and intra-tile loops.
A band is a contiguous set of loops. This utility doesn't check for the validity of tiling itself, but just performs it.
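The shape of the loops produced by tiling a two-deep band can be sketched in plain C++. This only illustrates the split into tile-space and intra-tile loops for a rectangular iteration space with constant, positive tile sizes; `tiledIterationOrder` and `Point` are hypothetical names.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

using Point = std::pair<int, int>;

// Visits the n x m iteration space in tiles of ti x tj and records the order.
// Assumes ti > 0 and tj > 0.
std::vector<Point> tiledIterationOrder(int n, int m, int ti, int tj) {
  std::vector<Point> order;
  // Tile-space loops: step through the origins of the tiles.
  for (int ii = 0; ii < n; ii += ti)
    for (int jj = 0; jj < m; jj += tj)
      // Intra-tile loops: cover one tile, clamped at the boundary.
      for (int i = ii; i < std::min(ii + ti, n); ++i)
        for (int j = jj; j < std::min(jj + tj, m); ++j)
          order.emplace_back(i, j);
  return order;
}
```

Every point of the original space is still visited exactly once; only the order changes, which is why the validity of tiling depends on the dependences of the loop body and has to be checked separately.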
Definition at line 772 of file LoopUtils.cpp.
References constructTiledIndexSetHyperRect(), constructTiledLoopNest(), extractForInductionVars(), and performPreTilingChecks().
LogicalResult mlir::affine::tilePerfectlyNestedParametric | ( | MutableArrayRef< AffineForOp > | input, |
ArrayRef< Value > | tileSizes, | ||
SmallVectorImpl< AffineForOp > * | tiledNest = nullptr |
||
) |
Tiles the specified band of perfectly nested loops creating tile-space loops and intra-tile loops, using SSA values as tiling parameters.
A band is a contiguous set of loops.
Definition at line 814 of file LoopUtils.cpp.
References constructParametricallyTiledIndexSetHyperRect(), constructTiledLoopNest(), extractForInductionVars(), and performPreTilingChecks().
LogicalResult mlir::affine::vectorizeAffineLoopNest | ( | std::vector< SmallVector< AffineForOp, 2 >> & | loops, |
const VectorizationStrategy & | strategy | ||
) |
External utility to vectorize affine loops from a single loop nest using an n-D vectorization strategy (see doc in VectorizationStrategy definition).
Loops are provided in a 2D vector container. The first dimension represents the nesting level relative to the loops to be vectorized. The second dimension contains the loops. This means that: a) every loop in 'loops[i]' must have a parent loop in 'loops[i-1]', b) a loop in 'loops[i]' may or may not have a child loop in 'loops[i+1]'.
For example, for the following loop nest:
  func @vec2d(%in0: memref<64x128x512xf32>, %in1: memref<64x128x128xf32>,
              %out0: memref<64x128x512xf32>,
              %out1: memref<64x128x128xf32>) {
    affine.for %i0 = 0 to 64 {
      affine.for %i1 = 0 to 128 {
        affine.for %i2 = 0 to 512 {
          %ld = affine.load %in0[%i0, %i1, %i2] : memref<64x128x512xf32>
          affine.store %ld, %out0[%i0, %i1, %i2] : memref<64x128x512xf32>
        }
        affine.for %i3 = 0 to 128 {
          %ld = affine.load %in1[%i0, %i1, %i3] : memref<64x128x128xf32>
          affine.store %ld, %out1[%i0, %i1, %i3] : memref<64x128x128xf32>
        }
      }
    }
    return
  }
loops = {{%i0}, {%i2, %i3}}, to vectorize the outermost and the two innermost loops;
loops = {{%i1}, {%i2, %i3}}, to vectorize the middle and the two innermost loops;
loops = {{%i2}}, to vectorize only the first innermost loop;
loops = {{%i3}}, to vectorize only the second innermost loop;
loops = {{%i1}}, to vectorize only the middle loop.
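Condition (a) on the 2D container for vectorizeAffineLoopNest can be sketched as a check over a parent map. This is a toy model with assumptions: loops are identified by name, "parent" is read as any enclosing loop (as the example with %i0 and %i2 suggests), the parent map is assumed to be acyclic, and `ParentMap`, `LoopLevels`, `isEnclosedBy` and `verifyNesting` are hypothetical names.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Maps each loop to its immediately enclosing loop ("" for a top-level loop).
using ParentMap = std::map<std::string, std::string>;
using LoopLevels = std::vector<std::vector<std::string>>;

// Returns true if `ancestor` encloses `loop`, directly or transitively.
bool isEnclosedBy(const ParentMap &parents, std::string loop,
                  const std::string &ancestor) {
  while (true) {
    auto it = parents.find(loop);
    if (it == parents.end() || it->second.empty())
      return false;
    if (it->second == ancestor)
      return true;
    loop = it->second;
  }
}

// Checks that every loop in loops[i] is enclosed by some loop in loops[i-1].
bool verifyNesting(const ParentMap &parents, const LoopLevels &loops) {
  for (std::size_t level = 1; level < loops.size(); ++level) {
    for (const std::string &loop : loops[level]) {
      bool enclosed = false;
      for (const std::string &outer : loops[level - 1])
        enclosed = enclosed || isEnclosedBy(parents, loop, outer);
      if (!enclosed)
        return false;
    }
  }
  return true;
}
```

For the nest above, {{i0}, {i2, i3}} and {{i1}, {i2, i3}} pass this check, whereas {{i2}, {i3}} does not, because the two innermost loops are siblings.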
Definition at line 1885 of file SuperVectorize.cpp.
References vectorizeLoopNest(), and verifyLoopNesting().
void mlir::affine::vectorizeAffineLoops | ( | Operation * | parentOp, |
llvm::DenseSet< Operation *, DenseMapInfo< Operation * >> & | loops, | ||
ArrayRef< int64_t > | vectorSizes, | ||
ArrayRef< int64_t > | fastestVaryingPattern, | ||
const ReductionLoopMap & | reductionLoops = ReductionLoopMap() |
||
) |
Vectorizes affine loops in 'loops' using the n-D vectorization factors in 'vectorSizes'.
By default, each vectorization factor is applied inner-to-outer to the loops of each loop nest. 'fastestVaryingPattern' can be optionally used to provide a different loop vectorization order. If reductionLoops
is not empty, the given reduction loops may be vectorized along the reduction dimension. TODO: Vectorizing reductions is supported only for 1-D vectorization.