MLIR 22.0.0git
TileUsingInterface.cpp File Reference

Macros

#define DEBUG_TYPE   "tile-using-interface"

Typedefs

using YieldTiledValuesFn
 Typedef for function that allows returning additional yielded values during yieldTiledValuesAndReplace.
using GenerateTiledBodyFn
 Typedef for function that implements the body of a tiled loop.

Functions

static SmallVector< int64_t > fillInterchangeVector (ArrayRef< int64_t > interchangeVector, size_t iterationDomainSize)
 Helper method to adjust the interchange vector to match the iteration domain.
static LogicalResult verifyOptions (RewriterBase &rewriter, Location loc, const scf::SCFTilingOptions &options)
 Verify the tile size options are set in a consistent manner.
static std::tuple< SmallVector< OpFoldResult >, SmallVector< OpFoldResult > > getUserTileSizesAndNumThreads (RewriterBase &rewriter, TilingInterface op, ArrayRef< Range > iterationDomain, const scf::SCFTilingOptions &options)
 Method to instantiate the tile sizes and/or number of threads specified by the user.
static LogicalResult checkTileSizes (TilingInterface op, scf::SCFTilingOptions::LoopType loopType, ReductionTilingStrategy reductionStrategy, ArrayRef< OpFoldResult > givenTileSizes, ArrayRef< OpFoldResult > numThreads)
 Checks if any of the tiled loops are not parallel.
static SetVector< unsigned > getSanitizedReductionDims (ArrayRef< OpFoldResult > givenTileSizes, const scf::SCFTilingOptions &options)
 Get the reduction dims that are tiled.
static bool tileDividesIterationDomain (Range loopRange)
 Check if stride evenly divides the trip count size - offset.
static OpFoldResult getBoundedTileSize (OpBuilder &b, Location loc, Range loopRange, OpFoldResult offset, OpFoldResult givenTileSize)
 Returns the bounded tile size given the current offset, loopRange and tileSize, i.e., min(tileSize, range.end() - offset).
static bool canOmitTileOffsetInBoundsCheck (OpFoldResult givenTileSize, OpFoldResult numThreads, OpFoldResult iterationSize)
 Returns true if the maximum tile offset tileSize * (numThreads - 1) is less than iterationSize.
static std::tuple< SmallVector< OpFoldResult >, SmallVector< OpFoldResult > > getTileOffsetAndSizes (RewriterBase &rewriter, Location loc, ValueRange ivs, ArrayRef< Range > iterationDomain, ArrayRef< OpFoldResult > givenTileSizes)
 Compute the OpFoldResults that represent the multi-dimensional offsets and sizes of the tile of the iteration space that the innermost loop body of the generated tiled loops corresponds to.
static std::tuple< SmallVector< OpFoldResult >, SmallVector< OpFoldResult >, SmallVector< OpFoldResult > > getLoopBounds (RewriterBase &rewriter, Location loc, ArrayRef< Range > loopRanges, ArrayRef< OpFoldResult > givenTileSizes)
 Function to return the bounds of the loops to be generated.
static Operation * cloneOpAndUpdateDestinationArgs (RewriterBase &rewriter, Operation *op, ValueRange newDestArgs)
 Clones the operation and updates the destination if the operation implements the DestinationStyleOpInterface.
static FailureOr< SmallVector< LoopLikeOpInterface > > generateLoopNestUsingForOp (RewriterBase &rewriter, Location loc, ArrayRef< Range > loopRanges, ArrayRef< OpFoldResult > givenTileSizes, ValueRange outerDestinationTensors, GenerateTiledBodyFn tiledBodyFn)
 Generate the tile-loop nest using scf.for operation.
static std::tuple< SmallVector< OpFoldResult >, SmallVector< OpFoldResult > > getTileOffsetAndSizesWithForAllOp (RewriterBase &rewriter, Location loc, ValueRange ivs, ArrayRef< Range > iterationDomain, ArrayRef< OpFoldResult > givenTileSizes, ArrayRef< OpFoldResult > numThreads)
 Compute the OpFoldResults that represent the multi-dimensional offsets and sizes of the tile of the iteration space that the innermost loop body of the generated tiled loops corresponds to when tiling using forall op.
static FailureOr< SmallVector< LoopLikeOpInterface > > generateLoopNestUsingForallOp (RewriterBase &rewriter, Location loc, ArrayRef< Range > loopRanges, ArrayRef< OpFoldResult > givenTileSizes, ArrayRef< OpFoldResult > numThreads, ArrayRef< Attribute > mappingVector, ValueRange outerDestinationTensors, GenerateTiledBodyFn tiledBodyFn)
 Generate the tile-loop nest using scf.forall operation.
static FailureOr< SmallVector< LoopLikeOpInterface > > generateLoopNestUsingCustomOp (RewriterBase &rewriter, Location loc, ArrayRef< Range > loopRanges, ArrayRef< OpFoldResult > givenTileSizes, ValueRange outerDestinationTensors, const scf::SCFTilingOptions::GenerateLoopHeaderFn &generateLoopHeaderFn, const scf::SCFTilingOptions::GenerateLoopTerminatorFn &generateLoopTerminatorFn, GenerateTiledBodyFn tiledBodyFn)
 Generate the tile-loop nest using custom loop operation.
static FailureOr< SmallVector< LoopLikeOpInterface > > generateLoopNest (RewriterBase &rewriter, Location loc, const scf::SCFTilingOptions &options, ArrayRef< Range > loopRanges, ArrayRef< OpFoldResult > givenTileSizes, ArrayRef< OpFoldResult > numThreads, ValueRange destinationTensors, GenerateTiledBodyFn tiledBodyFn)
 Generate the tile-loop nest using the loop construct specified in options.
static FailureOr< SmallVector< Value > > createInitialTensorsForTiling (RewriterBase &rewriter, TilingInterface op, ReductionTilingStrategy reductionStrategy, ArrayRef< Range > iterationDomain, ArrayRef< OpFoldResult > numThreads, ArrayRef< OpFoldResult > givenTileSizes, const SetVector< unsigned > &reductionDims)
static SmallVector< OpFoldResult > getSplitReductionIvs (RewriterBase &rewriter, Location loc, ReductionTilingStrategy reductionStrategy, ValueRange ivs, ArrayRef< OpFoldResult > numThreads, ArrayRef< OpFoldResult > givenTileSizes, const SetVector< unsigned > &reductionDims)
 For the case of ReductionTilingStrategy::PartialReductionOuterParallel the PartialReductionOpInterface methods need the index of the parallel split reduction being executed.
static FailureOr< TilingResult > getTiledImplementation (RewriterBase &rewriter, TilingInterface op, ReductionTilingStrategy reductionStrategy, ValueRange regionIterArg, ArrayRef< OpFoldResult > offsets, ArrayRef< OpFoldResult > sizes, ValueRange ivs, ArrayRef< OpFoldResult > numThreads, ArrayRef< OpFoldResult > givenTileSizes, const SetVector< unsigned > &reductionDims)
static LogicalResult getResultTilePosition (RewriterBase &rewriter, ReductionTilingStrategy reductionStrategy, int64_t index, Value tiledResult, TilingInterface op, ArrayRef< OpFoldResult > offsets, ArrayRef< OpFoldResult > sizes, ValueRange ivs, ArrayRef< OpFoldResult > numThreads, ArrayRef< OpFoldResult > givenTileSizes, const SetVector< unsigned > &reductionDims, SmallVector< OpFoldResult > &resultOffset, SmallVector< OpFoldResult > &resultSize)
static FailureOr< MergeResult > mergeTilingResults (RewriterBase &rewriter, TilingInterface op, ReductionTilingStrategy reductionStrategy, const SetVector< unsigned > &reductionDims, ValueRange partialResults)
template<typename LoopType>
FailureOr< LoopLikeOpInterface > yieldTiledValuesAndReplaceLoop (LoopType loopOp, RewriterBase &rewriter, ValueRange newInitOperands, YieldTiledValuesFn yieldTiledValuesFn)
 Append the specified additional newInitOperands operands to the loop's existing init operands (or similar), and replace loopOp with the new loop that has the additional init operands.
template<>
FailureOr< LoopLikeOpInterface > yieldTiledValuesAndReplaceLoop< scf::ForOp > (scf::ForOp loopOp, RewriterBase &rewriter, ValueRange newInitOperands, YieldTiledValuesFn yieldTiledValuesFn)
 Implementation of yieldTiledValuesAndReplaceLoop for scf.for.
template<>
FailureOr< LoopLikeOpInterface > yieldTiledValuesAndReplaceLoop< scf::ForallOp > (scf::ForallOp loopOp, RewriterBase &rewriter, ValueRange newInitOperands, YieldTiledValuesFn yieldTiledValuesFn)
 Implementation of yieldTiledValuesAndReplaceLoop for scf.forall.
FailureOr< LoopLikeOpInterface > yieldTiledValuesAndReplaceLoop (LoopLikeOpInterface loopLikeOp, RewriterBase &rewriter, ValueRange newInitOperands, YieldTiledValuesFn yieldTiledValuesFn)
 Implementation of yieldTiledValuesAndReplaceLoop for LoopLikeOpInterface, that just dispatches to the implementation for each supported loop type.
static LogicalResult addInitOperandsToLoopNest (RewriterBase &rewriter, MutableArrayRef< LoopLikeOpInterface > loops, ValueRange newInitValues, YieldTiledValuesFn getNewTiledYieldsFn)
 Method to add new init values to a loop nest.
static std::tuple< OpResult, std::optional< OpOperand * > > getUntiledProducerFromSliceSource (OpOperand *source, ArrayRef< LoopLikeOpInterface > loops)
 Return the untiled producer whose slice is used in a tiled consumer.
static LogicalResult checkAssumptionForFusingConsumer (tensor::InsertSliceOp candidateSliceOp)
 A utility function that checks whether the only use of the result of a tensor.insert_slice op is in a scf.yield op.
static FailureOr< Operation * > getFirstUserOfLoop (Operation *loopOp)
 A utility to get the first user of the given loopOp.
static FailureOr< llvm::SetVector< Operation * > > checkAssumptionForLoop (Operation *loopOp, Operation *consumerOp, bool reorderOperations)
 This utility currently checks whether the first userOp of loop is NOT before the last defineOp of consumer operand.
static FailureOr< OpOperand * > getConsumerFromLoopUses (RewriterBase &rewriter, Operation *loopOp, unsigned resultNumber)
 Fetches the OpOperand of the first valid user (and use) of the value val which implements TilingInterface and DestinationStyleOpInterface.
static FailureOr< OpOperand * > getUntiledConsumerFromSlice (RewriterBase &rewriter, tensor::InsertSliceOp candidateSliceOp, MutableArrayRef< LoopLikeOpInterface > loops)
 Fetch the untiled consumer of the outermost scf.for's result which is yielded by a tensor.insert_slice from the innermost scf.for.
static FailureOr< OpOperand * > getUntiledConsumerFromSlice (RewriterBase &rewriter, tensor::ParallelInsertSliceOp candidateSliceOp, MutableArrayRef< LoopLikeOpInterface > loops)
 Fetch the first untiled consumer of a scf.forall's result which is yielded by a tensor.parallel_insert_slice.
static FailureOr< SmallVector< OpOperand * > > getUntiledConsumerOperandsFromSlices (RewriterBase &rewriter, ArrayRef< Operation * > sliceOps, MutableArrayRef< LoopLikeOpInterface > loops)
 A utility to fetch an untiled consumer of tensor.insert_slice/tensor.parallel_insert_slice.
template<typename InsertSliceOpTy>
static tensor::InsertSliceOp cloneAsInsertSlice (RewriterBase &rewriter, InsertSliceOpTy sliceOp)
template<>
tensor::InsertSliceOp cloneAsInsertSlice< tensor::InsertSliceOp > (RewriterBase &rewriter, tensor::InsertSliceOp insertSliceOp)
template<>
tensor::InsertSliceOp cloneAsInsertSlice< tensor::ParallelInsertSliceOp > (RewriterBase &rewriter, tensor::ParallelInsertSliceOp insertSliceOp)
static SmallVector< tensor::InsertSliceOp > cloneAsInsertSlices (RewriterBase &rewriter, ArrayRef< Operation * > candidateSlices)

Macro Definition Documentation

◆ DEBUG_TYPE

#define DEBUG_TYPE   "tile-using-interface"

Definition at line 33 of file TileUsingInterface.cpp.

Typedef Documentation

◆ GenerateTiledBodyFn

Initial value:
std::function<LogicalResult(
RewriterBase &rewriter, Location Loc, ValueRange ivs,
ValueRange outerDestinationTensors, SmallVector<Value> &tiledResults,

Typedef for function that implements the body of a tiled loop.

  • ivs induction variable for the loop.
  • tileOffsets represents offsets for the tiled iteration space.
  • tileSizes represents the sizes for the tiled iteration space.
  • outerDestinationTensors tensors that hold the result. Same size as the destination operands of the original operation.
  • tiledResults results of the tiled computation, corresponding to tiles of the original operation computed by the loop body. Should be the same size as outerDestinationTensors.
  • resultOffsets is of the same size as tiledResults and represents the offsets to use when writing the corresponding element from tiledResults into outerDestinationTensors.
  • resultSizes is of the same size as tiledResults and represents the sizes to use when writing the corresponding element from tiledResults into outerDestinationTensors.

If the method needs to return failure(), it is expected to clean up any operations it inserted.

Definition at line 359 of file TileUsingInterface.cpp.

◆ YieldTiledValuesFn

Initial value:
std::function<LogicalResult(
RewriterBase &rewriter, Location loc, ValueRange ivs, ValueRange newBbArgs,
SmallVector<Value> &tiledValues,

Typedef for function that allows returning additional yielded values during yieldTiledValuesAndReplace.

  • ivs induction variable for the loop.
  • newBbArgs basic block arguments corresponding to newly added iter_args.
  • tiledValues the tiled values to return. Must be of the same size as newBbArgs; each element of this array is inserted into the corresponding element in newBbArgs.
  • resultOffsets is of the same size as tiledValues and represents the offsets to use when inserting the corresponding element from tiledValues into the element from newBbArgs.
  • resultSizes is of the same size as tiledValues and represents the size of the corresponding element from tiledValues inserted into the element from newBbArgs.

If the method needs to return failure(), it is expected to clean up any operations it inserted.

Definition at line 336 of file TileUsingInterface.cpp.

Function Documentation

◆ addInitOperandsToLoopNest()

LogicalResult addInitOperandsToLoopNest ( RewriterBase & rewriter,
MutableArrayRef< LoopLikeOpInterface > loops,
ValueRange newInitValues,
YieldTiledValuesFn getNewTiledYieldsFn )
static

Method to add new init values to a loop nest.

Updates loops in-place with new loops that use the newInitValues. The outer loops are updated to yield the new result values of the inner loop. For the innermost loop, the callback getNewTiledYieldsFn is invoked to get the additional values to yield from the innermost loop.

Definition at line 1041 of file TileUsingInterface.cpp.

References b, mlir::RewriterBase::mergeBlocks(), mlir::RewriterBase::replaceOp(), mlir::RewriterBase::replaceOpWithNewOp(), mlir::OpBuilder::setInsertionPoint(), success(), and yieldTiledValuesAndReplaceLoop().

◆ canOmitTileOffsetInBoundsCheck()

bool canOmitTileOffsetInBoundsCheck ( OpFoldResult givenTileSize,
OpFoldResult numThreads,
OpFoldResult iterationSize )
static

Returns true if the maximum tile offset tileSize * (numThreads - 1) is less than iterationSize.

Definition at line 262 of file TileUsingInterface.cpp.

References mlir::getConstantIntValue().

Referenced by getTileOffsetAndSizesWithForAllOp().
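
The condition above can be illustrated with plain integers. The following is a minimal sketch, not the MLIR implementation: MaybeConst and canOmitOffsetCheck are hypothetical names, std::optional stands in for an OpFoldResult that may or may not fold to a constant, and returning false for any non-constant input is an assumption suggested by the reference to mlir::getConstantIntValue().

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for an OpFoldResult: either a known constant or
// an unknown (dynamic) value.
using MaybeConst = std::optional<int64_t>;

// The largest offset any thread can start at is tileSize * (numThreads - 1).
// If that offset is statically known to be inside the iteration space, the
// offset itself never needs a bounds check.
bool canOmitOffsetCheck(MaybeConst tileSize, MaybeConst numThreads,
                        MaybeConst iterationSize) {
  // Assumption: any dynamic value forces the conservative answer.
  if (!tileSize || !numThreads || !iterationSize)
    return false;
  assert(*numThreads >= 1 && "expects at least one thread");
  return *tileSize * (*numThreads - 1) < *iterationSize;
}
```

For example, with a tile size of 4 and 3 threads the largest offset is 8, which is in bounds for an iteration size of 10; with 4 threads over an iteration size of 12 the largest offset is 12, which is not.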

◆ checkAssumptionForFusingConsumer()

LogicalResult checkAssumptionForFusingConsumer ( tensor::InsertSliceOp candidateSliceOp)
static

A utility function that checks whether the only use of the result of a tensor.insert_slice op is in a scf.yield op.

Definition at line 1856 of file TileUsingInterface.cpp.

References mlir::Operation::getBlock(), mlir::detail::IROperandBase::getOwner(), result, and success().

Referenced by getUntiledConsumerFromSlice().

◆ checkAssumptionForLoop()

FailureOr< llvm::SetVector< Operation * > > checkAssumptionForLoop ( Operation * loopOp,
Operation * consumerOp,
bool reorderOperations )
static

This utility currently checks whether the first userOp of loop is NOT before the last defineOp of consumer operand.

The check is needed because the whole loop structure is moved to right before the firstUserOfLoop. This utility thus helps ensure that no invalid IR is formed, i.e. that no operation in the backward slice of consumerOp is dominated by the firstUserOfLoop. For example:

%0 = scf.for() {
...
}
...
%1 = firstUserOfLoop(%0)
...
%2 = lastDefOfConsumerOperand
...
%3 = consumerOp(%2)

If the firstUserOfLoop is before lastDefOfConsumerOperand, then it would be invalid to move the loopOp right before the firstUserOfLoop, a.k.a. use-def chain violation:

%0:2 = scf.for() {
// use before define error
%3 = tiledConsumerOp(%2)
}
%1 = firstUserOfLoop(%0)
...
%2 = lastDefOfConsumerOperand

Parameters
  • loopOp loop operation
  • consumerOp consumer operation
  • reorderOperations the flag that controls whether to reorder the backward slice w.r.t. the defineOp of consumerOp operands.

Returns
The computed backward slice of consumerOp, excluding operations that already dominate firstUserOfLoop.

Definition at line 1952 of file TileUsingInterface.cpp.

References mlir::getBackwardSlice(), getFirstUserOfLoop(), mlir::Operation::getOperands(), options, mlir::DominanceInfo::properlyDominates(), and result.

Referenced by getConsumerFromLoopUses().
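
The ordering constraint described above can be modelled with positions inside a single block. This is a simplified sketch under stated assumptions: operations are identified by hypothetical integer positions, canMoveLoopBeforeFirstUser is an illustrative name, and the reorderOperations flag and real dominance queries are deliberately left out.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical model: every operation is identified by its position in one
// block. The loop is moved to right before its first user, so every
// definition feeding the consumer must already appear before that user.
bool canMoveLoopBeforeFirstUser(int firstUserPos,
                                const std::vector<int> &operandDefPositions) {
  assert(firstUserPos >= 0 && "positions are zero-based");
  if (operandDefPositions.empty())
    return true; // Nothing to define, so nothing can be used too early.
  int lastDefPos = *std::max_element(operandDefPositions.begin(),
                                     operandDefPositions.end());
  // Valid only if the last definition comes strictly before the first user.
  return lastDefPos < firstUserPos;
}
```

In the invalid example above, firstUserOfLoop sits before lastDefOfConsumerOperand, so the model returns false for that layout.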

◆ checkTileSizes()

LogicalResult checkTileSizes ( TilingInterface op,
scf::SCFTilingOptions::LoopType loopType,
ReductionTilingStrategy reductionStrategy,
ArrayRef< OpFoldResult > givenTileSizes,
ArrayRef< OpFoldResult > numThreads )
static

Checks if any of the tiled loops are not parallel.

Definition at line 155 of file TileUsingInterface.cpp.

References mlir::FullReduction, mlir::getConstantIntValue(), mlir::isConstantIntValue(), and success().

◆ cloneAsInsertSlice()

template<typename InsertSliceOpTy>
tensor::InsertSliceOp cloneAsInsertSlice ( RewriterBase & rewriter,
InsertSliceOpTy sliceOp )
static

Referenced by cloneAsInsertSlices().

◆ cloneAsInsertSlice< tensor::InsertSliceOp >()

template<>
tensor::InsertSliceOp cloneAsInsertSlice< tensor::InsertSliceOp > ( RewriterBase & rewriter,
tensor::InsertSliceOp insertSliceOp )

Definition at line 2151 of file TileUsingInterface.cpp.

References mlir::OpBuilder::clone().

◆ cloneAsInsertSlice< tensor::ParallelInsertSliceOp >()

template<>
tensor::InsertSliceOp cloneAsInsertSlice< tensor::ParallelInsertSliceOp > ( RewriterBase & rewriter,
tensor::ParallelInsertSliceOp insertSliceOp )

Definition at line 2159 of file TileUsingInterface.cpp.

◆ cloneAsInsertSlices()

SmallVector< tensor::InsertSliceOp > cloneAsInsertSlices ( RewriterBase & rewriter,
ArrayRef< Operation * > candidateSlices )
static

Definition at line 2168 of file TileUsingInterface.cpp.

References cloneAsInsertSlice().

◆ cloneOpAndUpdateDestinationArgs()

Operation * cloneOpAndUpdateDestinationArgs ( RewriterBase & rewriter,
Operation * op,
ValueRange newDestArgs )
static

Clones the operation and updates the destination if the operation implements the DestinationStyleOpInterface.

Definition at line 368 of file TileUsingInterface.cpp.

References mlir::OpBuilder::clone().

◆ createInitialTensorsForTiling()

◆ fillInterchangeVector()

SmallVector< int64_t > fillInterchangeVector ( ArrayRef< int64_t > interchangeVector,
size_t iterationDomainSize )
static

Helper method to adjust the interchange vector to match the iteration domain.

Definition at line 60 of file TileUsingInterface.cpp.
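
One plausible reading of "adjust to match the iteration domain" is sketched below with std::vector. It assumes that a short vector is padded with identity entries and a long one is truncated; fillInterchange is a hypothetical name and the padding rule is an assumption, not something this page states.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch: make the interchange vector exactly as long as the iteration
// domain (assumed behaviour: identity padding, then truncation).
std::vector<int64_t> fillInterchange(std::vector<int64_t> interchange,
                                     std::size_t domainSize) {
  // Pad with identity entries for dimensions the caller did not mention.
  for (std::size_t i = interchange.size(); i < domainSize; ++i)
    interchange.push_back(static_cast<int64_t>(i));
  // Drop any trailing entries beyond the iteration domain.
  interchange.resize(domainSize);
  assert(interchange.size() == domainSize);
  return interchange;
}
```

Under these assumptions, an interchange of {1, 0} over a 4-dimensional domain becomes {1, 0, 2, 3}.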

◆ generateLoopNest()

FailureOr< SmallVector< LoopLikeOpInterface > > generateLoopNest ( RewriterBase & rewriter,
Location loc,
const scf::SCFTilingOptions & options,
ArrayRef< Range > loopRanges,
ArrayRef< OpFoldResult > givenTileSizes,
ArrayRef< OpFoldResult > numThreads,
ValueRange destinationTensors,
GenerateTiledBodyFn tiledBodyFn )
static

Generate the tile-loop nest using the loop construct specified in options.

  • options specifies the tiling options.
  • loopRanges specifies the lb, ub and step of the untiled iteration space.
  • givenTileSizes is the tile sizes to use. A zero tile size represents an untiled loop.
  • destinationTensors are the init values to use for the outermost loop.
  • tiledBodyFn is called to generate the loop body of the innermost loop.

Returns the generated loops on success.

Definition at line 687 of file TileUsingInterface.cpp.

References generateLoopNestUsingCustomOp(), generateLoopNestUsingForallOp(), generateLoopNestUsingForOp(), mlir::isZeroInteger(), mlir::RewriterBase::notifyMatchFailure(), mlir::Range::offset, options, and mlir::Range::size.

◆ generateLoopNestUsingCustomOp()

FailureOr< SmallVector< LoopLikeOpInterface > > generateLoopNestUsingCustomOp ( RewriterBase & rewriter,
Location loc,
ArrayRef< Range > loopRanges,
ArrayRef< OpFoldResult > givenTileSizes,
ValueRange outerDestinationTensors,
const scf::SCFTilingOptions::GenerateLoopHeaderFn & generateLoopHeaderFn,
const scf::SCFTilingOptions::GenerateLoopTerminatorFn & generateLoopTerminatorFn,
GenerateTiledBodyFn tiledBodyFn )
static

Generate the tile-loop nest using custom loop operation.

  • loopRanges specifies the lb, ub and step of the untiled iteration space.
  • givenTileSizes is the tile sizes to use. A zero tile size represents an untiled loop.
  • outerDestinationTensors are the init values to use for the outermost loop.
  • mappingVector is the mapping attributes to use for loop construction. Can be empty.
  • tiledBodyFn is called to generate the loop body of the innermost loop.

Returns the generated loops on success.

Definition at line 637 of file TileUsingInterface.cpp.

Referenced by generateLoopNest().

◆ generateLoopNestUsingForallOp()

FailureOr< SmallVector< LoopLikeOpInterface > > generateLoopNestUsingForallOp ( RewriterBase & rewriter,
Location loc,
ArrayRef< Range > loopRanges,
ArrayRef< OpFoldResult > givenTileSizes,
ArrayRef< OpFoldResult > numThreads,
ArrayRef< Attribute > mappingVector,
ValueRange outerDestinationTensors,
GenerateTiledBodyFn tiledBodyFn )
static

Generate the tile-loop nest using scf.forall operation.

  • loopRanges specifies the lb, ub and step of the untiled iteration space.
  • givenTileSizes is the tile sizes to use. A zero tile size represents an untiled loop.
  • outerDestinationTensors are the init values to use for the loop.
  • mappingVector is the mapping attributes to use for loop construction. Can be empty.
  • tiledBodyFn is called to generate the loop body of the innermost loop.

Returns the generated scf.forall loop on success.

Definition at line 557 of file TileUsingInterface.cpp.

References mlir::Builder::getArrayAttr(), mlir::Builder::getIndexAttr(), getLoopBounds(), getTileOffsetAndSizesWithForAllOp(), mlir::isZeroInteger(), mlir::RewriterBase::notifyMatchFailure(), mlir::OpBuilder::setInsertionPoint(), and mlir::OpBuilder::setInsertionPointToEnd().

Referenced by generateLoopNest().

◆ generateLoopNestUsingForOp()

FailureOr< SmallVector< LoopLikeOpInterface > > generateLoopNestUsingForOp ( RewriterBase & rewriter,
Location loc,
ArrayRef< Range > loopRanges,
ArrayRef< OpFoldResult > givenTileSizes,
ValueRange outerDestinationTensors,
GenerateTiledBodyFn tiledBodyFn )
static

Generate the tile-loop nest using scf.for operation.

  • loopRanges specifies the lb, ub and step of the untiled iteration space.
  • givenTileSizes is the tile sizes to use. A zero tile size represents an untiled loop.
  • outerDestinationTensors are the init values to use for the outermost loop.
  • tiledBodyFn is called to generate the loop body of the innermost loop.

Returns the generated scf.for loops on success.

Definition at line 388 of file TileUsingInterface.cpp.

References mlir::Builder::getIndexAttr(), getLoopBounds(), getTileOffsetAndSizes(), mlir::getValueOrCreateConstantIndexOp(), mlir::RewriterBase::notifyMatchFailure(), mlir::OpBuilder::setInsertionPointToEnd(), and success().

Referenced by generateLoopNest().

◆ getBoundedTileSize()

OpFoldResult getBoundedTileSize ( OpBuilder & b,
Location loc,
Range loopRange,
OpFoldResult offset,
OpFoldResult givenTileSize )
static

Returns the bounded tile size given the current offset, loopRange and tileSize, i.e., min(tileSize, range.end() - offset).

Definition at line 237 of file TileUsingInterface.cpp.

References b, mlir::bindDims(), mlir::bindSymbols(), mlir::AffineMap::get(), mlir::getConstantIntValue(), mlir::getValueOrCreateConstantIndexOp(), mlir::affine::makeComposedFoldedAffineMin(), mlir::Range::offset, mlir::Range::size, and tileDividesIterationDomain().

Referenced by getTileOffsetAndSizes().
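
The formula min(tileSize, range.end() - offset) is easiest to see with static integers. This is a minimal sketch: boundedTileSize is a hypothetical name, end stands for range.end(), and the affine-map machinery the real function uses for dynamic values is omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Sketch: clamp the tile so it never runs past the end of the loop range.
// Only the last, partial tile is actually shortened.
int64_t boundedTileSize(int64_t tileSize, int64_t end, int64_t offset) {
  assert(tileSize > 0 && "tile size must be positive");
  assert(offset < end && "offset must lie inside the range");
  return std::min(tileSize, end - offset);
}
```

With a range ending at 10 and a tile size of 4, the tiles starting at offsets 0 and 4 keep the full size, while the tile starting at offset 8 is clamped to 2.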

◆ getConsumerFromLoopUses()

FailureOr< OpOperand * > getConsumerFromLoopUses ( RewriterBase & rewriter,
Operation * loopOp,
unsigned resultNumber )
static

Fetches the OpOperand of the first valid user (and use) of the value val which implements TilingInterface and DestinationStyleOpInterface.

Returns failure otherwise.

Definition at line 1999 of file TileUsingInterface.cpp.

References checkAssumptionForLoop(), mlir::Operation::getBlock(), getFirstUserOfLoop(), mlir::Operation::getResult(), mlir::Value::getUses(), mlir::RewriterBase::moveOpBefore(), mlir::topologicalSort(), and mlir::Operation::use_empty().

Referenced by getUntiledConsumerFromSlice(), and getUntiledConsumerFromSlice().

◆ getFirstUserOfLoop()

FailureOr< Operation * > getFirstUserOfLoop ( Operation * loopOp)
static

A utility to get the first user of the given loopOp.

If any user is in a different block than loopOp, returns failure.

Definition at line 1881 of file TileUsingInterface.cpp.

References mlir::Operation::getBlock(), mlir::Operation::getUsers(), and mlir::Operation::isBeforeInBlock().

Referenced by checkAssumptionForLoop(), and getConsumerFromLoopUses().

◆ getLoopBounds()

std::tuple< SmallVector< OpFoldResult >, SmallVector< OpFoldResult >, SmallVector< OpFoldResult > > getLoopBounds ( RewriterBase & rewriter,
Location loc,
ArrayRef< Range > loopRanges,
ArrayRef< OpFoldResult > givenTileSizes )
static

Function to return the bounds of the loops to be generated.

Definition at line 306 of file TileUsingInterface.cpp.

References mlir::isZeroInteger().

Referenced by generateLoopNestUsingForallOp(), and generateLoopNestUsingForOp().
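
Since a zero tile size marks an untiled dimension, only the tiled dimensions contribute a loop. The sketch below models that with static integers; Range1D, LoopBounds and loopBounds are hypothetical names, and using the range size as the upper bound and the tile size as the step is an assumption about how the bounds are assembled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical static model of one dimension of the iteration domain.
struct Range1D {
  int64_t offset, size, stride;
};

// Lower bounds, upper bounds and steps of the loops to generate.
struct LoopBounds {
  std::vector<int64_t> lbs, ubs, steps;
};

LoopBounds loopBounds(const std::vector<Range1D> &ranges,
                      const std::vector<int64_t> &tileSizes) {
  assert(ranges.size() == tileSizes.size());
  LoopBounds bounds;
  for (std::size_t i = 0; i < ranges.size(); ++i) {
    if (tileSizes[i] == 0)
      continue; // Untiled dimension: no loop is generated for it.
    bounds.lbs.push_back(ranges[i].offset);
    bounds.ubs.push_back(ranges[i].size);
    bounds.steps.push_back(tileSizes[i]); // The loop advances one tile at a time.
  }
  return bounds;
}
```

For a three-dimensional domain with tile sizes {16, 0, 8}, this yields two loops, one for the first and one for the third dimension.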

◆ getResultTilePosition()

LogicalResult getResultTilePosition ( RewriterBase & rewriter,
ReductionTilingStrategy reductionStrategy,
int64_t index,
Value tiledResult,
TilingInterface op,
ArrayRef< OpFoldResult > offsets,
ArrayRef< OpFoldResult > sizes,
ValueRange ivs,
ArrayRef< OpFoldResult > numThreads,
ArrayRef< OpFoldResult > givenTileSizes,
const SetVector< unsigned > & reductionDims,
SmallVector< OpFoldResult > & resultOffset,
SmallVector< OpFoldResult > & resultSize )
static

◆ getSanitizedReductionDims()

SetVector< unsigned > getSanitizedReductionDims ( ArrayRef< OpFoldResult > givenTileSizes,
const scf::SCFTilingOptions & options )
static

Get the reduction dims that are tiled.

This accounts for reduction dims that are specified as tiled, but the tile size is 0.

Definition at line 210 of file TileUsingInterface.cpp.

References mlir::isConstantIntValue(), and options.
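
The filtering described above can be sketched with plain containers. This is an illustration only: sanitizedReductionDims is a hypothetical name, tile sizes are assumed to be static integers, and an ordered std::vector without duplicates stands in for SetVector.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Sketch: keep only the requested reduction dims whose tile size is
// nonzero, since a zero tile size means the dimension is not tiled at all.
std::vector<unsigned>
sanitizedReductionDims(const std::vector<int64_t> &tileSizes,
                       const std::vector<unsigned> &requestedDims) {
  std::vector<unsigned> kept;
  for (unsigned dim : requestedDims) {
    assert(dim < tileSizes.size() && "dim must index a tile size");
    if (tileSizes[dim] == 0)
      continue; // Listed as a tiled reduction dim, but effectively untiled.
    // Preserve insertion order and uniqueness, like a set-vector.
    if (std::find(kept.begin(), kept.end(), dim) == kept.end())
      kept.push_back(dim);
  }
  return kept;
}
```

With tile sizes {4, 0, 8} and requested reduction dims {1, 2}, only dim 2 survives.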

◆ getSplitReductionIvs()

SmallVector< OpFoldResult > getSplitReductionIvs ( RewriterBase & rewriter,
Location loc,
ReductionTilingStrategy reductionStrategy,
ValueRange ivs,
ArrayRef< OpFoldResult > numThreads,
ArrayRef< OpFoldResult > givenTileSizes,
const SetVector< unsigned > & reductionDims )
static

For the case of ReductionTilingStrategy::PartialReductionOuterParallel the PartialReductionOpInterface methods need the index of the parallel split reduction being executed.

Definition at line 794 of file TileUsingInterface.cpp.

References mlir::bindSymbols(), mlir::AffineExpr::floorDiv(), mlir::Builder::getContext(), mlir::Builder::getIndexAttr(), mlir::affine::makeComposedFoldedAffineApply(), and mlir::PartialReductionOuterParallel.

Referenced by getResultTilePosition(), and getTiledImplementation().

◆ getTiledImplementation()

FailureOr< TilingResult > getTiledImplementation ( RewriterBase & rewriter,
TilingInterface op,
ReductionTilingStrategy reductionStrategy,
ValueRange regionIterArg,
ArrayRef< OpFoldResult > offsets,
ArrayRef< OpFoldResult > sizes,
ValueRange ivs,
ArrayRef< OpFoldResult > numThreads,
ArrayRef< OpFoldResult > givenTileSizes,
const SetVector< unsigned > & reductionDims )
static

◆ getTileOffsetAndSizes()

std::tuple< SmallVector< OpFoldResult >, SmallVector< OpFoldResult > > getTileOffsetAndSizes ( RewriterBase & rewriter,
Location loc,
ValueRange ivs,
ArrayRef< Range > iterationDomain,
ArrayRef< OpFoldResult > givenTileSizes )
static

Compute the OpFoldResults that represent the multi-dimensional offsets and sizes of the tile of the iteration space that the innermost loop body of the generated tiled loops corresponds to.

Definition at line 277 of file TileUsingInterface.cpp.

References mlir::getAsOpFoldResult(), getBoundedTileSize(), and mlir::isZeroInteger().

Referenced by generateLoopNestUsingForOp(), and getTileOffsetAndSizesWithForAllOp().
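
A static-integer sketch of how per-dimension offsets and sizes could be assembled follows. It assumes that an untiled dimension (tile size zero) covers its whole range, that a tiled dimension starts at the next induction variable, and that its size is clamped as in getBoundedTileSize() with the range size standing in for range.end(). Range1D, Tile and tileOffsetsAndSizes are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical static model of one dimension of the iteration domain.
struct Range1D {
  int64_t offset, size;
};

struct Tile {
  std::vector<int64_t> offsets, sizes;
};

Tile tileOffsetsAndSizes(const std::vector<int64_t> &ivs,
                         const std::vector<Range1D> &domain,
                         const std::vector<int64_t> &tileSizes) {
  assert(domain.size() == tileSizes.size());
  Tile tile;
  std::size_t nextIv = 0; // Only tiled dimensions consume an induction variable.
  for (std::size_t i = 0; i < domain.size(); ++i) {
    if (tileSizes[i] == 0) {
      // Untiled dimension: the tile spans the full range.
      tile.offsets.push_back(domain[i].offset);
      tile.sizes.push_back(domain[i].size);
      continue;
    }
    assert(nextIv < ivs.size() && "one induction variable per tiled dim");
    int64_t iv = ivs[nextIv++];
    tile.offsets.push_back(iv);
    // Clamp so the last tile does not run past the end of the range.
    tile.sizes.push_back(std::min(tileSizes[i], domain[i].size - iv));
  }
  return tile;
}
```

Note that the number of induction variables equals the number of nonzero tile sizes, not the rank of the iteration domain.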

◆ getTileOffsetAndSizesWithForAllOp()

std::tuple< SmallVector< OpFoldResult >, SmallVector< OpFoldResult > > getTileOffsetAndSizesWithForAllOp ( RewriterBase & rewriter,
Location loc,
ValueRange ivs,
ArrayRef< Range > iterationDomain,
ArrayRef< OpFoldResult > givenTileSizes,
ArrayRef< OpFoldResult > numThreads )
static

Compute the OpFoldResults that represent the multi-dimensional offsets and sizes of the tile of the iteration space that the innermost loop body of the generated tiled loops corresponds to when tiling using forall op.

This is handled separately because of the special-case handling needed when the tiling is done by specifying the number of threads.

Definition at line 474 of file TileUsingInterface.cpp.

References mlir::bindDims(), mlir::bindSymbols(), canOmitTileOffsetInBoundsCheck(), mlir::Builder::getContext(), mlir::Builder::getIndexAttr(), mlir::AffineMap::getMultiDimIdentityMap(), getTileOffsetAndSizes(), mlir::isZeroInteger(), mlir::affine::makeComposedFoldedAffineApply(), mlir::affine::makeComposedFoldedAffineMax(), and mlir::affine::makeComposedFoldedAffineMin().

Referenced by generateLoopNestUsingForallOp().

◆ getUntiledConsumerFromSlice() [1/2]

FailureOr< OpOperand * > getUntiledConsumerFromSlice ( RewriterBase & rewriter,
tensor::InsertSliceOp candidateSliceOp,
MutableArrayRef< LoopLikeOpInterface > loops )
static

Fetch the untiled consumer of the outermost scf.for's result which is yielded by a tensor.insert_slice from the innermost scf.for.

This function makes the following assumptions :

  1. tensor.insert_slice has scf.yield as its only user.
  2. scf.for's corresponding result has only one use.
  3. The loops passed in are perfectly nested scf.for operations.

Definition at line 2050 of file TileUsingInterface.cpp.

References checkAssumptionForFusingConsumer(), getConsumerFromLoopUses(), mlir::OpOperand::getOperandNumber(), mlir::Operation::getParentOp(), mlir::Value::getUses(), mlir::isPerfectlyNestedForLoops(), and mlir::RewriterBase::notifyMatchFailure().

Referenced by getUntiledConsumerOperandsFromSlices().

◆ getUntiledConsumerFromSlice() [2/2]

FailureOr< OpOperand * > getUntiledConsumerFromSlice ( RewriterBase & rewriter,
tensor::ParallelInsertSliceOp candidateSliceOp,
MutableArrayRef< LoopLikeOpInterface > loops )
static

Fetch the first untiled consumer of a scf.forall's result which is yielded by a tensor.parallel_insert_slice.

Definition at line 2084 of file TileUsingInterface.cpp.

References getConsumerFromLoopUses(), and mlir::RewriterBase::notifyMatchFailure().

◆ getUntiledConsumerOperandsFromSlices()

FailureOr< SmallVector< OpOperand * > > getUntiledConsumerOperandsFromSlices ( RewriterBase & rewriter,
ArrayRef< Operation * > sliceOps,
MutableArrayRef< LoopLikeOpInterface > loops )
static

A utility to fetch an untiled consumer of tensor.insert_slice/tensor.parallel_insert_slice.

Definition at line 2116 of file TileUsingInterface.cpp.

References getUntiledConsumerFromSlice(), and mlir::RewriterBase::notifyMatchFailure().

◆ getUntiledProducerFromSliceSource()

std::tuple< OpResult, std::optional< OpOperand * > > getUntiledProducerFromSliceSource ( OpOperand * source,
ArrayRef< LoopLikeOpInterface > loops )
static

Return the untiled producer whose slice is used in a tiled consumer.

The method traverses the tile loop nest (loops) if needed, and returns the iter_args of the outermost loop that is encountered. Traversing the iter_args indicates that this is a destination operand of the consumer. If no loop traversal was needed, the second value of the returned tuple is empty.

Definition at line 1315 of file TileUsingInterface.cpp.

References mlir::IROperand< DerivedT, IRValueT >::get().

◆ getUserTileSizesAndNumThreads()

std::tuple< SmallVector< OpFoldResult >, SmallVector< OpFoldResult > > getUserTileSizesAndNumThreads ( RewriterBase & rewriter,
TilingInterface op,
ArrayRef< Range > iterationDomain,
const scf::SCFTilingOptions & options )
static

Method to instantiate the tile sizes and/or number of threads specified by the user.

Definition at line 101 of file TileUsingInterface.cpp.

References mlir::bindSymbols(), mlir::AffineExpr::ceilDiv(), mlir::Builder::getContext(), mlir::Builder::getIndexAttr(), mlir::isZeroInteger(), mlir::affine::makeComposedFoldedAffineApply(), and options.
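
The reference to mlir::AffineExpr::ceilDiv() suggests that, when the user supplies a number of threads instead of a tile size, a tile size is derived by ceiling division of the loop size by the thread count. The sketch below shows only that arithmetic on plain integers. The names ceilDiv and tileSizesFromNumThreads are hypothetical, and the convention that a thread count of zero produces a tile size of zero (an untiled dimension) is an assumption prompted by the reference to mlir::isZeroInteger(); none of this is the MLIR implementation, which operates on OpFoldResult values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Ceiling division for a non-negative size and a positive divisor.
int64_t ceilDiv(int64_t size, int64_t divisor) {
  assert(size >= 0 && divisor > 0);
  return (size + divisor - 1) / divisor;
}

// Hypothetical helper: derive one tile size per loop from the loop sizes and
// the requested thread counts.
// Assumption: a thread count of 0 means "do not tile", giving tile size 0.
std::vector<int64_t> tileSizesFromNumThreads(
    const std::vector<int64_t> &sizes,
    const std::vector<int64_t> &numThreads) {
  assert(sizes.size() == numThreads.size());
  std::vector<int64_t> tileSizes;
  tileSizes.reserve(sizes.size());
  for (std::size_t i = 0; i < sizes.size(); ++i) {
    tileSizes.push_back(numThreads[i] == 0
                            ? 0
                            : ceilDiv(sizes[i], numThreads[i]));
  }
  return tileSizes;
}
```

For instance, a loop of size 100 split across 8 threads would get a tile size of 13 under this scheme, so the last thread handles a partial tile.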

◆ mergeTilingResults()

FailureOr< MergeResult > mergeTilingResults ( RewriterBase & rewriter,
TilingInterface op,
ReductionTilingStrategy reductionStrategy,
const SetVector< unsigned > & reductionDims,
ValueRange partialResults )
static

◆ tileDividesIterationDomain()

bool tileDividesIterationDomain ( Range loopRange)
static

Check whether stride evenly divides the trip count, size - offset.

Definition at line 222 of file TileUsingInterface.cpp.

References mlir::getConstantIntValue(), mlir::Range::offset, mlir::Range::size, and mlir::Range::stride.

Referenced by getBoundedTileSize().
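
As a rough illustration of the check described above, the following standalone sketch models the three fields of a Range as optional constants, where an empty optional stands in for a value that is not known statically. The struct ConstRange, the function name strideDividesTripCount, and the use of std::optional<int64_t> in place of OpFoldResult are assumptions made for illustration. Returning false when any field is non-constant is also an assumption, suggested by the reference to mlir::getConstantIntValue().

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for mlir::Range: each field is either a known
// constant or empty (modelling a value only available at runtime).
struct ConstRange {
  std::optional<int64_t> offset;
  std::optional<int64_t> size;
  std::optional<int64_t> stride;
};

// Returns true only when all three fields are constants and the stride
// evenly divides (size - offset).
// Assumption: any non-constant field makes the answer "false".
bool strideDividesTripCount(const ConstRange &range) {
  if (!range.offset || !range.size || !range.stride)
    return false;
  assert(*range.stride != 0 && "stride must be nonzero");
  return (*range.size - *range.offset) % *range.stride == 0;
}
```

When such a check succeeds, a caller like getBoundedTileSize() would have no partial final tile to account for; when it fails, the tile size needs to be bounded by the remaining range.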

◆ verifyOptions()

LogicalResult verifyOptions ( RewriterBase & rewriter,
Location loc,
const scf::SCFTilingOptions & options )
static

Verify the tile size options are set in a consistent manner.

Definition at line 77 of file TileUsingInterface.cpp.

References mlir::isPermutationVector(), mlir::RewriterBase::notifyMatchFailure(), options, and success().
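
The reference to mlir::isPermutationVector() indicates that part of this verification concerns the interchange vector. As a minimal sketch of what such a check involves, the standalone function below tests whether a vector is a permutation of 0..n-1. The name isPermutation is hypothetical, and this is an illustration on std::vector rather than the MLIR helper itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns true if `interchange` contains every index in
// [0, interchange.size()) exactly once.
bool isPermutation(const std::vector<int64_t> &interchange) {
  const int64_t n = static_cast<int64_t>(interchange.size());
  std::vector<bool> seen(interchange.size(), false);
  for (int64_t index : interchange) {
    // Out-of-range entries cannot belong to a permutation.
    if (index < 0 || index >= n)
      return false;
    // A repeated entry means some other index must be missing.
    if (seen[static_cast<std::size_t>(index)])
      return false;
    seen[static_cast<std::size_t>(index)] = true;
  }
  return true;
}
```

Under this sketch, {2, 0, 1} is accepted, while {0, 0, 1} (duplicate) and {0, 3, 1} (out of range) are rejected.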

◆ yieldTiledValuesAndReplaceLoop() [1/2]

FailureOr< LoopLikeOpInterface > yieldTiledValuesAndReplaceLoop ( LoopLikeOpInterface loopLikeOp,
RewriterBase & rewriter,
ValueRange newInitOperands,
YieldTiledValuesFn yieldTiledValuesFn )

Implementation of yieldTiledValuesAndReplaceLoop for LoopLikeOpInterface, which simply dispatches to the implementation for each supported loop type.

Definition at line 1021 of file TileUsingInterface.cpp.

References mlir::RewriterBase::notifyMatchFailure(), and yieldTiledValuesAndReplaceLoop().

◆ yieldTiledValuesAndReplaceLoop() [2/2]

template<typename LoopType>
FailureOr< LoopLikeOpInterface > yieldTiledValuesAndReplaceLoop ( LoopType loopOp,
RewriterBase & rewriter,
ValueRange newInitOperands,
YieldTiledValuesFn yieldTiledValuesFn )

Append the specified additional newInitOperands operands to the loop's existing init operands (or similar), and replace loopOp with the new loop that has the additional init operands.

The body of loopOp is moved over to the new loop. yieldTiledValuesFn is called to obtain the new tiled values to yield, together with the offsets and sizes at which each tiled value is inserted into the new region iter_args that correspond to the newly added init operands.

Definition at line 904 of file TileUsingInterface.cpp.

References mlir::RewriterBase::notifyMatchFailure().

Referenced by addInitOperandsToLoopNest(), and yieldTiledValuesAndReplaceLoop().

◆ yieldTiledValuesAndReplaceLoop< scf::ForallOp >()

template<>
FailureOr< LoopLikeOpInterface > yieldTiledValuesAndReplaceLoop< scf::ForallOp > ( scf::ForallOp loopOp,
RewriterBase & rewriter,
ValueRange newInitOperands,
YieldTiledValuesFn yieldTiledValuesFn )

Implementation of yieldTiledValuesAndReplaceLoop for scf.forall.

Definition at line 904 of file TileUsingInterface.cpp.

◆ yieldTiledValuesAndReplaceLoop< scf::ForOp >()

template<>
FailureOr< LoopLikeOpInterface > yieldTiledValuesAndReplaceLoop< scf::ForOp > ( scf::ForOp loopOp,
RewriterBase & rewriter,
ValueRange newInitOperands,
YieldTiledValuesFn yieldTiledValuesFn )

Implementation of yieldTiledValuesAndReplaceLoop for scf.for.

Definition at line 904 of file TileUsingInterface.cpp.