MLIR  17.0.0git
mlir::linalg Namespace Reference

Namespaces

 detail
 

Classes

class  LinalgOpToLibraryCallRewrite
 
struct  LinalgTilingOptions
 
struct  LinalgTilingAndFusionOptions
 
struct  LinalgPaddingOptions
 
struct  LinalgPromotionOptions
 
struct  SplitReductionOptions
 Split Reduction options. More...
 
struct  ElementwiseOpFusionResult
 Fuse two linalg.generic operations that have a producer-consumer relationship captured through fusedOperand. More...
 
struct  TiledLinalgOp
 Perform standalone tiling of a single LinalgOp by tileSizes. More...
 
struct  PromotionInfo
 Create a new buffer using the allocationFn provided. More...
 
struct  MultiSizeSpecification
 A description of a multi-size tiling comprising tile sizes and numbers of tiles, expressed as Values which may or may not be constant. More...
 
struct  StaticMultiSizeSpecification
 
struct  ForallTilingResult
 Rewrite a TilingInterface op to a tiled scf.forall, applying tiling by numThreads. More...
 
struct  ForallReductionTilingResult
 Transformation information returned after reduction tiling. More...
 
struct  SplitReductionResult
 Apply transformation to split the single linalg op reduction into a parallel and reduction dimension. More...
 
struct  LowerPackResult
 
struct  LowerUnPackOpResult
 
struct  PackResult
 Struct to hold the result of a pack call. More...
 
struct  PackTransposeResult
 Struct to hold the result of a packTranspose call. More...
 
struct  LinalgPaddingPattern
 Linalg padding pattern. More...
 
struct  DownscaleSizeOneWindowed2DConvolution
 Rewrites 2-D convolution ops with size-1 window dimensions into 1-D convolution ops. More...
 
struct  DownscaleDepthwiseConv2DNhwcHwcOp
 Rewrites 2-D depthwise convolution ops with size-1 (w, kw) or (h, kh) dimensions into 1-D depthwise convolution ops. More...
 
struct  DownscaleConv2DOp
 
struct  LinalgGeneralizationPattern
 Linalg generalization pattern. More...
 
struct  CopyVectorizationPattern
 Vectorization pattern for memref::CopyOp. More...
 
struct  PadOpTransformationPattern
 tensor::PadOp is not canonicalized away yet, so we provide a transformation to linalg.generic. More...
 
struct  GeneralizePadOpPattern
 Rewrite a tensor::PadOp into a sequence of EmptyOp, FillOp and InsertSliceOp. More...
 
struct  GeneralizeOuterUnitDimsPackOpPattern
 Rewrites a tensor::PackOp into a sequence of tensor.pad + linalg.transpose + tensor.insert_slice ops, where the tensor::PackOp has outer dims being all 1s. More...
 
struct  GeneralizeOuterUnitDimsUnPackOpPattern
 Rewrites a tensor::UnPackOp into a sequence of rank-reduced extract_slice ops. More...
 
struct  LinalgCopyVTRForwardingPattern
 Match and rewrite for the pattern: More...
 
struct  LinalgCopyVTWForwardingPattern
 Match and rewrite for the pattern: More...
 
struct  ExtractSliceOfPadTensorSwapPattern
 Rewrite extract_slice(tensor.pad(x)) into tensor.pad(extract_slice(x)). More...
 
struct  EmbeddedMatmulDimsCandidates
 Possible dimension candidates that define a matmul embedded in the indexing maps of a LinalgOp. More...
 
struct  SliceParameters
 A struct containing offsets-sizes-strides arguments of the tiled shape. More...
 
struct  FusionInfo
 A struct containing the Linalg producer before and after fusion. More...
 
struct  ProcInfo
 Callback function type used to get processor ID, and number of processors used for distribution for all parallel loops generated. More...
 
struct  LinalgLoopDistributionOptions
 Options that allow distribution of loops generated in Linalg transforms to processors while generating the loops. More...
 
struct  RegionMatcher
 A struct containing common matchers over linalg op's region. More...
 
struct  GenerateLoopNest
 Utility class used to generate nested loops with ranges described by loopRanges and loop type described by the iteratorTypes. More...
 

Typedefs

using TileSizeComputationFunction = std::function< SmallVector< Value, 4 >(OpBuilder &, Operation *)>
 
using AllocBufferCallbackFn = std::function< std::optional< Value >(OpBuilder &b, memref::SubViewOp subView, ArrayRef< Value > boundingSubViewSize, DataLayout &layout)>
 Callback function type used to perform the allocation for the promoted subView. More...
 
using DeallocBufferCallbackFn = std::function< LogicalResult(OpBuilder &b, Value buffer)>
 Callback function type used to deallocate the buffers used to hold the promoted subview. More...
 
using CopyCallbackFn = std::function< LogicalResult(OpBuilder &b, Value src, Value dst)>
 Callback function type used to insert a copy from the original subview to the subview of the promoted region for read operands, and from the subview of the promoted region back to the original subview for results. More...
 
using ControlSplitReductionFn = std::function< SplitReductionOptions(LinalgOp op)>
 Function signature to control reduction splitting. More...
 
using LinalgLoops = SmallVector< Operation *, 4 >
 
using LoopIndexToRangeIndexMap = DenseMap< int, int >
 Creates a number of ranges equal to the number of non-zero entries in tileSizes. More...
 
using OptimizeCopyFn = std::function< LogicalResult(RewriterBase &, tensor::PadOp, Value)>
 
using ControlFusionFn = std::function< bool(OpOperand *fusedOperand)>
 Function type which is used to control when to stop fusion. More...
 
using ControlPropagationFn = std::function< bool(Operation *op)>
 Function type which is used to control propagation of tensor.pack/unpack ops. More...
 
using GetCollapsableDimensionsFn = std::function< SmallVector< ReassociationIndices >(linalg::GenericOp)>
 Function type to control generic op dimension collapsing. More...
 
using ProcInfoCallBackFn = std::function< SmallVector< ProcInfo >(OpBuilder &b, Location loc, ArrayRef< Range > parallelLoopRanges)>
 

Enumerations

enum class  LinalgTilingLoopType { Loops = 0 , AffineLoops = 1 , ParallelLoops = 2 }
 The type of loops to be generated during tiling. More...
 
enum class  DistributionMethod { Cyclic = 0 , CyclicNumProcsGeNumIters = 1 , CyclicNumProcsEqNumIters = 2 , None = 3 }
 Scheme used to distribute loops to processors. More...
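The cyclic schemes differ in how much bounds checking each processor performs. The following is a minimal standalone sketch, not MLIR code: plain integers stand in for loop bounds and processor ids, and `iterationsFor` is a hypothetical helper. It assumes the usual cyclic mapping, in which processor `procId` starts at `lb + procId * step` and, in the general case, advances by `nprocs * step`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for mlir::linalg::DistributionMethod (same enumerator names).
enum class DistributionMethod {
  Cyclic = 0,
  CyclicNumProcsGeNumIters = 1,
  CyclicNumProcsEqNumIters = 2,
  None = 3
};

// Hypothetical helper: lists the iterations of
// `for (i = lb; i < ub; i += step)` that processor `procId` out of `nprocs`
// would execute under `method`.
std::vector<int64_t> iterationsFor(DistributionMethod method, int64_t lb,
                                   int64_t ub, int64_t step, int64_t procId,
                                   int64_t nprocs) {
  assert(step > 0 && nprocs > 0 && procId >= 0 && procId < nprocs);
  std::vector<int64_t> result;
  const int64_t first = lb + procId * step;
  switch (method) {
  case DistributionMethod::Cyclic:
    // General case: each processor still loops, with a widened step.
    for (int64_t i = first; i < ub; i += nprocs * step)
      result.push_back(i);
    break;
  case DistributionMethod::CyclicNumProcsGeNumIters:
    // At most one iteration per processor, guarded by a bounds check.
    if (first < ub)
      result.push_back(first);
    break;
  case DistributionMethod::CyclicNumProcsEqNumIters:
    // Exactly one iteration per processor, no bounds check needed.
    result.push_back(first);
    break;
  case DistributionMethod::None:
    // Modelled here as an undistributed loop: the full range is returned.
    for (int64_t i = lb; i < ub; i += step)
      result.push_back(i);
    break;
  }
  return result;
}
```

With bounds `[0, 10)`, step 1 and 4 processors, processor 1 would take iterations 1, 5 and 9 under `Cyclic`.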
 

Functions

void populateLinalgToStandardConversionPatterns (RewritePatternSet &patterns)
 Populate the given list with patterns that convert from Linalg to Standard. More...
 
std::string generateLibraryCallName (Operation *op)
 Returns the mangled library call name to disambiguate between different overloads at the C level. More...
 
SmallVector< AffineExpr, 4 > makeAffineDimExprs (unsigned num, unsigned &startIdx, MLIRContext *context)
 Returns num AffineDimExpr dimensions at positions [startIdx, startIdx + num) and increments startIdx to startIdx + num. More...
 
AffineMap extractOrIdentityMap (std::optional< AffineMap > maybeMap, unsigned rank, MLIRContext *context)
 Returns maybeMap.get() if maybeMap is set, otherwise returns the symbol-less identity map of rank. More...
 
SmallVector< AffineExpr, 4 > concat (ArrayRef< AffineExpr > a, ArrayRef< AffineExpr > b)
 Return the vector that is the concatenation of a and b. More...
 
bool isaContractionOpInterface (LinalgOp linalgOp)
 Checks whether linalgOp conforms to ContractionOpInterface. More...
 
void registerValueBoundsOpInterfaceExternalModels (DialectRegistry &registry)
 
void registerTransformDialectExtension (DialectRegistry &registry)
 
void registerBufferizableOpInterfaceExternalModels (DialectRegistry &registry)
 
void hoistRedundantVectorTransfers (func::FuncOp func)
 Hoist vector.transfer_read/vector.transfer_write pairs on buffers out of the immediately enclosing scf::ForOp iteratively, if the following conditions are true: More...
 
scf::ForOp hoistRedundantSubsetExtractInsert (RewriterBase &rewriter, scf::ForOp forOp)
 Greedily hoist redundant subset extract/insert operations on tensors outside of forOp. More...
 
void hoistRedundantVectorTransfersOnTensor (func::FuncOp func)
 Call into hoistRedundantSubsetExtractInsert without a RewriterBase. More...
 
void registerTilingInterfaceExternalModels (DialectRegistry &registry)
 
std::optional< vector::CombiningKind > getCombinerOpKind (Operation *combinerOp)
 Return vector::CombiningKind for the given op. More...
 
bool areElementwiseOpsFusable (OpOperand *fusedOperand)
 Return true if two linalg.generic operations with producer/consumer relationship through fusedOperand can be fused using elementwise op fusion. More...
 
LogicalResult promoteSubviewsPrecondition (Operation *op, LinalgPromotionOptions options)
 Promote memref.subviews feeding linalg-on-buffers operations. More...
 
LogicalResult vectorizeOpPrecondition (Operation *op, ArrayRef< int64_t > inputVectorSizes={}, bool vectorizeNDExtract=false)
 Return success if the operation can be vectorized. More...
 
Value bufferizeToAllocation (RewriterBase &rewriter, tensor::PadOp padOp, Attribute memorySpace={})
 Materialize a buffer allocation for the given tensor.pad op and lower the op to linalg.fill/linalg.generic + memref.tensor_store. More...
 
Value bufferizeToAllocation (RewriterBase &rewriter, Value value, Attribute memorySpace={})
 Materialize a buffer allocation for the given tensor value. More...
 
FailureOr< ElementwiseOpFusionResult > fuseElementwiseOps (RewriterBase &rewriter, OpOperand *fusedOperand)
 
SmallVector< Value > peelLoop (RewriterBase &rewriter, Operation *op)
 Try to peel and canonicalize loop op and return the new result. More...
 
void peelLoops (RewriterBase &rewriter, ArrayRef< scf::ForOp > loops)
 Peel loops and apply affine_min/max bounds simplification on the fly where relevant. More...
 
FailureOr< SmallVector< Value > > rewriteAsPaddedOp (RewriterBase &rewriter, LinalgOp opToPad, ArrayRef< int64_t > paddingDimensions, ArrayRef< int64_t > padToMultipleOf, ArrayRef< Attribute > paddingValues, ArrayRef< bool > packPaddings, LinalgOp &paddedOp)
 Pad the iterator dimensions paddingDimensions of all opToPad operands to a static bounding box. More...
 
FailureOr< Value > hoistPaddingOnTensors (RewriterBase &rewriter, tensor::PadOp opToHoist, int64_t numLoops, ArrayRef< int64_t > transposeVector, tensor::PadOp &hoistedOp, SmallVectorImpl< GenericOp > &transposeOps)
 Mechanically hoist padding operations on tensors by numLoops into a new, generally larger tensor. More...
 
FailureOr< Value > hoistPaddingOnTensors (tensor::PadOp opToHoist, int64_t numLoops, ArrayRef< int64_t > transposeVector, tensor::PadOp &hoistedOp, SmallVectorImpl< GenericOp > &transposeOps)
 Calls into hoistPaddingOnTensors with a local IRRewriter. More...
 
FailureOr< LinalgOp > padAndHoistLinalgOp (RewriterBase &rewriter, LinalgOp linalgOp, LinalgPaddingOptions options)
 Apply padding and hoisting to linalgOp according to the configuration specified in options. More...
 
std::pair< TilingInterface, TilingInterface > splitOp (RewriterBase &rewriter, TilingInterface op, unsigned dimension, OpFoldResult splitPoint)
 Split the given op into two parts along the given iteration space dimension at the specified splitPoint, and return the two parts. More...
 
FailureOr< TiledLinalgOp > tileLinalgOp (RewriterBase &b, LinalgOp op, const LinalgTilingOptions &options)
 
FailureOr< GenericOp > interchangeGenericOp (RewriterBase &rewriter, GenericOp genericOp, ArrayRef< unsigned > interchangeVector)
 Interchange the iterator_types and iterator_maps dimensions and adapt the index accesses of op. More...
 
FailureOr< GenericOp > generalizeNamedOp (RewriterBase &rewriter, LinalgOp namedOp)
 Create a GenericOp from the given named operation namedOp and replace namedOp. More...
 
FailureOr< PromotionInfo > promoteSubviewAsNewBuffer (OpBuilder &b, Location loc, memref::SubViewOp subView, const AllocBufferCallbackFn &allocationFn, DataLayout &layout)
 
FailureOr< LinalgOp > promoteSubViews (OpBuilder &b, LinalgOp op, const LinalgPromotionOptions &options)
 Promote the subViews into a new buffer allocated at the insertion point b. More...
 
std::optional< Value > allocateWorkgroupMemory (OpBuilder &builder, memref::SubViewOp subview, ArrayRef< Value > sizeBounds, DataLayout &)
 Allocate the subview in the GPU workgroup memory. More...
 
LogicalResult deallocateWorkgroupMemory (OpBuilder &, Value)
 In case of GPU group memory there is no need to deallocate. More...
 
LogicalResult copyToWorkgroupMemory (OpBuilder &b, Value src, Value dst)
 Create Memref copy operations and add gpu barrier guards before and after the copy operation to ensure data integrity. More...
 
std::optional< Value > allocateGPUPrivateMemory (OpBuilder &builder, memref::SubViewOp subview, ArrayRef< Value > sizeBounds, DataLayout &)
 Allocate the subview in the GPU private memory. More...
 
LogicalResult copyToGPUPrivateMemory (OpBuilder &b, Value src, Value dst)
 Normal copy between src and dst. More...
 
LogicalResult deallocateGPUPrivateMemory (OpBuilder &, Value)
 In case of GPU private memory there is no need to deallocate since the memory is freed when going outside of the scope. More...
 
LogicalResult vectorize (RewriterBase &rewriter, Operation *op, ArrayRef< int64_t > inputVectorSizes={}, bool vectorizeNDExtract=false, bool lastVectorSizeScalable=false)
 Emit a suitable vector form for an operation. More...
 
LogicalResult vectorizeCopy (RewriterBase &builder, memref::CopyOp copyOp)
 Emit a suitable vector form for a Copy op with fully static shape. More...
 
FailureOr< LinalgLoops > linalgOpToLoops (RewriterBase &rewriter, LinalgOp linalgOp)
 Emit a loop nest of scf.for with the proper body for linalgOp. More...
 
FailureOr< LinalgLoops > linalgOpToParallelLoops (RewriterBase &rewriter, LinalgOp linalgOp)
 Emit a loop nest of scf.parallel with the proper body for linalgOp. More...
 
FailureOr< LinalgLoops > linalgOpToAffineLoops (RewriterBase &rewriter, LinalgOp linalgOp)
 Emit a loop nest of affine.for with the proper body for linalgOp. More...
 
std::tuple< SmallVector< Range, 4 >, LoopIndexToRangeIndexMap > makeTiledLoopRanges (RewriterBase &b, Location loc, AffineMap map, ArrayRef< OpFoldResult > allShapeSizes, ArrayRef< OpFoldResult > allTileSizes)
 
FailureOr< MultiSizeSpecification > computeMultiTileSizes (OpBuilder &builder, LinalgOp op, unsigned dimension, OpFoldResult targetSize, OpFoldResult divisor, bool emitAssertions=true)
 Emits the IR computing the multi-sized tiling specification with two tile sizes not exceeding targetSize, each divisible by divisor, such that there exist numbers of tiles with these sizes that fully cover the given iteration space dimension of the structured op. More...
 
FailureOr< StaticMultiSizeSpecification > computeStaticMultiTileSizes (LinalgOp op, unsigned dimension, int64_t targetSize, int64_t divisor)
 
FailureOr< ForallTilingResult > tileToForallOp (RewriterBase &builder, TilingInterface op, ArrayRef< OpFoldResult > numThreads, std::optional< ArrayAttr > mapping)
 
FailureOr< ForallTilingResult > tileToForallOpUsingTileSizes (RewriterBase &builder, TilingInterface op, ArrayRef< OpFoldResult > tileSizes, std::optional< ArrayAttr > mapping)
 Same as tileToForallOp, but calculate the number of threads required using the given tileSizes. More...
 
FailureOr< ForallReductionTilingResult > tileReductionUsingForall (RewriterBase &b, PartialReductionOpInterface op, ArrayRef< OpFoldResult > numThreads, ArrayRef< OpFoldResult > tileSizes={}, std::optional< ArrayAttr > mapping=std::nullopt)
 Method to tile a reduction to parallel iterations computing partial reductions. More...
 
void transformIndexOps (RewriterBase &b, LinalgOp op, SmallVectorImpl< Value > &ivs, const LoopIndexToRangeIndexMap &loopIndexToRangeIndex)
 All indices returned by IndexOp should be invariant with respect to tiling. More...
 
FailureOr< SplitReductionResult > splitReduction (RewriterBase &b, LinalgOp op, const ControlSplitReductionFn &controlSplitReductionFn, bool useAlloc=false)
 
FailureOr< SplitReductionResult > splitReductionByScaling (RewriterBase &b, LinalgOp op, const ControlSplitReductionFn &controlSplitReductionFn, bool useAlloc=false)
 Scaling-based implementation of the split reduction transformation. More...
 
bool isDimSequencePreserved (AffineMap map, ReassociationIndicesRef dimSequence)
 Return true if a given sequence of dimensions are contiguous in the range of the specified indexing map. More...
 
bool areDimSequencesPreserved (ArrayRef< AffineMap > maps, ArrayRef< ReassociationIndices > dimSequences)
 Return true if all sequences of dimensions specified in dimSequences are contiguous in all the ranges of the maps. More...
 
FailureOr< SmallVector< Value > > collapseGenericOpIterationDims (GenericOp genericOp, ArrayRef< ReassociationIndices > foldedIterationDims, RewriterBase &rewriter)
 Collapses dimensions of linalg.generic operation. More...
 
FailureOr< LowerPackResult > lowerPack (RewriterBase &rewriter, tensor::PackOp packOp)
 Rewrite pack as pad + reshape + transpose. More...
 
FailureOr< LowerUnPackOpResult > lowerUnPack (RewriterBase &rewriter, tensor::UnPackOp unPackOp)
 Rewrite unpack as empty + transpose + reshape + extract_slice. More...
 
FailureOr< PackResult > pack (RewriterBase &rewriter, linalg::LinalgOp linalgOp, ArrayRef< OpFoldResult > packedSizes)
 Implement packing of a single LinalgOp by packedSizes. More...
 
FailureOr< PackTransposeResult > packTranspose (RewriterBase &rewriter, tensor::PackOp packOp, linalg::LinalgOp linalgOp, tensor::UnPackOp maybeUnPackOp, ArrayRef< int64_t > outerPerm, ArrayRef< int64_t > innerPerm)
 Transpose a single PackOp -> LinalgOp -> UnPackOp chain and return the transposed PackOp -> LinalgOp -> UnPackOp chain after replacements. More...
 
FailureOr< Operation * > rewriteInDestinationPassingStyle (RewriterBase &rewriter, tensor::FromElementsOp fromElementsOp)
 Rewrite tensor.from_elements to linalg.generic. More...
 
FailureOr< Operation * > rewriteInDestinationPassingStyle (RewriterBase &rewriter, tensor::GenerateOp generateOp)
 Rewrite tensor.generate to linalg.generic. More...
 
FailureOr< Operation * > rewriteInDestinationPassingStyle (RewriterBase &rewriter, tensor::PadOp padOp)
 Rewrite tensor.pad to linalg.generic + tensor.insert_slice. More...
 
FailureOr< std::pair< Operation *, Operation * > > rewriteInIm2Col (RewriterBase &rewriter, linalg::Conv2DNhwcHwcfOp convOp)
 Convert linalg.conv_2d_nhwc_hwcf into linalg.generic (for img2col packing) and linalg.matmul. More...
 
FailureOr< std::pair< Operation *, Operation * > > rewriteInIm2Col (RewriterBase &rewriter, linalg::DepthwiseConv2DNhwcHwcOp convOp)
 Similar to rewriteInIm2Col with linalg::Conv2DNhwcHwcfOp, except that there is no reduction among the input channels, so each convolution can be a matrix-vector product; by transposing both input and filter so that channels are outermost, the computation becomes a batched matrix-vector product. More...
 
FailureOr< std::pair< Operation *, Operation * > > rewriteInIm2Col (RewriterBase &rewriter, linalg::Conv2DNchwFchwOp convOp)
 Similar to rewriteInIm2Col with linalg::Conv2DNhwcHwcfOp, except that, because the channels are to the left of the image shape dimensions, the position of the contraction dimension in the resulting matmul is reversed. More...
 
RewritePatternSet getLinalgTilingCanonicalizationPatterns (MLIRContext *ctx)
 Canonicalization patterns relevant to apply after tiling patterns. More...
 
void populateLinalgTilingCanonicalizationPatterns (RewritePatternSet &patterns)
 
void populateLinalgNamedOpsGeneralizationPatterns (RewritePatternSet &patterns)
 Linalg generalization patterns. More...
 
void populateDecomposeConvolutionPatterns (RewritePatternSet &patterns, PatternBenefit benefit=1)
 Linalg decompose convolutions patterns. More...
 
void populateConvertConv2DToImg2ColPatterns (RewritePatternSet &patterns)
 Populates patterns to transform linalg.conv_2d_xxx operations into linalg.generic (for img2col packing) and linalg.matmul. More...
 
void populatePadOpVectorizationPatterns (RewritePatternSet &patterns, PatternBenefit baseBenefit=1)
 Populates patterns with patterns that vectorize tensor.pad. More...
 
void populateExtractOpVectorizationPatterns (RewritePatternSet &patterns, PatternBenefit baseBenefit=1)
 
void populateDecomposeLinalgOpsPattern (RewritePatternSet &patterns, bool removeDeadArgsAndResults=true)
 Populate patterns for splitting a LinalgOp with multiple statements within its payload into multiple GenericOp that have a single statement. More...
 
void populateConvertToDestinationStylePatterns (RewritePatternSet &patterns)
 Populate patterns that convert non-destination-style ops to destination style ops. More...
 
void populateConvolutionVectorizationPatterns (RewritePatternSet &patterns, PatternBenefit benefit=1)
 Populate patterns for vectorizing low-D convolution ops. More...
 
void populateElementwiseToLinalgConversionPatterns (RewritePatternSet &patterns)
 Populate patterns that convert ElementwiseMappable ops to linalg parallel loops. More...
 
void populateSparseTensorRewriting (RewritePatternSet &patterns)
 Populate patterns that are only useful in the context of sparse tensors. More...
 
void populateElementwiseOpsFusionPatterns (RewritePatternSet &patterns, const ControlFusionFn &controlElementwiseOpFusion)
 Patterns for fusing linalg operation on tensors. More...
 
void populateDataLayoutPropagationPatterns (RewritePatternSet &patterns, const ControlPropagationFn &controlPackUnPackPropagation)
 Patterns to bubble up or down data layout ops across other operations. More...
 
void populateEraseUnusedOperandsAndResultsPatterns (RewritePatternSet &patterns)
 Pattern to remove dead operands and results of linalg.generic operations. More...
 
void populateEraseUnnecessaryInputsPatterns (RewritePatternSet &patterns)
 Patterns to promote inputs to outputs and remove unused inputs of linalg.generic ops. More...
 
void populateCollapseDimensions (RewritePatternSet &patterns, const GetCollapsableDimensionsFn &controlCollapseDimensions)
 Pattern to collapse dimensions in a linalg.generic op. More...
 
void populateFoldReshapeOpsByExpansionPatterns (RewritePatternSet &patterns, const ControlFusionFn &controlFoldingReshapes)
 Patterns to fold an expanding (collapsing) tensor_reshape operation with its producer (consumer) generic operation by expanding the dimensionality of the loop in the generic op. More...
 
void populateFoldReshapeOpsByCollapsingPatterns (RewritePatternSet &patterns, const ControlFusionFn &controlFoldingReshapes)
 Patterns to fold an expanding tensor.expand_shape operation with its producer generic operation by collapsing the dimensions of the generic op. More...
 
void populateConstantFoldLinalgOperations (RewritePatternSet &patterns, const ControlFusionFn &controlFn)
 Patterns to constant fold Linalg operations. More...
 
void populateFuseTensorPadWithProducerLinalgOpPatterns (RewritePatternSet &patterns)
 Pattern to fuse a tensor.pad operation with the producer of its source, if the producer is a linalg operation with all parallel iterator types. More...
 
void populateLinalgNamedOpConversionPatterns (RewritePatternSet &patterns)
 Patterns to convert from one named op to another. More...
 
void populateFoldUnitExtentDimsViaReshapesPatterns (RewritePatternSet &patterns)
 Patterns to fold unit-extent dimensions in operands/results of linalg ops on tensors via reassociative reshape ops. More...
 
void populateFoldUnitExtentDimsViaSlicesPatterns (RewritePatternSet &patterns)
 Patterns to fold unit-extent dimensions in operands/results of linalg ops on tensors via rank-reducing slices. More...
 
void populateMoveInitOperandsToInputPattern (RewritePatternSet &patterns)
 A pattern that converts init operands to input operands. More...
 
void populateInlineConstantOperandsPatterns (RewritePatternSet &patterns)
 Patterns that are used to inline constant operands into linalg generic ops. More...
 
void populateBubbleUpExtractSliceOpPatterns (RewritePatternSet &patterns)
 Patterns that are used to bubble up extract slice op above linalg op. More...
 
void populateSwapExtractSliceWithFillPatterns (RewritePatternSet &patterns)
 Adds patterns that swap tensor.extract_slice(linalg.fill(cst, init)) into linalg.fill(cst, tensor.extract_slice(init)). More...
 
void populateSplitReductionPattern (RewritePatternSet &patterns, const ControlSplitReductionFn &controlSplitReductionFn, bool useAlloc=false)
 Patterns to apply splitReduction below. More...
 
Value createOrFoldDimOp (OpBuilder &b, Location loc, Value val, int64_t dim)
 Create one memref::DimOp or tensor::DimOp depending on the type of val. More...
 
OpFoldResult createFoldedDimOp (OpBuilder &b, Location loc, Value val, int64_t dim)
 Create one memref::DimOp or tensor::DimOp depending on the type of val. More...
 
SmallVector< Value > createDynamicDimensions (OpBuilder &b, Location loc, Value val)
 Build the list of DimOp for the dynamic dimensions of val. More...
 
SmallVector< OpFoldResult > getMixedDimensions (OpBuilder &b, Location loc, Value val)
 Build the list of all dimensions for val, mixing static attributes and dynamic values where appropriate. More...
 
DenseSet< int64_t > findPermutationsIndexingOperand (LinalgOp linalgOp, OpOperand *opOperand, utils::IteratorType iter)
 Given a linalgOp and one of its opOperand, returns the positions of the iterators of type iter that index the opOperand as a permutation. More...
 
bool containsMostMinorMatmul (linalg::LinalgOp linalgOp)
 Return true if linalgOp contains an embedded matmul subcomputation in its most minor dimensions. More...
 
FailureOr< EmbeddedMatmulDimsCandidates > inferMatmulDims (linalg::LinalgOp linalgOp)
 Find 2 parallel (m and n) and 1 reduction (k) dimension candidates that form a matmul subcomputation within linalgOp. More...
 
bool allIndexingsAreProjectedPermutation (LinalgOp op)
 Check if all indexing maps are projected permutations. More...
 
bool hasOnlyScalarElementwiseOp (Region &r)
 Detect whether r has only ConstantOp, ElementwiseMappable and YieldOp. More...
 
bool isElementwise (LinalgOp op)
 Check if a LinalgOp is an element-wise operation. More...
 
bool isParallelIterator (utils::IteratorType iteratorType)
 Check if iterator type has "parallel" semantics. More...
 
bool isReductionIterator (utils::IteratorType iteratorType)
 Check if iterator type has "reduction" semantics. More...
 
Value makeComposedPadHighOp (OpBuilder &b, Location loc, RankedTensorType type, Value source, Value pad, bool nofold)
 Create a tensor::PadOp that pads source to the size of the statically sized type whose static sizes are assumed to be greater than the dynamic source size. More...
 
GenericOp makeTransposeOp (OpBuilder &b, Location loc, Value inputTensor, Value outputTensor, ArrayRef< int64_t > transposeVector)
 Returns a GenericOp that transposes inputTensor into outputTensor using transposeVector to permute the inputTensor dimensions. More...
 
GenericOp makeMemRefCopyOp (OpBuilder &b, Location loc, Value from, Value to)
 Returns GenericOp that copies an n-D memref. More...
 
std::optional< SmallVector< ReassociationIndices > > getReassociationMapForFoldingUnitDims (ArrayRef< OpFoldResult > mixedSizes)
 Get the reassociation maps to fold the result of an extract_slice (or source of an insert_slice) operation with given offsets and sizes to its rank-reduced version. More...
 
std::optional< TypedAttr > getNeutralElement (Operation *op)
 Return the identity numeric value associated with the given op. More...
 
SmallVector< OpFoldResult > computeTileOffsets (OpBuilder &b, Location loc, ArrayRef< OpFoldResult > ivs, ArrayRef< OpFoldResult > tileSizes)
 Computes tile offsets, given a list of loop ivs and tileSizes. More...
 
SmallVector< OpFoldResult > computeTileSizes (OpBuilder &b, Location loc, ArrayRef< OpFoldResult > tileSizes, ArrayRef< OpFoldResult > sizeBounds)
 Computes tile sizes, given a list of tileSizes and dimension sizes (sizeBounds). More...
 
SmallVector< Type > getTensorOutputTypes (LinalgOp op, ValueRange operands)
 Returns the list of tensor output types produced when the given structured operation op is applied to the given operands. More...
 
SmallVector< Value > insertSlicesBack (OpBuilder &builder, Location loc, LinalgOp op, ValueRange operands, ValueRange results)
 Creates insert_slice ops that insert results back into larger tensors they were originally extracted from with extract_slice before being passed as operands to the given structured operation op or its clone. More...
 
SliceParameters computeSliceParameters (OpBuilder &builder, Location loc, Value valueToTile, ArrayRef< OpFoldResult > tileSizes, AffineMap map, ArrayRef< OpFoldResult > lbs, ArrayRef< OpFoldResult > ubs, ArrayRef< OpFoldResult > subShapeSizes, bool omitPartialTileCheck)
 Computes SliceParameters for a single valueToTile assuming that its user is being tiled with the given loop bounds lbs and ubs and the tile sizes tileSizes. More...
 
SmallVector< std::optional< SliceParameters > > computeAllSliceParameters (OpBuilder &builder, Location loc, LinalgOp linalgOp, ValueRange valuesToTile, ArrayRef< OpFoldResult > ivs, ArrayRef< OpFoldResult > tileSizes, ArrayRef< OpFoldResult > sizeBounds, bool omitPartialTileCheck)
 Computes SliceParameters for all valuesToTile of the given linalgOp, assuming linalgOp is being fused into a loop nest. More...
 
Value makeTiledShape (OpBuilder &builder, Location loc, Value valueToTile, ArrayRef< OpFoldResult > tileSizes, AffineMap map, ArrayRef< OpFoldResult > lbs, ArrayRef< OpFoldResult > ubs, ArrayRef< OpFoldResult > subShapeSizes, bool omitPartialTileCheck)
 Creates an extract_slice/subview op for a single valueToTile with builder. More...
 
SmallVector< Value > makeTiledShapes (OpBuilder &builder, Location loc, LinalgOp linalgOp, ValueRange valuesToTile, ArrayRef< OpFoldResult > ivs, ArrayRef< OpFoldResult > tileSizes, ArrayRef< OpFoldResult > sizeBounds, bool omitPartialTileCheck)
 Creates extract_slice/subview ops for all valuesToTile of the given linalgOp with builder, assuming linalgOp is being fused into a loop nest for tiling with the given induction variables ivs and tile sizes tileSizes. More...
 
void offsetIndices (OpBuilder &b, LinalgOp linalgOp, ArrayRef< OpFoldResult > offests)
 Add the specified offsets to any linalg.index ops contained in the given linalgOp. More...
 
void offsetIndices (RewriterBase &b, LinalgOp linalgOp, ArrayRef< OpFoldResult > offests)
 
FailureOr< FusionInfo > fuseProducerOfTensor (OpBuilder &b, OpOperand &consumerOpOperand)
 Tensor counterpart of fuseProducerOfBuffer. More...
 
FailureOr< FusionInfo > fuseProducerOfTensor (OpBuilder &b, OpResult producerOpResult, OpOperand &consumerOpOperand)
 Tensor counterpart of fuseProducerOfBuffer. More...
 
void updateBoundsForCyclicDistribution (OpBuilder &builder, Location loc, Value procId, Value nprocs, Value &lb, Value &ub, Value &step)
 Update the lb, ub and step to get per processor lb, ub and step. More...
 
template<typename OpTy >
SmallVector< NamedAttribute > getPrunedAttributeList (OpTy op)
 Returns an attribute list that excludes pre-defined attributes. More...
 
static bool hasAllOneValues (DenseIntElementsAttr attr)
 
static Value createAdd (Location loc, Value x, Value y, OpBuilder &builder)
 
static Value createMul (Location loc, Value x, Value y, Type accType, OpBuilder &builder)
 
static SmallVector< Value > unrollIndex (OpBuilder &b, Location loc, Value index, ArrayRef< int64_t > factors)
 
static Value getConvolvedIndex (OpBuilder &b, Location loc, Value oIndex, Value fIndex, int64_t stride)
 
static void generateParallelLoopNest (OpBuilder &b, Location loc, ValueRange lbs, ValueRange ubs, ValueRange steps, ArrayRef< utils::IteratorType > iteratorTypes, ArrayRef< linalg::ProcInfo > procInfo, function_ref< void(OpBuilder &, Location, ValueRange)> bodyBuilderFn, SmallVectorImpl< Value > &ivStorage)
 Generates a loop nest consisting of scf.parallel and scf.for, depending on the iteratorTypes. More...
 
static Value materializeTiledShape (OpBuilder &builder, Location loc, Value valueToTile, const SliceParameters &sliceParams)
 

Typedef Documentation

◆ AllocBufferCallbackFn

using mlir::linalg::AllocBufferCallbackFn = std::function<std::optional<Value>( OpBuilder &b, memref::SubViewOp subView, ArrayRef<Value> boundingSubViewSize, DataLayout &layout)>

Callback function type used to perform the allocation for the promoted subView.

In boundingSubViewSize, a best attempt is made to find the smallest constant value for the size of the buffer needed for each dimension. If that is not possible, it contains the dynamic size of the subview. The callback should return the buffer to use.

Definition at line 185 of file Transforms.h.

◆ ControlFusionFn

using mlir::linalg::ControlFusionFn = typedef std::function<bool(OpOperand *fusedOperand)>

Function type which is used to control when to stop fusion.

It is expected that OpOperand is not modified in the callback. The OpOperand is not marked as const to allow callers to use non-const methods.

Definition at line 1359 of file Transforms.h.

◆ ControlPropagationFn

using mlir::linalg::ControlPropagationFn = typedef std::function<bool(Operation *op)>

Function type which is used to control propagation of tensor.pack/unpack ops.

Definition at line 1371 of file Transforms.h.

◆ ControlSplitReductionFn

using mlir::linalg::ControlSplitReductionFn = typedef std::function<SplitReductionOptions(LinalgOp op)>

Function signature to control reduction splitting.

This returns SplitReductionOptions.

Definition at line 282 of file Transforms.h.

◆ CopyCallbackFn

using mlir::linalg::CopyCallbackFn = typedef std::function<LogicalResult(OpBuilder &b, Value src, Value dst)>

Callback function type used to insert a copy from the original subview to the subview of the promoted region (for read operands), or from the subview of the promoted region back to the original subview (for results).

The copy has to happen from src to dst.

Definition at line 198 of file Transforms.h.

◆ DeallocBufferCallbackFn

using mlir::linalg::DeallocBufferCallbackFn = typedef std::function<LogicalResult(OpBuilder &b, Value buffer)>

Callback function type used to deallocate the buffers used to hold the promoted subview.

Definition at line 191 of file Transforms.h.

◆ GetCollapsableDimensionsFn

using mlir::linalg::GetCollapsableDimensionsFn = typedef std::function<SmallVector<ReassociationIndices>(linalg::GenericOp)>

Function type to control generic op dimension collapsing.

It is expected to return an array of ReassociationIndices representing dimensions that should be merged.

Definition at line 1389 of file Transforms.h.

◆ LinalgLoops

Definition at line 308 of file Transforms.h.

◆ LoopIndexToRangeIndexMap

Creates a number of ranges equal to the number of non-zero entries in tileSizes.

One for each loop of the LinalgOp that is tiled. The tileSizes argument has one entry per surrounding loop. It uses zero as the convention that a particular loop is not tiled. This convention simplifies implementations by avoiding affine map manipulations. The returned ranges correspond to the loop ranges, in the proper order, that are tiled and for which new loops will be created. Also the function returns a map from loop indices of the LinalgOp to the corresponding non-empty range indices of newly created loops.

Definition at line 622 of file Transforms.h.
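
The zero-means-untiled convention described above can be illustrated with a small standalone sketch. It works on plain integers rather than MLIR values, and the function name loopIndexToRangeIndex is hypothetical; it only mirrors the mapping from tiled loop indices to range indices and is not the MLIR implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical helper: maps each tiled loop index (non-zero tile size) to the
// position of the range that would be created for it.
std::map<unsigned, unsigned>
loopIndexToRangeIndex(const std::vector<int64_t> &tileSizes) {
  std::map<unsigned, unsigned> result;
  unsigned nextRange = 0;
  for (unsigned loop = 0; loop < tileSizes.size(); ++loop) {
    if (tileSizes[loop] == 0)
      continue; // Zero means "this loop is not tiled": no range, no map entry.
    result[loop] = nextRange++;
  }
  return result;
}
```

For tile sizes {4, 0, 8}, loops 0 and 2 are tiled, so two ranges are created and loop 1 has no entry in the map.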

◆ OptimizeCopyFn

using mlir::linalg::OptimizeCopyFn = typedef std::function<LogicalResult(RewriterBase &, tensor::PadOp, Value)>

Definition at line 1162 of file Transforms.h.

◆ ProcInfoCallBackFn

using mlir::linalg::ProcInfoCallBackFn = typedef std::function<SmallVector<ProcInfo>( OpBuilder &b, Location loc, ArrayRef<Range> parallelLoopRanges)>

Definition at line 339 of file Utils.h.

◆ TileSizeComputationFunction

Definition at line 45 of file Transforms.h.

Enumeration Type Documentation

◆ DistributionMethod

Scheme used to distribute loops to processors.

Enumerator
Cyclic 

Cyclic distribution where no assumption is made about the dynamic relationship between number of processors and number of iterations of the distributed loop.

Distributes the following loop

scf.parallel (iv) = (lb) to (ub) step (step)

to

scf.parallel (iv) = (lb + procId * step) to (ub) step (step * nprocs)

CyclicNumProcsGeNumIters 

Cyclic distribution where the number of processors can be assumed to be more than or equal to the number of iterations of the distributed loop.

In such cases, a simple in-bounds check is enough (instead of materializing a loop). Distributes the following loop

scf.parallel (iv) = (lb) to (ub) step (step)

to

iv = lb + procId * step
cond = arith.cmpi "slt", iv, ub
scf.if cond { ... }

CyclicNumProcsEqNumIters 

Cyclic distribution where the number of processors can be assumed to be equal to the number of iterations of the distributed loop.

In such cases, no bounds check is needed. Distributes the following loop

scf.parallel (iv) = (lb) to (ub) step (step)

to

iv = lb + procId * step

None 

No Distribution.

Definition at line 284 of file Utils.h.
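
The index arithmetic of the Cyclic and CyclicNumProcsGeNumIters schemes can be checked with a standalone sketch on plain integers. The function names below are hypothetical and the sketch assumes step > 0 and nprocs > 0; it models only the loop bounds shown above, not the IR that MLIR emits.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Iterations executed by one processor under the Cyclic scheme:
//   (lb + procId * step) to (ub) step (step * nprocs)
std::vector<int64_t> cyclicIterations(int64_t lb, int64_t ub, int64_t step,
                                      int64_t procId, int64_t nprocs) {
  std::vector<int64_t> ivs;
  for (int64_t iv = lb + procId * step; iv < ub; iv += step * nprocs)
    ivs.push_back(iv);
  return ivs;
}

// CyclicNumProcsGeNumIters: at most one iteration per processor, guarded by
// an in-bounds check instead of a loop.
std::vector<int64_t> guardedIteration(int64_t lb, int64_t ub, int64_t step,
                                      int64_t procId) {
  int64_t iv = lb + procId * step;
  if (iv < ub)
    return {iv};
  return {};
}
```

With lb = 0, ub = 10, step = 2 and two processors, processor 0 runs iterations 0, 4, 8 and processor 1 runs 2, 6, which together cover the original loop exactly once.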

◆ LinalgTilingLoopType

The type of loops to be generated during tiling.

Enumerator
Loops 
AffineLoops 
ParallelLoops 

Definition at line 142 of file Utils.h.

Function Documentation

◆ allIndexingsAreProjectedPermutation()

bool mlir::linalg::allIndexingsAreProjectedPermutation ( LinalgOp  op)

Check if all indexing maps are projected permutations.

Definition at line 228 of file Utils.cpp.

Referenced by vectorizeLinalgOpPrecondition().

◆ allocateGPUPrivateMemory()

std::optional< Value > mlir::linalg::allocateGPUPrivateMemory ( OpBuilder &  builder,
memref::SubViewOp  subview,
ArrayRef< Value >  sizeBounds,
DataLayout &  
)

Allocate the subview in the GPU private memory.

Definition at line 469 of file Promotion.cpp.

References allocateSubviewGPUMemoryInAddressSpace().

◆ allocateWorkgroupMemory()

std::optional< Value > mlir::linalg::allocateWorkgroupMemory ( OpBuilder &  builder,
memref::SubViewOp  subview,
ArrayRef< Value >  sizeBounds,
DataLayout &  
)

Allocate the subview in the GPU workgroup memory.

Definition at line 444 of file Promotion.cpp.

References allocateSubviewGPUMemoryInAddressSpace().

◆ areDimSequencesPreserved()

bool mlir::linalg::areDimSequencesPreserved ( ArrayRef< AffineMap >  maps,
ArrayRef< ReassociationIndices >  dimSequences 
)

Return true if all sequences of dimensions specified in dimSequences are contiguous in all the ranges of the maps.

Definition at line 1049 of file ElementwiseOpFusion.cpp.

References isDimSequencePreserved().

◆ areElementwiseOpsFusable()

bool mlir::linalg::areElementwiseOpsFusable ( OpOperand fusedOperand)

Return true if two linalg.generic operations with producer/consumer relationship through fusedOperand can be fused using elementwise op fusion.

Conditions for elementwise fusion of generic operations.

Definition at line 75 of file ElementwiseOpFusion.cpp.

References mlir::IROperand< DerivedT, IRValueT >::get(), mlir::Value::getDefiningOp(), getIndexingMapOfProducerOperandsInCoordinatesOfFusedOp(), mlir::AffineMap::getNumResults(), mlir::detail::IROperandBase::getOwner(), mlir::Value::getType(), and mlir::AffineMap::isPermutation().

Referenced by fuseElementwiseOps().

◆ bufferizeToAllocation() [1/2]

Value mlir::linalg::bufferizeToAllocation ( RewriterBase rewriter,
tensor::PadOp  padOp,
Attribute  memorySpace = {} 
)

Materialize a buffer allocation for the given tensor.pad op and lower the op to linalg.fill/linalg.generic + memref.tensor_store.

E.g.:

%0 = tensor.pad low[l] high[h] t ...

is lowered to:

alloc = memref.alloc
linalg.fill ... outs(alloc)
subview = memref.subview alloc [l] [...] [1]
memref.tensor_store t, subview
%0 = bufferization.to_tensor alloc restrict writable

In addition to rewriting the IR as shown above, the result of the bufferization.to_tensor op is returned.

Referenced by bufferizeToAllocation().

◆ bufferizeToAllocation() [2/2]

Value mlir::linalg::bufferizeToAllocation ( RewriterBase rewriter,
Value  value,
Attribute  memorySpace = {} 
)

Materialize a buffer allocation for the given tensor value.

E.g.:

alloc = memref.alloc
memref.tensor_store value, alloc
%0 = bufferization.to_tensor alloc restrict writable

In case value is a tensor.pad result, the corresponding overload is used internally to produce a better bufferization.

Definition at line 331 of file ConvertToDestinationStyle.cpp.

References bufferizeToAllocation(), mlir::OpBuilder::create(), createAllocationForTensor(), mlir::Value::getDefiningOp(), mlir::Value::getLoc(), mlir::detail::IROperandBase::getOwner(), mlir::Value::getUses(), mlir::IROperand< DerivedT, IRValueT >::set(), mlir::OpBuilder::setInsertionPointAfter(), mlir::OpBuilder::setInsertionPointToStart(), and mlir::RewriterBase::updateRootInPlace().

◆ collapseGenericOpIterationDims()

FailureOr< SmallVector< Value > > mlir::linalg::collapseGenericOpIterationDims ( GenericOp  genericOp,
ArrayRef< ReassociationIndices >  foldedIterationDims,
RewriterBase rewriter 
)

Collapses dimensions of linalg.generic operation.

Implementation of fusion with reshape operation by collapsing dimensions.

A precondition to calling this method is that for each list in foldedIterationDims, the sequence of dimensions is contiguous in the domains of all indexing_maps of the genericOp. This can be checked using the areDimSequencesPreserved method. When valid, the method also collapses the operands of the op. Returns replacement values of the results of the original genericOp by inserting reshapes to get back values of compatible types.

Definition at line 1428 of file ElementwiseOpFusion.cpp.

References mlir::OpBuilder::create(), mlir::detail::enumerate(), mlir::failed(), mlir::failure(), mlir::Block::front(), generateCollapsedIndexingRegion(), mlir::Block::getArguments(), getCollapsedOpIteratorTypes(), getCollapsedOpOperand(), getOperandReassociation(), mlir::Value::getType(), mlir::getValueOrCreateConstantIndexOp(), mlir::m_ConstantInt(), mlir::matchPattern(), mlir::RewriterBase::mergeBlocks(), mlir::RewriterBase::notifyMatchFailure(), mlir::Range::offset, mlir::OpBuilder::setInsertionPoint(), mlir::Range::size, and mlir::Range::stride.

◆ computeAllSliceParameters()

SmallVector< std::optional< SliceParameters > > mlir::linalg::computeAllSliceParameters ( OpBuilder builder,
Location  loc,
LinalgOp  linalgOp,
ValueRange  valuesToTile,
ArrayRef< OpFoldResult >  ivs,
ArrayRef< OpFoldResult >  tileSizes,
ArrayRef< OpFoldResult >  sizeBounds,
bool  omitPartialTileCheck 
)

Computes SliceParameters for all valuesToTile of the given linalgOp, assuming linalgOp is being fused into a loop nest.

Calls computeSliceParameters for every individual value.

Note that a constant zero in tileSizes means no tiling at that implicit loop. The number of non-zero values in tileSizes should be equal to the number of values in ivs.

Some of the valuesToTile won't be affected by tiling. For these values, std::nullopt will be returned.

Definition at line 858 of file Utils.cpp.

References computeSliceParameters(), computeTileOffsets(), computeTileSizes(), and isTiled().

Referenced by makeTiledShapes().

◆ computeMultiTileSizes()

FailureOr< MultiSizeSpecification > mlir::linalg::computeMultiTileSizes ( OpBuilder builder,
LinalgOp  op,
unsigned  dimension,
OpFoldResult  targetSize,
OpFoldResult  divisor,
bool  emitAssertions = true 
)

Emits the IR computing the multi-sized tiling specification with two tile sizes not exceeding targetSize, each divisible by sizeDivisor, such that there exist numbers of tiles with these sizes that fully cover the given iteration space dimension of the structured op.

The computation is as follows:

b = originalTripCount floordiv sizeDivisor
t = (targetSize + sizeDivisor - 1) floordiv sizeDivisor
d = (b + t - 1) floordiv t
s = (b floordiv d) * sizeDivisor
v = b % d
u = d - v

where the tile sizes are s and s + sizeDivisor, and the numbers of the corresponding tiles are u and v, respectively. Alternatively,

s * u + (s + sizeDivisor) * v == original size, where s mod sizeDivisor = 0.

Expects all values to be positive. In some cases with the target tile size sufficiently close to the dimension shape and non-unit divisor, it is impossible to compute such sizes. If emitAssertions is set, also emit the assertion that size computation succeeded.

Returns the specification consisting of both tile values and the number of tiles of each size.

Definition at line 148 of file Tiling.cpp.
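
The computation above can be traced with a standalone integer sketch. The struct and function names are hypothetical, sizeDivisor stands for the divisor parameter, and returning an empty optional when the sizes do not cover the trip count is this sketch's choice for modelling the "impossible" case; the real function emits IR rather than computing constants.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical result type: two tile sizes and how many tiles of each.
struct MultiSizeSketch {
  int64_t lowTileSize;   // s
  int64_t highTileSize;  // s + sizeDivisor
  int64_t lowTripCount;  // u
  int64_t highTripCount; // v
};

std::optional<MultiSizeSketch>
multiTileSizes(int64_t tripCount, int64_t targetSize, int64_t sizeDivisor) {
  if (tripCount <= 0 || targetSize <= 0 || sizeDivisor <= 0)
    return std::nullopt; // All values are expected to be positive.
  int64_t b = tripCount / sizeDivisor;
  int64_t t = (targetSize + sizeDivisor - 1) / sizeDivisor;
  int64_t d = (b + t - 1) / t;
  if (d == 0)
    return std::nullopt; // tripCount is smaller than sizeDivisor.
  int64_t s = (b / d) * sizeDivisor;
  int64_t v = b % d;
  int64_t u = d - v;
  // The tiles must cover the iteration space exactly.
  if (s * u + (s + sizeDivisor) * v != tripCount)
    return std::nullopt;
  return MultiSizeSketch{s, s + sizeDivisor, u, v};
}
```

For a trip count of 100 with targetSize 32 and sizeDivisor 4, this gives three tiles of size 24 and one tile of size 28. A trip count of 15 with sizeDivisor 4 has no exact cover, so the sketch returns an empty optional.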

◆ computeSliceParameters()

SliceParameters mlir::linalg::computeSliceParameters ( OpBuilder builder,
Location  loc,
Value  valueToTile,
ArrayRef< OpFoldResult >  tileSizes,
AffineMap  map,
ArrayRef< OpFoldResult >  lbs,
ArrayRef< OpFoldResult >  ubs,
ArrayRef< OpFoldResult >  subShapeSizes,
bool  omitPartialTileCheck 
)

Computes SliceParameters for a single valueToTile assuming that its user is being tiled with the given loop bounds lbs and ubs and the tile sizes tileSizes.

omitPartialTileCheck controls whether to omit the partial/boundary tile condition check in cases where we statically know that it is unnecessary.

Definition at line 683 of file Utils.cpp.

References mlir::bindDims(), createFoldedDimOp(), mlir::getAffineSymbolExpr(), mlir::getConstantIntValue(), mlir::Builder::getContext(), mlir::Builder::getIndexAttr(), mlir::AffineMap::getSubMap(), mlir::Value::getType(), mlir::AffineMap::inferFromExprList(), isTiled(), mlir::affine::makeComposedFoldedAffineApply(), mlir::affine::makeComposedFoldedAffineMin(), mlir::linalg::SliceParameters::offsets, mlir::linalg::SliceParameters::sizes, and mlir::linalg::SliceParameters::strides.

Referenced by computeAllSliceParameters(), and makeTiledShape().

◆ computeStaticMultiTileSizes()

FailureOr< StaticMultiSizeSpecification > mlir::linalg::computeStaticMultiTileSizes ( LinalgOp  op,
unsigned  dimension,
int64_t  targetSize,
int64_t  divisor 
)

Definition at line 122 of file Tiling.cpp.

◆ computeTileOffsets()

SmallVector< OpFoldResult > mlir::linalg::computeTileOffsets ( OpBuilder b,
Location  loc,
ArrayRef< OpFoldResult >  ivs,
ArrayRef< OpFoldResult >  tileSizes 
)

Computes tile offsets, given a list of loop ivs and tileSizes.

In case a tile size is zero (i.e., no tiling), the corresponding offset is also zero.

Definition at line 790 of file Utils.cpp.

References mlir::Builder::getIndexAttr(), isTiled(), and mlir::isZeroIndex().

Referenced by computeAllSliceParameters().
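
A standalone sketch of the offset rule, using plain integers in place of OpFoldResult. The name tileOffsets is hypothetical, and the sketch assumes that induction variables are consumed in order by the tiled loops, in line with the note under computeAllSliceParameters that the number of non-zero tile sizes equals the number of ivs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One offset per loop: tiled loops take the next induction variable,
// untiled loops (tile size zero) get offset zero.
std::vector<int64_t> tileOffsets(const std::vector<int64_t> &ivs,
                                 const std::vector<int64_t> &tileSizes) {
  std::vector<int64_t> offsets;
  std::size_t nextIv = 0;
  for (int64_t size : tileSizes) {
    bool tiled = size != 0;
    // at() throws if fewer ivs than tiled loops were supplied.
    offsets.push_back(tiled ? ivs.at(nextIv++) : 0);
  }
  return offsets;
}
```

With ivs {16, 32} and tile sizes {4, 0, 8}, the offsets are {16, 0, 32}.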

◆ computeTileSizes()

SmallVector< OpFoldResult > mlir::linalg::computeTileSizes ( OpBuilder b,
Location  loc,
ArrayRef< OpFoldResult >  tileSizes,
ArrayRef< OpFoldResult >  sizeBounds 
)

Computes tile sizes, given a list of tileSizes and dimension sizes (sizeBounds).

In case a tile size is zero (i.e., no tiling), the corresponding result size is the corresponding value from sizeBounds. Note: The returned tile sizes are closed intervals.

Definition at line 804 of file Utils.cpp.

References mlir::getAffineDimExpr(), mlir::Builder::getContext(), isTiled(), mlir::isZeroIndex(), and mlir::affine::makeComposedFoldedAffineApply().

Referenced by computeAllSliceParameters().
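
A standalone sketch of the size rule on plain integers. The name closedTileSizes is hypothetical, and reading "closed intervals" as "size minus one" (the largest offset inside the tile) is an assumption of this sketch rather than something the text above states explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One entry per loop: the tile size if the loop is tiled, otherwise the full
// dimension size from sizeBounds. Assumption: the closed-interval form is the
// selected size minus one.
std::vector<int64_t> closedTileSizes(const std::vector<int64_t> &tileSizes,
                                     const std::vector<int64_t> &sizeBounds) {
  std::vector<int64_t> sizes;
  for (std::size_t i = 0; i < tileSizes.size(); ++i) {
    bool tiled = tileSizes[i] != 0;
    int64_t size = tiled ? tileSizes[i] : sizeBounds.at(i);
    sizes.push_back(size - 1);
  }
  return sizes;
}
```

With tile sizes {4, 0, 8} and size bounds {100, 50, 200}, the untiled middle loop keeps its full extent and the result is {3, 49, 7}.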

◆ concat()

SmallVector< AffineExpr, 4 > mlir::linalg::concat ( ArrayRef< AffineExpr >  a,
ArrayRef< AffineExpr >  b 
)

Return the vector that is the concatenation of a and b.

Definition at line 1834 of file LinalgOps.cpp.

Referenced by mlir::presburger::Simplex::makeProduct().

◆ containsMostMinorMatmul()

bool mlir::linalg::containsMostMinorMatmul ( linalg::LinalgOp  linalgOp)

Return true if linalgOp contains an embedded matmul subcomputation in its most minor dimensions.

◆ copyToGPUPrivateMemory()

LogicalResult mlir::linalg::copyToGPUPrivateMemory ( OpBuilder b,
Value  src,
Value  dst 
)

Normal copy between src and dst.

Definition at line 477 of file Promotion.cpp.

References mlir::OpBuilder::create(), mlir::Value::getLoc(), and mlir::success().

◆ copyToWorkgroupMemory()

LogicalResult mlir::linalg::copyToWorkgroupMemory ( OpBuilder b,
Value  src,
Value  dst 
)

Create memref copy operations and add GPU barrier guards before and after the copy operation to ensure data integrity.

Definition at line 460 of file Promotion.cpp.

References mlir::OpBuilder::create(), mlir::Value::getLoc(), and mlir::success().

◆ createAdd()

static Value mlir::linalg::createAdd ( Location  loc,
Value  x,
Value  y,
OpBuilder &  builder 
)
static

Definition at line 31 of file ConvertConv2DToImg2Col.cpp.

References mlir::OpBuilder::create(), and mlir::Value::getType().

Referenced by rewriteInIm2Col().

◆ createDynamicDimensions()

SmallVector< Value > mlir::linalg::createDynamicDimensions ( OpBuilder b,
Location  loc,
Value  val 
)

Build the list of DimOp for the dynamic dimensions of val.

Asserts that val is a ranked shaped type.

Definition at line 61 of file IndexingUtils.cpp.

References createOrFoldDimOp(), mlir::detail::enumerate(), and mlir::Value::getType().

Referenced by getMixedDimensions().

◆ createFoldedDimOp()

OpFoldResult mlir::linalg::createFoldedDimOp ( OpBuilder b,
Location  loc,
Value  val,
int64_t  dim 
)

Create one memref::DimOp or tensor::DimOp depending on the type of val.

This is a polymorphic convenience function to abstract away the rank and concrete type of val. Asserts that val is a memref or tensor type.

Definition at line 53 of file IndexingUtils.cpp.

References createOrFoldDimOp(), mlir::Builder::getIndexAttr(), and mlir::Value::getType().

Referenced by fuse().

◆ createMul()

static Value mlir::linalg::createMul ( Location  loc,
Value  x,
Value  y,
Type  accType,
OpBuilder &  builder 
)
static

Definition at line 38 of file ConvertConv2DToImg2Col.cpp.

References mlir::convertScalarToDtype(), and mlir::OpBuilder::create().

Referenced by rewriteInIm2Col().

◆ createOrFoldDimOp()

Value mlir::linalg::createOrFoldDimOp ( OpBuilder b,
Location  loc,
Value  val,
int64_t  dim 
)

Create one memref::DimOp or tensor::DimOp depending on the type of val.

This is a polymorphic convenience function to abstract away the rank and concrete type of val. Asserts that val is a memref or tensor type.

Definition at line 45 of file IndexingUtils.cpp.

References mlir::OpBuilder::createOrFold(), and mlir::Value::getType().

Referenced by concatSizesFromInputs(), createDynamicDimensions(), createFoldedDimOp(), createFoldedDimOp(), mlir::sparse_tensor::genDenseTensorOrSparseConstantIterLoop(), mlir::sparse_tensor::LoopEmitter::initializeLoopEmit(), and mlir::sparse_tensor::sizesFromSrc().

◆ deallocateGPUPrivateMemory()

LogicalResult mlir::linalg::deallocateGPUPrivateMemory ( OpBuilder ,
Value   
)

In case of GPU private memory there is no need to deallocate since the memory is freed when going outside of the scope.

Definition at line 485 of file Promotion.cpp.

References mlir::success().

◆ deallocateWorkgroupMemory()

LogicalResult mlir::linalg::deallocateWorkgroupMemory ( OpBuilder ,
Value   
)

In case of GPU group memory there is no need to deallocate.

Definition at line 453 of file Promotion.cpp.

References mlir::success().

◆ extractOrIdentityMap()

AffineMap mlir::linalg::extractOrIdentityMap ( std::optional< AffineMap >  maybeMap,
unsigned  rank,
MLIRContext context 
)

Returns maybeMap.get() if maybeMap is set, otherwise returns the symbol-less identity map of rank.

Definition at line 1814 of file LinalgOps.cpp.

References mlir::AffineMap::get(), and mlir::AffineMap::getMultiDimIdentityMap().

◆ findPermutationsIndexingOperand()

DenseSet< int64_t > mlir::linalg::findPermutationsIndexingOperand ( LinalgOp  linalgOp,
OpOperand opOperand,
utils::IteratorType  iter 
)

Given a linalgOp and one of its opOperand, returns the positions of the iterators of type iter that index the opOperand as a permutation.

This is useful to infer various subcomputations on a given linalgOp. This is performed by looking up each result in the matching indexing map and determining whether:

Definition at line 147 of file Utils.cpp.

References mlir::detail::IROperandBase::getOwner(), and mlir::AffineMap::getResults().

◆ fuseElementwiseOps()

FailureOr< mlir::linalg::ElementwiseOpFusionResult > mlir::linalg::fuseElementwiseOps ( RewriterBase rewriter,
OpOperand fusedOperand 
)

◆ fuseProducerOfTensor() [1/2]

FailureOr< FusionInfo > mlir::linalg::fuseProducerOfTensor ( OpBuilder &  b,
OpOperand &  consumerOpOperand 
)

Tensor counterpart of fuseProducerOfBuffer.

This implements the fusion part of the "tileAndFuse on tensors" transformation and thus requires the consumerOpOperand to be an extract_slice op (generally obtained by applying the tiling transformation).

Definition at line 237 of file Fusion.cpp.

References mlir::failure(), mlir::IROperand< DerivedT, IRValueT >::get(), and getProducerOfTensor().

◆ fuseProducerOfTensor() [2/2]

FailureOr< FusionInfo > mlir::linalg::fuseProducerOfTensor ( OpBuilder &  b,
OpResult  producerOpResult,
OpOperand &  consumerOpOperand 
)

Tensor counterpart of fuseProducerOfBuffer.

This implements the fusion part of the "tileAndFuse on tensors" transformation and thus requires the consumerOpOperand to be an extract_slice op (generally obtained by applying the tiling transformation). Assumes producerOfTensor is a Linalg op that produces consumerOpOperand.

Definition at line 249 of file Fusion.cpp.

References mlir::OpBuilder::create(), mlir::failure(), fuse(), mlir::IROperand< DerivedT, IRValueT >::get(), mlir::Value::getDefiningOp(), mlir::detail::IROperandBase::getOwner(), mlir::OpResult::getOwner(), mlir::Value::getParentBlock(), mlir::OpResult::getResultNumber(), mlir::Value::getType(), mlir::IROperand< DerivedT, IRValueT >::set(), and mlir::OpBuilder::setInsertionPoint().

◆ generalizeNamedOp()

FailureOr< GenericOp > mlir::linalg::generalizeNamedOp ( RewriterBase rewriter,
LinalgOp  namedOp 
)

Create a GenericOp from the given named operation namedOp and replace namedOp.

Return failure if namedOp is a GenericOp or lacks a region builder.

Definition at line 51 of file Generalization.cpp.

References mlir::OpBuilder::create(), mlir::failed(), generalizeNamedOpPrecondition(), mlir::RewriterBase::inlineRegionBefore(), mlir::RewriterBase::notifyMatchFailure(), and mlir::RewriterBase::replaceOp().

Referenced by packMatmulGreedily().

◆ generateLibraryCallName()

std::string mlir::linalg::generateLibraryCallName ( Operation op)

Returns the name-mangled library call name to disambiguate between different overloads at the C level.

The name mangling scheme is basic and uses MLIR type names:

  1. form a string which is the concatenation of the linalg op name with all the operand type names, separated by underscores;
  2. drop the linalg. prefix, and the <, >, ? symbols from the type.

Assumes op is a LinalgOp.

Examples:

  1. linalg.fill(f, A) : f32, memref<f32> name mangles into linalg_fill_f32_viewf32
  2. linalg.dot A, B, C : (memref<?xf32, stride_specification>, memref<?xf32, stride_specification>, memref<f32>) name mangles into linalg_dot_viewxf32_viewxf32_viewf32
  3. linalg.matmul(...) : memref<?x?xf32, stride_specification>, memref<?x?xf32, stride_specification>, memref<?x?xf32, stride_specification> name mangles into linalg_matmul_viewxxf32_viewxxf32_viewxxf32

Definition at line 1874 of file LinalgOps.cpp.
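
A standalone string-based sketch that reproduces the three examples above. It is not the MLIR implementation: the function names are hypothetical, operand types are passed as strings with the stride specification already omitted, and the treatment of the op name (replacing '.' with '_') and of memref (rendered as view) is inferred from the listed outputs.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: render one operand type for the mangled name.
std::string mangleType(std::string type) {
  const std::string prefix = "memref<";
  // Assumption taken from the examples: memref<...> is spelled view... .
  if (type.rfind(prefix, 0) == 0 && type.back() == '>')
    type = "view" +
           type.substr(prefix.size(), type.size() - prefix.size() - 1);
  // Drop the '<', '>' and '?' symbols from the type.
  type.erase(std::remove_if(type.begin(), type.end(),
                            [](char c) {
                              return c == '<' || c == '>' || c == '?';
                            }),
             type.end());
  return type;
}

// Hypothetical helper: op name followed by all operand type names, separated
// by underscores.
std::string mangleLibraryCallName(std::string opName,
                                  const std::vector<std::string> &types) {
  std::replace(opName.begin(), opName.end(), '.', '_');
  for (const std::string &type : types)
    opName += "_" + mangleType(type);
  return opName;
}
```

Calling mangleLibraryCallName("linalg.matmul", ...) with three "memref<?x?xf32>" operands yields linalg_matmul_viewxxf32_viewxxf32_viewxxf32, matching the third example.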

◆ generateParallelLoopNest()

static void mlir::linalg::generateParallelLoopNest ( OpBuilder &  b,
Location  loc,
ValueRange  lbs,
ValueRange  ubs,
ValueRange  steps,
ArrayRef< utils::IteratorType >  iteratorTypes,
ArrayRef< linalg::ProcInfo >  procInfo,
function_ref< void(OpBuilder &, Location, ValueRange)>  bodyBuilderFn,
SmallVectorImpl< Value > &  ivStorage 
)
static

Generates a loop nest consisting of scf.parallel and scf.for, depending on the iteratorTypes.

Consecutive parallel loops create a single scf.parallel operation; each sequential loop creates a new scf.for operation. The body of the innermost loop is populated by bodyBuilderFn that accepts a range of induction variables for all loops. ivStorage is used to store the partial list of induction variables.

Definition at line 489 of file Utils.cpp.

References mlir::ArithBuilder::_and(), mlir::scf::buildLoopNest(), mlir::OpBuilder::create(), mlir::linalg::ProcInfo::distributionMethod, isParallelIterator(), None, and mlir::ArithBuilder::slt().

Referenced by mlir::linalg::GenerateLoopNest< LoopTy >::doit().

◆ getCombinerOpKind()

std::optional< vector::CombiningKind > mlir::linalg::getCombinerOpKind ( Operation combinerOp)

Return vector::CombiningKind for the given op.

Definition at line 462 of file Vectorization.cpp.

Referenced by buildMultiDimReduce().

◆ getConvolvedIndex()

static Value mlir::linalg::getConvolvedIndex ( OpBuilder &  b,
Location  loc,
Value  oIndex,
Value  fIndex,
int64_t  stride 
)
static

◆ getLinalgTilingCanonicalizationPatterns()

RewritePatternSet mlir::linalg::getLinalgTilingCanonicalizationPatterns ( MLIRContext ctx)

Canonicalization patterns relevant to apply after tiling patterns.

These are applied automatically by the tiling pass but need to be applied manually when tiling is called programmatically.

Definition at line 874 of file Tiling.cpp.

References populateLinalgTilingCanonicalizationPatterns().

◆ getMixedDimensions()

SmallVector< OpFoldResult > mlir::linalg::getMixedDimensions ( OpBuilder b,
Location  loc,
Value  val 
)

Build the list of all dimensions for val, mixing static attributes and dynamic values where appropriate.

Asserts that val is a ranked shaped type.

Definition at line 74 of file IndexingUtils.cpp.

References createDynamicDimensions(), mlir::getMixedValues(), and mlir::Value::getType().

Referenced by lowerPack(), lowerUnPack(), and vectorizeAsTensorPadOp().

◆ getNeutralElement()

std::optional< TypedAttr > mlir::linalg::getNeutralElement ( Operation op)

Return the identity numeric value associated with the given op.

Return std::nullopt if there is no known neutral element.

Definition at line 988 of file Utils.cpp.

◆ getPrunedAttributeList()

template<typename OpTy >
SmallVector<NamedAttribute> mlir::linalg::getPrunedAttributeList ( OpTy  op)

Returns an attribute list that excludes pre-defined attributes.

Definition at line 410 of file Utils.h.

◆ getReassociationMapForFoldingUnitDims()

std::optional< SmallVector< ReassociationIndices > > mlir::linalg::getReassociationMapForFoldingUnitDims ( ArrayRef< OpFoldResult >  mixedSizes)

Get the reassociation maps to fold the result of an extract_slice (or source of an insert_slice) operation with given offsets and sizes to its rank-reduced version.

This is only done for the cases where the size is 1 and offset is 0. Strictly speaking the offset 0 is not required in general, but non-zero offsets are not handled by the SPIR-V backend at this point (and potentially cannot be handled).

Definition at line 966 of file Utils.cpp.

References mlir::detail::enumerate().

◆ getTensorOutputTypes()

SmallVector< Type > mlir::linalg::getTensorOutputTypes ( LinalgOp  op,
ValueRange  operands 
)

Returns the list of tensor output types produced when the given structured operation op is applied to the given operands.

Note that operands are not necessarily the actual operands of op.

Definition at line 820 of file Utils.cpp.

◆ hasAllOneValues()

static bool mlir::linalg::hasAllOneValues ( DenseIntElementsAttr  attr)
static

Definition at line 26 of file ConvertConv2DToImg2Col.cpp.

Referenced by rewriteInIm2Col().

◆ hasOnlyScalarElementwiseOp()

bool mlir::linalg::hasOnlyScalarElementwiseOp ( Region r)

Detect whether r has only ConstantOp, ElementwiseMappable and YieldOp.

Definition at line 234 of file Utils.cpp.

◆ hoistPaddingOnTensors() [1/2]

FailureOr< Value > mlir::linalg::hoistPaddingOnTensors ( RewriterBase rewriter,
tensor::PadOp  opToHoist,
int64_t  numLoops,
ArrayRef< int64_t >  transposeVector,
tensor::PadOp &  hoistedOp,
SmallVectorImpl< GenericOp > &  transposeOps 
)

Mechanically hoist padding operations on tensors by numLoops into a new, generally larger tensor.

This achieves packing of multiple padding ops into a larger tensor. On success, opToHoist is replaced by the cloned version in the packing loop so the caller can continue reasoning about the padding operation. If transposeVector is non-empty, hoist padding introduces a GenericOp to transpose the padded tensor before inserting it into the packed tensor. A transposeVector can change the storage order of the padded tensor but does not change the order of the pack or compute loops.

TODO: In the future, we should consider rewriting as a tensor.pack after hoisting since this abstraction is now available.

Example in pseudo-mlir:

If hoistPaddingOnTensors is called with numLoops = 2 on the following IR:

scf.for (%i, %j, %k)
  %st0 = tensor.extract_slice f(%i, %k) : ... to tensor<?x?xf32>
  %0 = tensor.pad %st0 low[0, 0] high[...] {
  ^bb0( ... ):
    linalg.yield %pad
  } : tensor<?x?xf32> to tensor<4x8xf32>
  compute(%0)

IR resembling the following is produced:

scf.for (%i) {
  %packed_init = tensor.empty range(%j) : tensor<?x4x8xf32>
  %packed = scf.for (%k) iter_args(%p : %packed_init) {
    %st0 = tensor.extract_slice f(%i, %k) : ... to tensor<?x?xf32>
    %0 = tensor.pad %st0 low[0, 0] high[...] {
    ^bb0( ... ):
      linalg.yield %pad
    } : tensor<?x?xf32> to tensor<4x8xf32>
    %1 = tensor.insert_slice %0 ...
      : tensor<4x8xf32> to tensor<?x4x8xf32>
    scf.yield %1: tensor<?x4x8xf32>
  } -> tensor<?x4x8xf32>
  scf.for (%j, %k) {
    %st0 = tensor.extract_slice %packed [%k, 0, 0][1, 4, 8][1, 1, 1] :
      tensor<?x4x8xf32> to tensor<4x8xf32>
    compute(%st0)
  }
}

Construct the packing loop nest.

Definition at line 939 of file HoistPadding.cpp.

References buildPackingLoopNestImpl(), mlir::tensor::computeTransposedType(), mlir::OpBuilder::create(), DBGS, mlir::failed(), mlir::failure(), mlir::Value::getDefiningOp(), mlir::Operation::getParentOfType(), makeTransposeOp(), replaceByPackingResult(), mlir::OpBuilder::setInsertionPointAfter(), and mlir::succeeded().

Referenced by hoistPaddingOnTensors(), and padAndHoistLinalgOp().

◆ hoistPaddingOnTensors() [2/2]

FailureOr< Value > mlir::linalg::hoistPaddingOnTensors ( tensor::PadOp  opToHoist,
int64_t  numLoops,
ArrayRef< int64_t >  transposeVector,
tensor::PadOp &  hoistedOp,
SmallVectorImpl< GenericOp > &  transposeOps 
)

Calls into hoistPaddingOnTensors with a local IRRewriter.

Definition at line 1004 of file HoistPadding.cpp.

References hoistPaddingOnTensors().

◆ hoistRedundantSubsetExtractInsert()

scf::ForOp mlir::linalg::hoistRedundantSubsetExtractInsert ( RewriterBase rewriter,
scf::ForOp  forOp 
)

Greedily hoist redundant subset extract/insert operations on tensors outside of forOp.


The logic follows:

  1. Look for a write walking back from the forOp yield.
  2. Check the uses of the matching block argument and look for a matching read (i.e. extract_slice or transfer_read) with matching indices.
  3. In the case of a transfer_write, we can bypass other non-conflicting operations and find more hoisting opportunities.
  4. Hoist the read/write pair and update the tensor SSA links.

Return the unmodified forOp if no hoisting occurred. Return a new scf::ForOp if hoisting on tensors occurred.

After this transformation the returned scf::ForOp may have unused arguments that can be removed by application of canonicalization patterns.

Example:

IR Resembling:

%0 = scf.for %i = %l to %u step %s iter_args(%a0 = %t0)->(tensor<10xf32>) {
  %1 = scf.for %j = %l to %u step %s iter_args(%a6 = %a0)->(tensor<10xf32>) {
    %e = tensor.extract_slice %a6[%i][%sz][1]: tensor<10xf32> to tensor<?xf32>
    %r = vector.transfer_read %e[%c0], %cst: tensor<?xf32>, vector<4xf32>
    %u = "some_use"(%r) : (vector<4xf32>) -> vector<4xf32>
    %w = vector.transfer_write %u, %e[%c0] : vector<4xf32>, tensor<?xf32>
    %st = tensor.insert_slice %w into %a6[%i][%sz][1]
      : tensor<?xf32> into tensor<10xf32>
    scf.yield %st: tensor<10xf32>
  }
  scf.yield %1: tensor<10xf32>
}

Progressively hoists to:

%0 = scf.for %i = %l to %u step %s iter_args(%a0 = %t0) -> (tensor<10xf32>){
  %e = tensor.extract_slice %a0[%i][%sz][1]: tensor<10xf32> to tensor<?xf32>
  %1:2 = scf.for %j = %l to %u step %s iter_args(%a6 = %a0, %a7 = %e)
    -> (tensor<10xf32>, tensor<?xf32>) {
    %r = vector.transfer_read %a7[%c0], %cst: tensor<?xf32>, vector<4xf32>
    %u = "some_use"(%r) : (vector<4xf32>) -> vector<4xf32>
    %w = vector.transfer_write %u, %a7[%c0] : vector<4xf32>, tensor<?xf32>
    scf.yield %a6, %w: tensor<10xf32>, tensor<?xf32>
  }
  %st = tensor.insert_slice %1#1 into %1#0[%i][%sz][1]
    : tensor<?xf32> into tensor<10xf32>
  scf.yield %st: tensor<10xf32>
}

and

%0 = scf.for %i = %l to %u step %s iter_args(%a0 = %t0) -> (tensor<10xf32>){
  %e = tensor.extract_slice %a0[%i][%sz][1]: tensor<10xf32> to tensor<?xf32>
  %r = vector.transfer_read %e[%c0], %cst: tensor<?xf32>, vector<4xf32>
  %1:3 = scf.for %j = %l to %u step %s iter_args(%a6 = %a0, %a7 = %e, %a8 = %r)
    -> (tensor<10xf32>, tensor<?xf32>, vector<4xf32>) {
    %u = "some_use"(%a8) : (vector<4xf32>) -> vector<4xf32>
    scf.yield %a6, %a7, %u: tensor<10xf32>, tensor<?xf32>, vector<4xf32>
  }
  %w = vector.transfer_write %1#2, %1#1[%c0] : vector<4xf32>, tensor<?xf32>
  %st = tensor.insert_slice %w into %1#0[%i][%sz][1]
    : tensor<?xf32> into tensor<10xf32>
  scf.yield %st: tensor<10xf32>
}

It can then canonicalize to:

%0 = scf.for %i = %l to %u step %s iter_args(%a0 = %t0) -> (tensor<10xf32>){
  %e = tensor.extract_slice %a0[%i][%sz][1]: tensor<10xf32> to tensor<?xf32>
  %r = vector.transfer_read %e[%c0], %cst: tensor<?xf32>, vector<4xf32>
  %1 = scf.for %j = %l to %u step %s iter_args(%a7 = %r)
    -> (vector<4xf32>) {
    %u = "some_use"(%a7) : (vector<4xf32>) -> vector<4xf32>
    scf.yield %u: vector<4xf32>
  }
  %w = vector.transfer_write %1, %e[%c0] : vector<4xf32>, tensor<?xf32>
  %st = tensor.insert_slice %w into %a0[%i][%sz][1]
    : tensor<?xf32> into tensor<10xf32>
  scf.yield %st: tensor<10xf32>
}

Definition at line 462 of file SubsetHoisting.cpp.

References DBGS, mlir::detail::enumerate(), mlir::failed(), findHoistableMatchingExtractSlice(), findHoistableMatchingTransferRead(), getLoopInvariantInsertSliceDefining(), getLoopInvariantTransferWriteDefining(), mlir::Operation::getOpOperand(), mlir::Operation::hasOneUse(), hoistExtractInsertSlice(), hoistTransferReadWrite(), isTensorChunkAccessedByUnknownOp(), and mlir::succeeded().

Referenced by hoistRedundantVectorTransfersOnTensor().

◆ hoistRedundantVectorTransfers()

void mlir::linalg::hoistRedundantVectorTransfers ( func::FuncOp  func)

Iteratively hoist pairs of vector.transfer_read/vector.transfer_write on buffers out of the immediately enclosing scf::ForOp, if the following conditions are true:

  1. The two ops access the same memref with the same indices.
  2. All operands are invariant under the enclosing scf::ForOp.
  3. No uses of the memref either dominate the transfer_read or are dominated by the transfer_write (i.e. no aliasing between the write and the read across the loop).

To improve hoisting opportunities, call the moveLoopInvariantCode helper function on the candidate loop above which to hoist. Hoisting the transfers results in scf::ForOp yielding the value that originally transited through memory.

WARNING: This hoisting does not model parallelism and is generally incorrect when used on distributed loops with memref semantics!
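The effect of the hoisting can be illustrated on a scalar analogue. The following is a minimal standalone C++ sketch, not MLIR code: a plain array stands in for the memref, a single element stands in for the transferred vector, and someUse, loopWithTransfers and loopWithHoistedTransfers are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the computation in the loop body.
inline int someUse(int v) { return v * 3 + 1; }

// Before hoisting: the same location is read and written on every iteration.
inline void loopWithTransfers(std::vector<int> &buf, std::size_t idx, int trips) {
  assert(idx < buf.size());
  for (int i = 0; i < trips; ++i) {
    int r = buf[idx];  // analogue of vector.transfer_read
    int u = someUse(r);
    buf[idx] = u;      // analogue of vector.transfer_write
  }
}

// After hoisting: one read before the loop and one write after it. The value
// is carried across iterations, the analogue of an scf.for iter_arg that is
// yielded instead of transiting through memory.
inline void loopWithHoistedTransfers(std::vector<int> &buf, std::size_t idx,
                                     int trips) {
  assert(idx < buf.size());
  int carried = buf[idx];
  for (int i = 0; i < trips; ++i)
    carried = someUse(carried);
  buf[idx] = carried;
}
```

Both functions leave the buffer in the same state only because nothing else touches buf[idx] while the loop runs, which mirrors the no-aliasing condition and the warning about distributed loops.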

Definition at line 81 of file Hoisting.cpp.

References mlir::WalkResult::advance(), DBGS, mlir::getForwardSlice(), mlir::WalkResult::interrupt(), mlir::vector::isDisjointTransferSet(), mlir::moveLoopInvariantCode(), noAliasingUseInLoop(), mlir::DominanceInfo::properlyDominates(), mlir::affine::replaceForOpWithNewYields(), and mlir::replaceLoopWithNewYields().

◆ hoistRedundantVectorTransfersOnTensor()

void mlir::linalg::hoistRedundantVectorTransfersOnTensor ( func::FuncOp  func)

Call into hoistRedundantSubsetExtractInsert without a RewriterBase.

Definition at line 46 of file Hoisting.cpp.

References hoistRedundantSubsetExtractInsert().

◆ inferMatmulDims()

FailureOr<EmbeddedMatmulDimsCandidates> mlir::linalg::inferMatmulDims ( linalg::LinalgOp  linalgOp)

Find 2 parallel (m and n) and 1 reduction (k) dimension candidates that form a matmul subcomputation within linalgOp.

These dimensions are such that:

  1. The m dimension is involved in an outer-product along LHS (i.e. it is a permutation on RES and LHS and does not appear in RHS).
  2. The n dimension is involved in an outer-product along RHS (i.e. it is a permutation on RES and RHS and does not appear in LHS).
  3. The k dimension appears as a permutation on LHS and RHS.
  4. m, n and k appear only once in any given indexing.

This allows detecting that some matmul is embedded within linalgOp with some orthogonal heuristic.
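The first three criteria can be sketched as a set classification. This is a simplified standalone C++ model, not the library implementation: each indexing map is assumed to be a projected permutation represented as the list of iteration dimensions it uses, iterator types (parallel/reduction) are not checked, and MatmulDimCandidates, usesDim, allBelow and classifyDims are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// Hypothetical result type: candidate iteration dimensions per role.
struct MatmulDimCandidates {
  std::set<unsigned> m, n, k;
};

inline bool usesDim(const std::vector<unsigned> &dims, unsigned d) {
  return std::find(dims.begin(), dims.end(), d) != dims.end();
}

inline bool allBelow(const std::vector<unsigned> &dims, unsigned bound) {
  for (unsigned d : dims)
    if (d >= bound)
      return false;
  return true;
}

// lhs, rhs and res list the iteration dimensions used by each operand.
inline MatmulDimCandidates classifyDims(unsigned numLoops,
                                        const std::vector<unsigned> &lhs,
                                        const std::vector<unsigned> &rhs,
                                        const std::vector<unsigned> &res) {
  assert(allBelow(lhs, numLoops) && allBelow(rhs, numLoops) &&
         allBelow(res, numLoops));
  MatmulDimCandidates out;
  for (unsigned d = 0; d < numLoops; ++d) {
    bool inLhs = usesDim(lhs, d), inRhs = usesDim(rhs, d), inRes = usesDim(res, d);
    if (inLhs && inRes && !inRhs)
      out.m.insert(d);  // criterion 1: in RES and LHS, not in RHS
    else if (inRhs && inRes && !inLhs)
      out.n.insert(d);  // criterion 2: in RES and RHS, not in LHS
    else if (inLhs && inRhs && !inRes)
      out.k.insert(d);  // criterion 3: in LHS and RHS (and, here, not in RES)
  }
  return out;
}
```

For a plain matmul with maps (d0, d2), (d2, d1) and (d0, d1), this yields m = {0}, n = {1} and k = {2}. A batch dimension that appears in all three operands falls into none of the sets.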

Referenced by packMatmulGreedily().

◆ insertSlicesBack()

SmallVector< Value > mlir::linalg::insertSlicesBack ( OpBuilder builder,
Location  loc,
LinalgOp  op,
ValueRange  operands,
ValueRange  results 
)

Creates insert_slice ops that insert results back into larger tensors they were originally extracted from with extract_slice before being passed as operands to the given structured operation op or its clone.

Note that operands are not necessarily the actual operands of op, the operation serves only as metadata container for operand types and positions.

Definition at line 829 of file Utils.cpp.

◆ interchangeGenericOp()

FailureOr< GenericOp > mlir::linalg::interchangeGenericOp ( RewriterBase rewriter,
GenericOp  genericOp,
ArrayRef< unsigned >  interchangeVector 
)

Interchange the iterator_types and indexing_maps dimensions and adapt the index accesses of genericOp.

This is an in-place transformation controlled by interchangeVector. An empty vector is interpreted as the identity permutation and the transformation returns early.

E.g. the permutation (i,j,k) -> (j,k,i) is expressed with interchangeVector = [1,2,0]. All values in interchangeVector must be integers, in the range 0..op.rank without duplications (i.e. [1,1,2] is an invalid permutation).

Return failure if the permutation is not valid.
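The validity rule and the meaning of interchangeVector can be sketched in standalone C++. This is an illustration only, not the MLIR helpers: isValidPermutation and applyPermutation are hypothetical names, values are checked against the half-open range [0, rank), and the special case of an empty vector meaning the identity permutation is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// A vector is a valid permutation of `rank` elements when it has `rank`
// entries, each in [0, rank), with no duplicates.
inline bool isValidPermutation(const std::vector<unsigned> &perm,
                               std::size_t rank) {
  if (perm.size() != rank)
    return false;
  std::vector<bool> seen(rank, false);
  for (unsigned p : perm) {
    if (p >= rank || seen[p])
      return false;
    seen[p] = true;
  }
  return true;
}

// Applies the permutation so that result[i] = input[perm[i]].
inline std::vector<std::string>
applyPermutation(const std::vector<std::string> &input,
                 const std::vector<unsigned> &perm) {
  assert(isValidPermutation(perm, input.size()));
  std::vector<std::string> result;
  result.reserve(input.size());
  for (unsigned p : perm)
    result.push_back(input[p]);
  return result;
}
```

With this convention, applying [1,2,0] to (i, j, k) gives (j, k, i), matching the example above, while [1,1,2] is rejected because of the duplicate.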

Definition at line 50 of file Interchange.cpp.

References mlir::applyPermutationToVector(), mlir::AffineMap::compose(), mlir::failed(), mlir::RewriterBase::finalizeRootUpdate(), mlir::Builder::getAffineMapArrayAttr(), mlir::Builder::getArrayAttr(), mlir::AffineMap::getPermutationMap(), mlir::AffineMap::getSubMap(), interchangeGenericOpPrecondition(), mlir::inversePermutation(), mlir::AffineMap::isEmpty(), mlir::RewriterBase::notifyMatchFailure(), mlir::RewriterBase::replaceOpWithNewOp(), mlir::OpBuilder::setInsertionPoint(), and mlir::RewriterBase::startRootUpdate().

Referenced by packMatmulGreedily().

◆ isaContractionOpInterface()

bool mlir::linalg::isaContractionOpInterface ( LinalgOp  linalgOp)

Checks whether linalgOp conforms to ContractionOpInterface.

Definition at line 145 of file LinalgInterfaces.cpp.

◆ isDimSequencePreserved()

bool mlir::linalg::isDimSequencePreserved ( AffineMap  indexingMap,
ReassociationIndicesRef  dimSequence 
)

Return true if a given sequence of dimensions is contiguous in the range of the specified indexing map.

For a given dimSequence, check if the sequence is conserved in the indexingMap.

indexingMap is expected to be a projected permutation. Non-existence of the sequence returns true as well.
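The check can be sketched in standalone C++ under a simplifying assumption: the projected permutation is modeled as the list of dimension positions appearing in its results, so each dimension occurs at most once. isSequencePreserved is a hypothetical name and this is not the library code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// mapResults: dimension positions of the map results, each at most once.
// dimSequence: the non-empty sequence of dimensions to look for.
inline bool isSequencePreserved(const std::vector<unsigned> &mapResults,
                                const std::vector<unsigned> &dimSequence) {
  assert(!dimSequence.empty());
  for (std::size_t i = 0; i < mapResults.size(); ++i) {
    unsigned dim = mapResults[i];
    if (dim == dimSequence.front()) {
      // The whole sequence must follow contiguously and in order.
      if (i + dimSequence.size() > mapResults.size())
        return false;
      for (std::size_t j = 0; j < dimSequence.size(); ++j)
        if (mapResults[i + j] != dimSequence[j])
          return false;
      return true;
    }
    // A later element of the sequence shows up before its start: not preserved.
    if (std::find(dimSequence.begin(), dimSequence.end(), dim) !=
        dimSequence.end())
      return false;
  }
  // No element of the sequence appears in the map at all.
  return true;
}
```

For example, the sequence [0, 1] is preserved in (d2, d0, d1) but not in (d0, d2, d1), and the sequence [1] is trivially preserved in (d0, d2) because it does not appear there.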

Definition at line 1008 of file ElementwiseOpFusion.cpp.

References mlir::AffineExpr::cast(), mlir::detail::enumerate(), mlir::AffineMap::getNumResults(), mlir::AffineMap::getResult(), mlir::AffineMap::getResults(), and mlir::AffineMap::isProjectedPermutation().

Referenced by areDimSequencesPreserved().

◆ isElementwise()

bool mlir::linalg::isElementwise ( LinalgOp  op)

Check if a LinalgOp is an element-wise operation.

Definition at line 248 of file Utils.cpp.

Referenced by vectorizeLinalgOpPrecondition().

◆ isParallelIterator()

bool mlir::linalg::isParallelIterator ( utils::IteratorType  iteratorType)

Check if iterator type has "parallel" semantics.

Definition at line 263 of file Utils.cpp.

Referenced by generateParallelLoopNest(), getCollapsableIterationSpaceDims(), and isFusableWithReshapeByDimExpansion().

◆ isReductionIterator()

bool mlir::linalg::isReductionIterator ( utils::IteratorType  iteratorType)

Check if iterator type has "reduction" semantics.

Definition at line 267 of file Utils.cpp.

Referenced by getCollapsableIterationSpaceDims(), getDimsToReduce(), mlir::sparse_tensor::CodegenEnv::isAdmissibleTopoOrder(), and topSortOptimal().

◆ linalgOpToAffineLoops()

FailureOr< LinalgLoops > mlir::linalg::linalgOpToAffineLoops ( RewriterBase rewriter,
LinalgOp  linalgOp 
)

Emit a loop nest of affine.for with the proper body for linalgOp.

Definition at line 372 of file Loops.cpp.

◆ linalgOpToLoops()

FailureOr< LinalgLoops > mlir::linalg::linalgOpToLoops ( RewriterBase rewriter,
LinalgOp  linalgOp 
)

Emit a loop nest of scf.for with the proper body for linalgOp.

Definition at line 377 of file Loops.cpp.

◆ linalgOpToParallelLoops()

FailureOr< LinalgLoops > mlir::linalg::linalgOpToParallelLoops ( RewriterBase rewriter,
LinalgOp  linalgOp 
)

Emit a loop nest of scf.parallel with the proper body for linalgOp.

Definition at line 384 of file Loops.cpp.

◆ lowerPack()

FailureOr< LowerPackResult > mlir::linalg::lowerPack ( RewriterBase rewriter,
tensor::PackOp  packOp 
)

◆ lowerUnPack()

FailureOr< LowerUnPackOpResult > mlir::linalg::lowerUnPack ( RewriterBase rewriter,
tensor::UnPackOp  unPackOp 
)

◆ makeAffineDimExprs()

SmallVector< AffineExpr, 4 > mlir::linalg::makeAffineDimExprs ( unsigned  num,
unsigned &  startIdx,
MLIRContext context 
)

Returns num AffineDimExpr dimensions at positions [startIdx, startIdx + num) and increments startIdx to startIdx + num.
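The in/out behavior of startIdx can be shown with a standalone C++ model in which plain position numbers stand in for AffineDimExpr values. makeDimPositions is a hypothetical name used only for this sketch.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Returns the positions [startIdx, startIdx + num) and advances startIdx by
// num, so that consecutive calls hand out disjoint, increasing positions.
inline std::vector<unsigned> makeDimPositions(unsigned num, unsigned &startIdx) {
  assert(num <= std::numeric_limits<unsigned>::max() - startIdx);
  std::vector<unsigned> positions;
  positions.reserve(num);
  for (unsigned i = 0; i < num; ++i)
    positions.push_back(startIdx++);
  return positions;
}
```

Calling it with num = 2 and then num = 3 on a counter that starts at 0 produces the positions 0, 1 followed by 2, 3, 4, and leaves the counter at 5.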

Definition at line 1825 of file LinalgOps.cpp.

References mlir::getAffineDimExpr().

◆ makeComposedPadHighOp()

Value mlir::linalg::makeComposedPadHighOp ( OpBuilder b,
Location  loc,
RankedTensorType  type,
Value  source,
Value  pad,
bool  nofold 
)

Create a tensor::PadOp that pads source to the size of the statically sized type whose static sizes are assumed to be greater than the dynamic source size.

The padding introduces trailing pad values until the target size is met. If source is defined by one or more LinalgOps that have been padded with the same value and sizes, return their padded result instead of creating a tensor::PadOp.

Example:

%0 = tensor.extract_slice %arg0 [%iv0, %iv1] [%sz0, %sz1]
%1 = tensor.pad %0 low[0, 0] high[...] { tensor.yield %cst }
%2 = linalg.matmul ins(...) outs(%1)
%3 = tensor.extract_slice %2 [0, 0] [%sz0, %sz1]

makeComposedPadHighOp(source=%3, pad=cst) returns %2
makeComposedPadHighOp(source=%3, pad=other_cst) returns %4

%4 = tensor.pad %3 low[0, 0] high[...] { tensor.yield %other_cst }

Definition at line 271 of file Utils.cpp.

References mlir::tensor::createPadHighOp(), mlir::Value::getDefiningOp(), mlir::OpResult::getResultNumber(), mlir::m_Constant(), and mlir::matchPattern().

Referenced by padOperandToSmallestStaticBoundingBox().

◆ makeMemRefCopyOp()

GenericOp mlir::linalg::makeMemRefCopyOp ( OpBuilder b,
Location  loc,
Value  from,
Value  to 
)

Returns GenericOp that copies an n-D memref.

Unlike the current implementation of memref::CopyOp, this op can further tile, lower to loops or vectorize.

Definition at line 368 of file Utils.cpp.

References mlir::OpBuilder::create(), mlir::Builder::getContext(), mlir::AffineMap::getMultiDimIdentityMap(), and mlir::Value::getType().

◆ makeTiledLoopRanges()

std::tuple< SmallVector< Range, 4 >, LoopIndexToRangeIndexMap > mlir::linalg::makeTiledLoopRanges ( RewriterBase b,
Location  loc,
AffineMap  map,
ArrayRef< OpFoldResult >  allShapeSizes,
ArrayRef< OpFoldResult >  allTileSizes 
)

◆ makeTiledShape()

Value mlir::linalg::makeTiledShape ( OpBuilder builder,
Location  loc,
Value  valueToTile,
ArrayRef< OpFoldResult >  tileSizes,
AffineMap  map,
ArrayRef< OpFoldResult >  lbs,
ArrayRef< OpFoldResult >  ubs,
ArrayRef< OpFoldResult >  subShapeSizes,
bool  omitPartialTileCheck 
)

Creates an extract_slice/subview op for a single valueToTile with builder.

This new operation extracts a tile of valueToTile, starting at offsets lbs and with sizes subShapeSizes. omitPartialTileCheck controls whether to omit the partial/boundary tile condition check in cases where we statically know that it is unnecessary.

Definition at line 671 of file Utils.cpp.

References computeSliceParameters(), and materializeTiledShape().

◆ makeTiledShapes()

SmallVector< Value > mlir::linalg::makeTiledShapes ( OpBuilder builder,
Location  loc,
LinalgOp  linalgOp,
ValueRange  valuesToTile,
ArrayRef< OpFoldResult >  ivs,
ArrayRef< OpFoldResult >  tileSizes,
ArrayRef< OpFoldResult >  sizeBounds,
bool  omitPartialTileCheck 
)

Creates extract_slice/subview ops for all valuesToTile of the given linalgOp with builder, assuming linalgOp is being fused into a loop nest for tiling with the given induction variables ivs and tile sizes tileSizes.

sizeBounds are the iteration space bounds for all the implicit loops in linalgOp. omitPartialTileCheck controls whether to omit the partial/boundary tile condition check in cases where we statically know that it is unnecessary.

Note that a constant zero in tileSizes means no tiling at that implicit loop. The number of non-zero values in tileSizes should be equal to the number of values in ivs.
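The relationship between tileSizes and ivs can be sketched in standalone C++. The sketch assumes that induction variables are assigned to the tiled loops in loop order, which the text above does not state explicitly. mapLoopsToIvs is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// For each implicit loop, returns the index of its induction variable in ivs,
// or -1 when the loop is not tiled (tile size 0).
inline std::vector<int> mapLoopsToIvs(const std::vector<std::int64_t> &tileSizes,
                                      std::size_t numIvs) {
  std::vector<int> ivIndex;
  ivIndex.reserve(tileSizes.size());
  int next = 0;
  for (std::int64_t ts : tileSizes)
    ivIndex.push_back(ts == 0 ? -1 : next++);
  // The number of non-zero tile sizes must match the number of ivs.
  assert(static_cast<std::size_t>(next) == numIvs);
  return ivIndex;
}
```

With tileSizes = [8, 0, 4] and two induction variables, loops 0 and 2 map to ivs[0] and ivs[1], and loop 1 is left untiled.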

Definition at line 909 of file Utils.cpp.

References computeAllSliceParameters(), and materializeTiledShape().

Referenced by fuse().

◆ makeTransposeOp()

GenericOp mlir::linalg::makeTransposeOp ( OpBuilder b,
Location  loc,
Value  inputTensor,
Value  outputTensor,
ArrayRef< int64_t >  transposeVector 
)

Returns a GenericOp that transposes inputTensor into outputTensor using transposeVector to permute the inputTensor dimensions.

Definition at line 331 of file Utils.cpp.

References mlir::OpBuilder::create(), mlir::Builder::getContext(), mlir::AffineMap::getMultiDimIdentityMap(), mlir::AffineMap::getPermutationMap(), mlir::Value::getType(), mlir::inversePermutation(), mlir::isPermutationVector(), mlir::Region::push_back(), and mlir::OpBuilder::setInsertionPointToEnd().

Referenced by hoistPaddingOnTensors().

◆ materializeTiledShape()

static Value mlir::linalg::materializeTiledShape ( OpBuilder builder,
Location  loc,
Value  valueToTile,
const SliceParameters sliceParams 
)

◆ offsetIndices() [1/2]

void mlir::linalg::offsetIndices ( OpBuilder b,
LinalgOp  linalgOp,
ArrayRef< OpFoldResult >  offests 
)

Add the specified offsets to any linalg.index ops contained in the given linalgOp.

The offsets are provided in the same order as iteration space dimensions. Null offsets are assumed to be zero.

Definition at line 930 of file Utils.cpp.

Referenced by fuse().

◆ offsetIndices() [2/2]

void mlir::linalg::offsetIndices ( RewriterBase b,
LinalgOp  linalgOp,
ArrayRef< OpFoldResult >  offests 
)

◆ pack()

FailureOr< PackResult > mlir::linalg::pack ( RewriterBase rewriter,
linalg::LinalgOp  linalgOp,
ArrayRef< OpFoldResult >  packedSizes 
)

Implement packing of a single LinalgOp by packedSizes.

There must be one packedSizes entry per linalgOp iterator. Return the packed Linalg op on success, failure otherwise.

Definition at line 736 of file Transforms.cpp.

References mlir::OpBuilder::create(), DBGS, DBGSNL, mlir::failed(), mlir::failure(), mlir::getConstantIntValue(), mlir::getElementTypeOrSelf(), mlir::Operation::getRegion(), mlir::Value::getType(), mlir::ValueRange::getTypes(), mlir::Builder::getZeroAttr(), mlir::RewriterBase::notifyMatchFailure(), packLinalgMetadataOnce(), mlir::RewriterBase::replaceOp(), and mlir::Region::takeBody().

Referenced by packMatmulGreedily().

◆ packTranspose()

FailureOr< PackTransposeResult > mlir::linalg::packTranspose ( RewriterBase rewriter,
tensor::PackOp  packOp,
linalg::LinalgOp  linalgOp,
tensor::UnPackOp  maybeUnPackOp,
ArrayRef< int64_t >  outerPerm,
ArrayRef< int64_t >  innerPerm 
)

Transpose a single PackOp -> LinalgOp -> UnPackOp chain and return the transposed PackOp -> LinalgOp -> UnPackOp chain after replacements.

Return failure if any of the following holds:

  1. the packOp does not have the linalgOp as its unique use.
  2. the maybeUnPackOp, if specified, is not a consumer of the result tied to the unique packOp use.
  3. outerPerm (resp. innerPerm) is neither empty nor a valid permutation of packOp.getOuterDimsPerm (resp. packOp.getInnerDimsPerm).

Definition at line 917 of file Transforms.cpp.

References mlir::OpOperand::getOperandNumber(), mlir::detail::IROperandBase::getOwner(), mlir::isPermutationVector(), mlir::RewriterBase::notifyMatchFailure(), mlir::RewriterBase::replaceOp(), mlir::OpBuilder::setInsertionPoint(), and transposeOneLinalgOperandAndReplace().

◆ padAndHoistLinalgOp()

FailureOr< LinalgOp > mlir::linalg::padAndHoistLinalgOp ( RewriterBase rewriter,
LinalgOp  linalgOp,
LinalgPaddingOptions  options 
)

◆ peelLoop()

SmallVector< Value > mlir::linalg::peelLoop ( RewriterBase rewriter,
Operation op 
)

Try to peel and canonicalize loop op and return the new result.

Also applies affine_min/max bounds simplification on the fly where relevant.
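As background, loop peeling splits an iteration space into a main part that runs only full steps and a remainder that handles the last partial step. The standalone C++ sketch below shows one common way to compute the split point. It is an assumption made for illustration rather than a statement about what this function does internally, and PeeledBounds and computePeeledBounds are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of a peeled loop: the main loop covers
// [lb, split) with only full steps, the remainder covers [split, ub).
struct PeeledBounds {
  std::int64_t lb, split, ub;
};

inline PeeledBounds computePeeledBounds(std::int64_t lb, std::int64_t ub,
                                        std::int64_t step) {
  assert(step > 0 && lb <= ub);
  // Largest bound reachable from lb in whole steps without passing ub.
  std::int64_t split = ub - (ub - lb) % step;
  return PeeledBounds{lb, split, ub};
}
```

For lb = 0, ub = 10 and step = 4, the main loop covers [0, 8) and the remainder covers [8, 10). When the trip count divides evenly, split equals ub and the remainder is empty.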

Definition at line 232 of file Transforms.cpp.

Referenced by peelLoops().

◆ peelLoops()

void mlir::linalg::peelLoops ( RewriterBase rewriter,
ArrayRef< scf::ForOp >  loops 
)

Peel loops and apply affine_min/max bounds simplification on the fly where relevant.

Definition at line 248 of file Transforms.cpp.

References peelLoop().

◆ populateBubbleUpExtractSliceOpPatterns()

void mlir::linalg::populateBubbleUpExtractSliceOpPatterns ( RewritePatternSet patterns)

Patterns that are used to bubble up extract slice op above linalg op.

Definition at line 134 of file BubbleUpExtractSlice.cpp.

References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().

◆ populateCollapseDimensions()

void mlir::linalg::populateCollapseDimensions ( RewritePatternSet patterns,
const GetCollapsableDimensionsFn controlCollapseDimensions 
)

Pattern to collapse dimensions in a linalg.generic op.

This will collapse tensor operands when needed and expand back the result tensors.

Definition at line 1853 of file ElementwiseOpFusion.cpp.

References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().

◆ populateConstantFoldLinalgOperations()

void mlir::linalg::populateConstantFoldLinalgOperations ( RewritePatternSet patterns,
const ControlFusionFn controlFn 
)

Patterns to constant fold Linalg operations.

Definition at line 304 of file ConstantFold.cpp.

References mlir::RewritePatternSet::getContext(), and mlir::RewritePatternSet::insert().

◆ populateConvertConv2DToImg2ColPatterns()

void mlir::linalg::populateConvertConv2DToImg2ColPatterns ( RewritePatternSet patterns)

Populates patterns to transform linalg.conv_2d_xxx operations into linalg.generic (for img2col packing) and linalg.matmul.

See also
rewriteInIm2Col for more details.

Definition at line 536 of file ConvertConv2DToImg2Col.cpp.

References mlir::RewritePatternSet::getContext(), and mlir::RewritePatternSet::insert().

◆ populateConvertToDestinationStylePatterns()

void mlir::linalg::populateConvertToDestinationStylePatterns ( RewritePatternSet patterns)

Populate patterns that convert non-destination-style ops to destination style ops.

Definition at line 378 of file ConvertToDestinationStyle.cpp.

References mlir::RewritePatternSet::add().

◆ populateConvolutionVectorizationPatterns()

void mlir::linalg::populateConvolutionVectorizationPatterns ( RewritePatternSet patterns,
PatternBenefit  benefit = 1 
)

Populate patterns for vectorizing low-D convolution ops.

This is a step in progressive lowering for convolution ops; it assumes high-D convolution ops were decomposed previously.

Definition at line 3029 of file Vectorization.cpp.

References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().

◆ populateDataLayoutPropagationPatterns()

void mlir::linalg::populateDataLayoutPropagationPatterns ( RewritePatternSet patterns,
const ControlPropagationFn controlPackUnPackPropagation 
)

Patterns to bubble up or down data layout ops across other operations.

Definition at line 749 of file DataLayoutPropagation.cpp.

References mlir::RewritePatternSet::getContext(), and mlir::RewritePatternSet::insert().

◆ populateDecomposeConvolutionPatterns()

void mlir::linalg::populateDecomposeConvolutionPatterns ( RewritePatternSet patterns,
PatternBenefit  benefit = 1 
)

Linalg decompose convolutions patterns.

Populates patterns to decompose high-D convolution ops into low-D ones. This is a step in progressive lowering for convolution ops; afterwards, the low-D convolution ops can be vectorized.

Definition at line 1744 of file Transforms.cpp.

References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().

◆ populateDecomposeLinalgOpsPattern()

void mlir::linalg::populateDecomposeLinalgOpsPattern ( RewritePatternSet patterns,
bool  removeDeadArgsAndResults = true 
)

Populate patterns for splitting a LinalgOp with multiple statements within its payload into multiple GenericOp that have a single statement.

The option removeDeadArgsAndResults adds patterns to remove dead arguments and results from the generated decomposed ops. It defaults to true since the core decomposition patterns rely on these cleanup patterns. It is set to false only for testing purposes.

Definition at line 379 of file DecomposeLinalgOps.cpp.

References mlir::RewritePatternSet::getContext(), mlir::RewritePatternSet::insert(), and populateEraseUnusedOperandsAndResultsPatterns().

◆ populateElementwiseOpsFusionPatterns()

void mlir::linalg::populateElementwiseOpsFusionPatterns ( RewritePatternSet patterns,
const ControlFusionFn controlElementwiseOpFusion 
)

Patterns for fusing linalg operations on tensors.

Pattern to fuse linalg.generic -> linalg.generic operations when both operations are fusable elementwise operations.

Definition at line 1842 of file ElementwiseOpFusion.cpp.

References mlir::RewritePatternSet::add(), mlir::RewritePatternSet::getContext(), and populateEraseUnusedOperandsAndResultsPatterns().

◆ populateElementwiseToLinalgConversionPatterns()

void mlir::linalg::populateElementwiseToLinalgConversionPatterns ( RewritePatternSet patterns)

Populate patterns that convert ElementwiseMappable ops to linalg parallel loops.

Definition at line 120 of file ElementwiseToLinalg.cpp.

References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().

◆ populateEraseUnnecessaryInputsPatterns()

void mlir::linalg::populateEraseUnnecessaryInputsPatterns ( RewritePatternSet patterns)

Patterns to promote inputs to outputs and remove unused inputs of linalg.generic ops.

Definition at line 421 of file EraseUnusedOperandsAndResults.cpp.

References mlir::RewritePatternSet::getContext(), and mlir::RewritePatternSet::insert().

◆ populateEraseUnusedOperandsAndResultsPatterns()

void mlir::linalg::populateEraseUnusedOperandsAndResultsPatterns ( RewritePatternSet patterns)

Pattern to remove dead operands and results of linalg.generic operations.

This is effectively DCE for a linalg op.

Definition at line 414 of file EraseUnusedOperandsAndResults.cpp.

References mlir::RewritePatternSet::getContext(), and mlir::RewritePatternSet::insert().

Referenced by populateDecomposeLinalgOpsPattern(), and populateElementwiseOpsFusionPatterns().

◆ populateExtractOpVectorizationPatterns()

void mlir::linalg::populateExtractOpVectorizationPatterns ( RewritePatternSet patterns,
PatternBenefit  baseBenefit = 1 
)

◆ populateFoldReshapeOpsByCollapsingPatterns()

void mlir::linalg::populateFoldReshapeOpsByCollapsingPatterns ( RewritePatternSet patterns,
const ControlFusionFn controlFoldingReshapes 
)

Patterns to fold an expanding tensor.expand_shape operation with its producer generic operation by collapsing the dimensions of the generic op.

Definition at line 1835 of file ElementwiseOpFusion.cpp.

References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().

◆ populateFoldReshapeOpsByExpansionPatterns()

void mlir::linalg::populateFoldReshapeOpsByExpansionPatterns ( RewritePatternSet patterns,
const ControlFusionFn controlFoldingReshapes 
)

Patterns to fold an expanding (collapsing) tensor_reshape operation with its producer (consumer) generic operation by expanding the dimensionality of the loop in the generic op.

Definition at line 1826 of file ElementwiseOpFusion.cpp.

References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().

◆ populateFoldUnitExtentDimsViaReshapesPatterns()

void mlir::linalg::populateFoldUnitExtentDimsViaReshapesPatterns ( RewritePatternSet patterns)

Patterns to fold unit-extent dimensions in operands/results of linalg ops on tensors via reassociative reshape ops.

Patterns that are used to canonicalize the use of unit-extent dims for broadcasting.

Definition at line 663 of file DropUnitDims.cpp.

References mlir::RewritePatternSet::add(), mlir::RewritePatternSet::getContext(), mlir::tensor::populateFoldTensorEmptyPatterns(), mlir::memref::populateResolveRankedShapedTypeResultDimsPatterns(), and mlir::memref::populateResolveShapedTypeResultDimsPatterns().

◆ populateFoldUnitExtentDimsViaSlicesPatterns()

void mlir::linalg::populateFoldUnitExtentDimsViaSlicesPatterns ( RewritePatternSet patterns)

◆ populateFuseTensorPadWithProducerLinalgOpPatterns()

void mlir::linalg::populateFuseTensorPadWithProducerLinalgOpPatterns ( RewritePatternSet patterns)

Pattern to fuse a tensor.pad operation with the producer of its source, if the producer is a linalg operation with all parallel iterator types.

Definition at line 121 of file FusePadOpWithLinalgProducer.cpp.

References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().

◆ populateInlineConstantOperandsPatterns()

void mlir::linalg::populateInlineConstantOperandsPatterns ( RewritePatternSet patterns)

Patterns that are used to inline constant operands into linalg generic ops.

Definition at line 94 of file InlineScalarOperands.cpp.

References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().

◆ populateLinalgNamedOpConversionPatterns()

void mlir::linalg::populateLinalgNamedOpConversionPatterns ( RewritePatternSet patterns)

Patterns to convert from one named op to another.

These can be seen as canonicalizations of named ops into another named op.

Definition at line 160 of file NamedOpConversions.cpp.

References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().

◆ populateLinalgNamedOpsGeneralizationPatterns()

void mlir::linalg::populateLinalgNamedOpsGeneralizationPatterns ( RewritePatternSet patterns)

Linalg generalization patterns.

Populates patterns with patterns to convert spec-generated named ops to linalg.generic ops.

Definition at line 91 of file Generalization.cpp.

References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().

◆ populateLinalgTilingCanonicalizationPatterns()

void mlir::linalg::populateLinalgTilingCanonicalizationPatterns ( RewritePatternSet patterns)

◆ populateLinalgToStandardConversionPatterns()

void mlir::linalg::populateLinalgToStandardConversionPatterns ( RewritePatternSet patterns)

Populate the given list with patterns that convert from Linalg to Standard.

Definition at line 127 of file LinalgToStandard.cpp.

References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().

◆ populateMoveInitOperandsToInputPattern()

void mlir::linalg::populateMoveInitOperandsToInputPattern ( RewritePatternSet patterns)

A pattern that converts init operands to input operands.

Definition at line 696 of file DropUnitDims.cpp.

References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().

◆ populatePadOpVectorizationPatterns()

void mlir::linalg::populatePadOpVectorizationPatterns ( RewritePatternSet patterns,
PatternBenefit  baseBenefit = 1 
)

Populates patterns with patterns that vectorize tensor.pad.

These patterns are meant to apply in a complementary fashion. Benefits are used to encode a certain ordering of pattern application. To avoid scattering magic constants throughout the code base, the patterns must be added with this function. baseBenefit can be used to offset the benefit of all tensor::PadOp vectorization patterns by a certain value.

Definition at line 2070 of file Vectorization.cpp.

References mlir::RewritePatternSet::add(), mlir::PatternBenefit::getBenefit(), and mlir::RewritePatternSet::getContext().

◆ populateSparseTensorRewriting()

void mlir::linalg::populateSparseTensorRewriting ( RewritePatternSet patterns)

Populate patterns that are only useful in the context of sparse tensors.

◆ populateSplitReductionPattern()

void mlir::linalg::populateSplitReductionPattern ( RewritePatternSet patterns,
const ControlSplitReductionFn controlSplitReductionFn,
bool  useAlloc = false 
)

Patterns to apply splitReduction below.

Definition at line 447 of file SplitReduction.cpp.

References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().

◆ populateSwapExtractSliceWithFillPatterns()

void mlir::linalg::populateSwapExtractSliceWithFillPatterns ( RewritePatternSet patterns)

Adds patterns that swap tensor.extract_slice(linalg.fill(cst, init)) into linalg.fill(cst, tensor.extract_slice(init)).

Definition at line 38 of file SwapExtractSliceWithFillPatterns.cpp.

References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().

◆ promoteSubviewAsNewBuffer()

FailureOr< PromotionInfo > mlir::linalg::promoteSubviewAsNewBuffer ( OpBuilder b,
Location  loc,
memref::SubViewOp  subView,
const AllocBufferCallbackFn allocationFn,
DataLayout layout 
)

◆ promoteSubViews()

FailureOr< LinalgOp > mlir::linalg::promoteSubViews ( OpBuilder b,
LinalgOp  op,
const LinalgPromotionOptions options 
)

Promote the subViews into a new buffer allocated at the insertion point b.

Promotion occurs in 3 steps:

  1. Create a new buffer for a full tile (i.e. not clipped at the boundary).
  2. Take a full view on the buffer.
  3. Take a partial slice of the full view in step 2. and copy into it.

Return the modified linalg op (the modification happens in place) as well as all the copy ops created.
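The three steps can be illustrated with a one-dimensional standalone C++ model. It is only a sketch of the idea: a std::vector stands in for the allocated buffer, a size stands in for the partial view, and the fill value used for the unwritten part of the full tile is an assumption of this sketch. Promoted and promoteTile are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result: the full-tile buffer plus the size of the partial
// slice that actually holds data copied from the source.
struct Promoted {
  std::vector<int> fullBuffer;
  std::size_t partialSize;
};

inline Promoted promoteTile(const std::vector<int> &src, std::size_t offset,
                            std::size_t tileSize, int fillValue = 0) {
  assert(offset <= src.size());
  // Step 1: allocate a buffer for a full tile, not clipped at the boundary.
  // Step 2: the whole vector plays the role of the full view.
  Promoted p{std::vector<int>(tileSize, fillValue),
             std::min(tileSize, src.size() - offset)};
  // Step 3: copy the in-bounds data into the partial slice of the full view.
  std::copy_n(src.begin() + static_cast<std::ptrdiff_t>(offset), p.partialSize,
              p.fullBuffer.begin());
  return p;
}
```

Promoting a tile of size 4 at offset 4 of a 5-element source gives a 4-element buffer in which only the first element comes from the source.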

Definition at line 395 of file Promotion.cpp.

References mlir::failed(), mlir::failure(), options, and promoteSubViews().

◆ promoteSubviewsPrecondition()

LogicalResult mlir::linalg::promoteSubviewsPrecondition ( Operation op,
LinalgPromotionOptions  options 
)

Promote memref.subviews feeding linalg-on-buffers operations.

Definition at line 373 of file Promotion.cpp.

◆ registerBufferizableOpInterfaceExternalModels()

void mlir::linalg::registerBufferizableOpInterfaceExternalModels ( DialectRegistry registry)

◆ registerTilingInterfaceExternalModels()

void mlir::linalg::registerTilingInterfaceExternalModels ( DialectRegistry registry)

◆ registerTransformDialectExtension()

void mlir::linalg::registerTransformDialectExtension ( DialectRegistry registry)

Definition at line 56 of file DialectExtension.cpp.

References mlir::DialectRegistry::addExtensions().

Referenced by mlir::registerAllDialects().

◆ registerValueBoundsOpInterfaceExternalModels()

void mlir::linalg::registerValueBoundsOpInterfaceExternalModels ( DialectRegistry registry)

◆ rewriteAsPaddedOp()

FailureOr< SmallVector< Value > > mlir::linalg::rewriteAsPaddedOp ( RewriterBase rewriter,
LinalgOp  opToPad,
ArrayRef< int64_t >  paddingDimensions,
ArrayRef< int64_t >  padToMultipleOf,
ArrayRef< Attribute >  paddingValues,
ArrayRef< bool >  packPaddings,
LinalgOp &  paddedOp 
)

Pad the iterator dimensions paddingDimensions of all opToPad operands to a static bounding box.

padToMultipleOf indicates that each padding dimension should be padded to the specified multiple. If the derived padding sizes should not be rounded up to any multiple, use "1". Use paddingValues and packPaddings to set padding value and nofold attribute of the created tensor::PadOps, respectively. Update paddedOp to the cloned operation with statically shaped paddingDimensions and return the extracted dynamically shaped results. If padding fails, return failure.
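The rounding implied by padToMultipleOf can be sketched in standalone C++. The sketch assumes one multiple per entry of paddingDimensions, in the same order, and works on already known sizes, whereas the real transformation derives a static bounding box first. roundUpToMultiple and paddedSizes are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Rounds size up to the next multiple of `multiple`; a multiple of 1 leaves
// the size unchanged.
inline std::int64_t roundUpToMultiple(std::int64_t size, std::int64_t multiple) {
  assert(size >= 0 && multiple > 0);
  return ((size + multiple - 1) / multiple) * multiple;
}

// Pads only the listed iterator dimensions; all other sizes are untouched.
inline std::vector<std::int64_t>
paddedSizes(std::vector<std::int64_t> sizes,
            const std::vector<std::size_t> &paddingDimensions,
            const std::vector<std::int64_t> &padToMultipleOf) {
  assert(paddingDimensions.size() == padToMultipleOf.size());
  for (std::size_t i = 0; i < paddingDimensions.size(); ++i) {
    std::size_t dim = paddingDimensions[i];
    assert(dim < sizes.size());
    sizes[dim] = roundUpToMultiple(sizes[dim], padToMultipleOf[i]);
  }
  return sizes;
}
```

With sizes [5, 7, 12], paddingDimensions = [0, 2] and padToMultipleOf = [4, 1], the result is [8, 7, 12]: dimension 0 is rounded up to 8, dimension 1 is not padded, and dimension 2 keeps its size because the multiple is 1.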

Definition at line 157 of file Transforms.cpp.

References mlir::clone(), mlir::OpBuilder::create(), DBGS, mlir::detail::enumerate(), mlir::failed(), mlir::Builder::getIndexAttr(), mlir::Value::getType(), mlir::ValueRange::getTypes(), mlir::RewriterBase::notifyMatchFailure(), padOperandToSmallestStaticBoundingBox(), mlir::reifyResultShapes(), and mlir::OpBuilder::setInsertionPointAfter().

Referenced by padAndHoistLinalgOp().

◆ rewriteInDestinationPassingStyle() [1/3]

FailureOr< Operation * > mlir::linalg::rewriteInDestinationPassingStyle ( RewriterBase &  rewriter,
tensor::FromElementsOp  fromElementsOp 
)

Rewrite tensor.from_elements to linalg.generic.

Lower tensor.from_elements to a sequence of chained tensor.insert.

Definition at line 205 of file ConvertToDestinationStyle.cpp.

References mlir::OpBuilder::create(), createInserts(), mlir::Value::getDefiningOp(), mlir::RewriterBase::replaceOp(), and mlir::RewriterBase::replaceOpWithNewOp().

◆ rewriteInDestinationPassingStyle() [2/3]

FailureOr< Operation * > mlir::linalg::rewriteInDestinationPassingStyle ( RewriterBase &  rewriter,
tensor::GenerateOp  generateOp 
)

◆ rewriteInDestinationPassingStyle() [3/3]

FailureOr< Operation * > mlir::linalg::rewriteInDestinationPassingStyle ( RewriterBase &  rewriter,
tensor::PadOp  padOp 
)

◆ rewriteInIm2Col() [1/3]

FailureOr< std::pair< Operation *, Operation * > > mlir::linalg::rewriteInIm2Col ( RewriterBase &  rewriter,
linalg::Conv2DNchwFchwOp  convOp 
)

Similar to rewriteInIm2Col with linalg::Conv2DNhwcHwcfOp, except that, because the channels are to the left of the image shape dimensions, the position of the contraction dimension in the resulting matmul is reversed.

This swaps the LHS and RHS of the matmul when compared with nhwc, i.e. (D, C x Kh x Kw) * (C x Kh x Kw, Ho x Wo).

Definition at line 362 of file ConvertConv2DToImg2Col.cpp.

References mlir::bindDims(), mlir::OpBuilder::create(), createAdd(), createMul(), mlir::AffineMap::get(), mlir::get(), mlir::Builder::getContext(), getConvolvedIndex(), mlir::Value::getLoc(), mlir::AffineMap::getMultiDimIdentityMap(), mlir::Value::getType(), hasAllOneValues(), mlir::RewriterBase::notifyMatchFailure(), mlir::RewriterBase::replaceOp(), and unrollIndex().

◆ rewriteInIm2Col() [2/3]

FailureOr< std::pair< Operation *, Operation * > > mlir::linalg::rewriteInIm2Col ( RewriterBase &  rewriter,
linalg::Conv2DNhwcHwcfOp  convOp 
)

Convert linalg.conv_2d_nhwc_hwcf into linalg.generic (for img2col packing) and linalg.matmul.

A convolution operation can be written as a matrix-matrix multiplication by unfolding the cross-correlation between input and filter and explicitly copying the overlapping sliding-window inputs.

Consider a 2D input X with a single input and output channel and a 2x2 filter W:

[x(0, 0)  , x(0, 1)  , ..., x(0, n-1)  ]
[x(1, 0)  , x(1, 1)  , ..., x(1, n-1)  ]
[.        , .        , .  , .          ]           [w(0, 0), w(0, 1)]
[.        , .        , .  , .          ]   (conv)  [w(1, 0), w(1, 1)]
[.        , .        , .  , .          ]
[x(n-1, 0), x(n-1, 1), ..., x(n-1, n-1)]

The packed input data (img2col) is a matrix with |rows| = output spatial size and |columns| = filter spatial size. To compute the output Y(i, j), we need the dot product between the filter window at input X(x, y) and the filter. This looks like the following, where the l.h.s. is the img2col matrix and the r.h.s. is the flattened filter:

[x(0,0), x(0,1), x(1,0), x(1,1)]
[x(0,1), x(0,2), x(1,1), x(1,2)]   (matmul)   [w(0,0), w(0,1), w(1,0), w(1,1)]^T
[x(0,2), x(0,3), x(1,2), x(1,3)]
[   .  ,    .  ,    .  ,    .  ]

In general, for the 2D case with an (N, H, W, C) input, a (Kh, Kw, C, D) filter and an (N, Ho, Wo, D) output, the convolution is the matrix-matrix multiplication (Ho x Wo, Kh x Kw x C) * (Kh x Kw x C, D) for each of the N inputs. For N > 1, it is a batched matrix-matrix multiplication.

On success, return both the operation that produces the img2col tensor and the final operation of the sequence that replaces the original convolution.
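
The img2col idea described above can be sketched in plain C++ for the simplest case: a single-channel input, one filter, stride 1 and no padding. The names Matrix, im2col and convViaIm2col are hypothetical and exist only in this illustration; the sketch operates on nested vectors and does not use or mirror the MLIR rewrite itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// Build the img2col matrix for a single-channel input `x` and a kh x kw
// window (stride 1, no padding): one row per output position, one column per
// filter element.
inline Matrix im2col(const Matrix &x, std::size_t kh, std::size_t kw) {
  assert(kh >= 1 && kw >= 1 && x.size() >= kh && x[0].size() >= kw);
  const std::size_t ho = x.size() - kh + 1;
  const std::size_t wo = x[0].size() - kw + 1;
  Matrix cols;
  cols.reserve(ho * wo);
  for (std::size_t i = 0; i < ho; ++i)
    for (std::size_t j = 0; j < wo; ++j) {
      std::vector<float> row;
      row.reserve(kh * kw);
      for (std::size_t r = 0; r < kh; ++r)
        for (std::size_t c = 0; c < kw; ++c)
          row.push_back(x[i + r][j + c]);
      cols.push_back(row);
    }
  return cols;
}

// Compute the convolution as (img2col matrix) * (flattened filter). The
// result holds the Ho x Wo outputs in row-major order.
inline std::vector<float> convViaIm2col(const Matrix &x, const Matrix &w) {
  assert(!w.empty() && !w[0].empty());
  const Matrix cols = im2col(x, w.size(), w[0].size());
  std::vector<float> flatFilter;
  for (const auto &row : w)
    flatFilter.insert(flatFilter.end(), row.begin(), row.end());
  std::vector<float> y;
  y.reserve(cols.size());
  for (const auto &row : cols) {
    float acc = 0.0f;
    for (std::size_t k = 0; k < row.size(); ++k)
      acc += row[k] * flatFilter[k];
    y.push_back(acc);
  }
  return y;
}
```

For a 3x3 input and a 2x2 filter, im2col yields a 4x4 matrix (four output positions, four filter elements), and each output is one row of that matrix dotted with the flattened filter.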

Definition at line 76 of file ConvertConv2DToImg2Col.cpp.

References mlir::bindDims(), mlir::OpBuilder::create(), createAdd(), createMul(), mlir::AffineMap::get(), mlir::get(), mlir::Builder::getContext(), getConvolvedIndex(), mlir::AffineMap::getMultiDimIdentityMap(), mlir::Value::getType(), hasAllOneValues(), mlir::RewriterBase::notifyMatchFailure(), mlir::RewriterBase::replaceOp(), and unrollIndex().

◆ rewriteInIm2Col() [3/3]

FailureOr< std::pair< Operation *, Operation * > > mlir::linalg::rewriteInIm2Col ( RewriterBase &  rewriter,
linalg::DepthwiseConv2DNhwcHwcOp  convOp 
)

Similar to rewriteInIm2Col with linalg::Conv2DNhwcHwcfOp, except that there is no reduction among the input channels, so each convolution can be a matrix-vector product. By transposing both the input and the filter so that channels are outermost, the computation becomes a batched matrix-vector product.

Definition at line 211 of file ConvertConv2DToImg2Col.cpp.

References mlir::bindDims(), mlir::OpBuilder::create(), mlir::AffineMap::get(), mlir::get(), mlir::Builder::getAffineConstantExpr(), mlir::Builder::getAffineDimExpr(), mlir::Builder::getContext(), mlir::AffineMap::getMultiDimIdentityMap(), mlir::Operation::getResult(), mlir::Value::getType(), hasAllOneValues(), mlir::inversePermutation(), mlir::RewriterBase::notifyMatchFailure(), and mlir::RewriterBase::replaceOp().

◆ splitOp()

std::pair< TilingInterface, TilingInterface > mlir::linalg::splitOp ( RewriterBase &  rewriter,
TilingInterface  op,
unsigned  dimension,
OpFoldResult  splitPoint 
)

Split the given op into two parts along the given iteration space dimension at the specified splitPoint, and return the two parts.

If the second part is statically known to be empty, do not create it and return nullptr instead. Error state is signalled by returning a pair of nullptrs.

For example, the following op:

linalg.matmul ins(%0, %1 : tensor<128x32xf32>, tensor<32x64xf32>) outs(%2 : tensor<128x64xf32>)

split along the first dimension at position 42 will result in:

%3 = tensor.extract_slice %0[0, 0][42, 32][1, 1]
%4 = tensor.extract_slice %2[0, 0][42, 64][1, 1]
%5 = linalg.matmul ins(%3, %1 : tensor<42x32xf32>, tensor<32x64xf32>)
                   outs(%4 : tensor<42x64xf32>)
%6 = tensor.insert_slice %5 into %2[0, 0][42, 64][1, 1]

%7 = tensor.extract_slice %0[42, 0][86, 32][1, 1]
%8 = tensor.extract_slice %6[42, 0][86, 64][1, 1]
%9 = linalg.matmul ins(%7, %1 : tensor<86x32xf32>, tensor<32x64xf32>)
                   outs(%8 : tensor<86x64xf32>)
tensor.insert_slice %9 into %6[42, 0][86, 64][1, 1]

Note that there is no simplification other than constant propagation applied to slice extraction and insertion.
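
The sizes of the two parts in the example (42 and 86 out of 128) follow from simple arithmetic on the split dimension. The sketch below shows that arithmetic for a static size; SplitSizes, splitDimension and secondPartIsEmpty are hypothetical names, and clamping the split point to the dimension size is an assumption made for this illustration rather than a statement about the actual implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Sizes of the two parts along the split dimension.
struct SplitSizes {
  int64_t first;
  int64_t second;
};

// Hypothetical helper, not an MLIR API: split a dimension of static size
// `dimSize` at `splitPoint`. The split point is clamped to [0, dimSize]
// (an assumption of this sketch).
inline SplitSizes splitDimension(int64_t dimSize, int64_t splitPoint) {
  assert(dimSize >= 0 && "expected a non-negative static size");
  const int64_t clamped =
      std::min(std::max<int64_t>(splitPoint, 0), dimSize);
  return {clamped, dimSize - clamped};
}

// The second part is empty when the split point reaches the end of the
// dimension; in that case no second op would be needed.
inline bool secondPartIsEmpty(const SplitSizes &sizes) {
  return sizes.second == 0;
}
```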

Definition at line 67 of file Split.cpp.

◆ splitReduction()

FailureOr< SplitReductionResult > mlir::linalg::splitReduction ( RewriterBase &  b,
LinalgOp  op,
const ControlSplitReductionFn &  controlSplitReductionFn,
bool  useAlloc = false 
)

Definition at line 30 of file SplitReduction.cpp.

◆ splitReductionByScaling()

FailureOr< SplitReductionResult > mlir::linalg::splitReductionByScaling ( RewriterBase &  b,
LinalgOp  op,
const ControlSplitReductionFn &  controlSplitReductionFn,
bool  useAlloc = false 
)

Scaling-based implementation of the split reduction transformation.

Core rewrite implementation.

Instead of introducing an ExpandShapeOp, this rewrites a reduction dimension k into k * scale + kk.

Example:

```
%0 = linalg.matmul ins(%A, %B: tensor<16x256xf32>, tensor<256x32xf32>)
  outs(%C: tensor<16x32xf32>) -> tensor<16x32xf32>
```

Is transformed to:

```
#map0 = affine_map<(d0, d1, d2, d3) -> (d0, d2 * 4 + d3)>
#map1 = affine_map<(d0, d1, d2, d3) -> (d2 * 4 + d3, d1)>
#map2 = affine_map<(d0, d1, d2, d3) -> (d2, d3)>
#map3 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2)>
#map4 = affine_map<(d0, d1, d2) -> (d0, d1, d2)>
#map5 = affine_map<(d0, d1, d2) -> (d0, d1)>
%0 = tensor.empty [16, 32, 64] : tensor<16x32x64xf32>
%cst = arith.constant 0.000000e+00 : f32
%1 = linalg.fill ins(%cst : f32) outs(%0 : tensor<16x32x64xf32>) ->
  tensor<16x32x64xf32>
%2 = tensor.empty [64, 4] : tensor<64x4xi1>

%3 = linalg.generic {indexing_maps = [#map0, #map1, #map2, #map3],
  iterator_types = ["parallel", "parallel", "parallel", "reduction"]}
  ins(%A, %B, %2 : tensor<16x256xf32>, tensor<256x32xf32>, tensor<64x4xi1>)
  outs(%1 : tensor<16x32x64xf32>) {
    ^bb0(%arg3: f32, %arg4: f32, %arg5: i1, %arg6: f32):
      %5 = arith.mulf %arg3, %arg4 : f32
      %6 = arith.addf %arg6, %5 : f32
      linalg.yield %6 : f32
} -> tensor<16x32x64xf32>

%4 = linalg.generic {indexing_maps = [#map4, #map5],
  iterator_types = ["parallel", "parallel", "reduction"]}
  ins(%3 : tensor<16x32x64xf32>)
  outs(%C : tensor<16x32xf32>) {
    ^bb0(%arg3: f32, %arg4: f32):
      %5 = arith.addf %arg3, %arg4 : f32
      linalg.yield %5 : f32
} -> tensor<16x32xf32>

return %4 : tensor<16x32xf32>
```
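
The index rewrite k = d2 * scale + d3 used in the example can be checked on a plain dot product. The sketch below is an illustration on std::vector data only; dotReference and dotSplitByScaling are hypothetical names, and integer elements are used so that the two results compare exactly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference: reduce directly over the original dimension k.
inline int64_t dotReference(const std::vector<int64_t> &a,
                            const std::vector<int64_t> &b) {
  assert(a.size() == b.size());
  int64_t acc = 0;
  for (std::size_t k = 0; k < a.size(); ++k)
    acc += a[k] * b[k];
  return acc;
}

// Split form: k is rewritten as d2 * scale + d3. The inner loop over d3 is
// the remaining reduction; d2 becomes an extra parallel dimension holding
// partial results, which a second reduction then merges.
inline int64_t dotSplitByScaling(const std::vector<int64_t> &a,
                                 const std::vector<int64_t> &b,
                                 std::size_t scale) {
  assert(a.size() == b.size() && scale > 0 && a.size() % scale == 0);
  const std::size_t outer = a.size() / scale;
  std::vector<int64_t> partial(outer, 0); // neutral element of the addition
  for (std::size_t d2 = 0; d2 < outer; ++d2)
    for (std::size_t d3 = 0; d3 < scale; ++d3)
      partial[d2] += a[d2 * scale + d3] * b[d2 * scale + d3];
  int64_t acc = 0;
  for (int64_t value : partial) // final reduction over the new dimension
    acc += value;
  return acc;
}
```

In the matmul example above, the reduction dimension of size 256 is split with a scale of 4, so the partial-result dimension has size 64.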

Definition at line 241 of file SplitReduction.cpp.

◆ tileLinalgOp()

FailureOr< TiledLinalgOp > mlir::linalg::tileLinalgOp ( RewriterBase &  b,
LinalgOp  op,
const LinalgTilingOptions &  options 
)

Definition at line 840 of file Tiling.cpp.

◆ tileReductionUsingForall()

FailureOr< linalg::ForallReductionTilingResult > mlir::linalg::tileReductionUsingForall ( RewriterBase &  b,
PartialReductionOpInterface  op,
ArrayRef< OpFoldResult >  numThreads,
ArrayRef< OpFoldResult >  tileSizes = {},
std::optional< ArrayAttr >  mapping = std::nullopt 
)

Method to tile a reduction to parallel iterations computing partial reductions.

After the loop, all the partial reductions are merged into a final reduction. For example, the following sequence

%0 = linalg.generic %in ["parallel", "reduction"]
  : tensor<7x9xf32> -> tensor<7xf32>

is tiled into:

%0 = linalg.fill ... : tensor<7x4xf32>
%1 = scf.forall (%iv) in (%c4) shared_outs(%arg0 = %0)
  -> (tensor<7x4xf32>) {
  %2 = tensor.extract_slice %arg0 : tensor<7x4xf32> to tensor<7xf32>
  %3 = tensor.extract_slice %in : tensor<7x9xf32> -> tensor<7x?xf32>
  %4 = linalg.generic %2, %3 ["parallel", "reduction"]
    : tensor<7x?xf32> -> tensor<7xf32>
  %5 = tensor.insert_slice %4, %arg0[0, %iv] : tensor<7x4xf32>
}
%6 = linalg.generic %1 ["parallel", "reduction"]
  : tensor<7x4xf32> -> tensor<7xf32>
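
The structure of this transformation (fill with the neutral element, independent partial reductions, final merge) can be sketched on a row-sum over nested vectors. The function rowSumsWithPartials is a hypothetical name, and distributing columns to threads in contiguous chunks of ceil(cols / numThreads) is an assumption of this sketch, not a description of how the pass assigns slices.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration: sum each row of `in` using `numThreads`
// independent partial reductions followed by a final merge.
inline std::vector<int64_t>
rowSumsWithPartials(const std::vector<std::vector<int64_t>> &in,
                    std::size_t numThreads) {
  assert(numThreads > 0 && !in.empty());
  const std::size_t rows = in.size();
  const std::size_t cols = in[0].size();
  const std::size_t tile = (cols + numThreads - 1) / numThreads;

  // One partial result per (row, thread), initialized to the neutral
  // element of the reduction (the role of linalg.fill above).
  std::vector<std::vector<int64_t>> partial(
      rows, std::vector<int64_t>(numThreads, 0));

  // Each iteration is independent of the others (the role of scf.forall).
  for (std::size_t t = 0; t < numThreads; ++t) {
    const std::size_t begin = std::min(t * tile, cols);
    const std::size_t end = std::min(begin + tile, cols);
    for (std::size_t r = 0; r < rows; ++r)
      for (std::size_t c = begin; c < end; ++c)
        partial[r][t] += in[r][c];
  }

  // Final reduction that merges the partial results.
  std::vector<int64_t> out(rows, 0);
  for (std::size_t r = 0; r < rows; ++r)
    for (std::size_t t = 0; t < numThreads; ++t)
      out[r] += partial[r][t];
  return out;
}
```

A thread whose chunk is empty leaves its partial result at the neutral element, so the merged result is unaffected.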

Definition at line 616 of file Tiling.cpp.

◆ tileToForallOp()

FailureOr< ForallTilingResult > mlir::linalg::tileToForallOp ( RewriterBase &  builder,
TilingInterface  op,
ArrayRef< OpFoldResult >  numThreads,
std::optional< ArrayAttr >  mapping 
)

Definition at line 432 of file Tiling.cpp.

Referenced by mlir::transform::tileToForallOpImpl().

◆ tileToForallOpUsingTileSizes()

FailureOr< ForallTilingResult > mlir::linalg::tileToForallOpUsingTileSizes ( RewriterBase &  builder,
TilingInterface  op,
ArrayRef< OpFoldResult >  tileSizes,
std::optional< ArrayAttr >  mapping 
)

Same as tileToForallOp, but calculate the number of threads required using the given tileSizes.
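
One plausible way to derive a thread count from a tile size is a ceiling division of the loop range by the tile size. The sketch below shows that arithmetic for static sizes; ceilDiv and numThreadsFromTileSizes are hypothetical helpers, and both the formula and the requirement of strictly positive tile sizes are assumptions of this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Ceiling division for a non-negative numerator and positive denominator.
inline int64_t ceilDiv(int64_t numerator, int64_t denominator) {
  assert(numerator >= 0 && denominator > 0);
  return (numerator + denominator - 1) / denominator;
}

// Hypothetical helper, not an MLIR API: number of threads needed per loop
// so that tiles of size `tileSizes[i]` cover a range of `loopSizes[i]`.
inline std::vector<int64_t>
numThreadsFromTileSizes(const std::vector<int64_t> &loopSizes,
                        const std::vector<int64_t> &tileSizes) {
  assert(loopSizes.size() == tileSizes.size());
  std::vector<int64_t> numThreads;
  numThreads.reserve(loopSizes.size());
  for (std::size_t i = 0; i < loopSizes.size(); ++i)
    numThreads.push_back(ceilDiv(loopSizes[i], tileSizes[i]));
  return numThreads;
}
```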

Definition at line 441 of file Tiling.cpp.

Referenced by mlir::transform::tileToForallOpImpl().

◆ transformIndexOps()

void mlir::linalg::transformIndexOps ( RewriterBase &  b,
LinalgOp  op,
SmallVectorImpl< Value > &  ivs,
const LoopIndexToRangeIndexMap &  loopIndexToRangeIndex 
)

All indices returned by IndexOp should be invariant with respect to tiling.

Therefore, if an operation is tiled, we have to transform the indices accordingly, i.e. offset them by the values of the corresponding induction variables that are captured implicitly in the body of the op.

Example. linalg.generic before tiling:

#id_2d = (i, j) -> (i, j)
#pointwise_2d_trait = {
  indexing_maps = [#id_2d, #id_2d],
  iterator_types = ["parallel", "parallel"]
}
linalg.generic #pointwise_2d_trait %operand, %result {
  ^bb0(%operand_in: f32, %result_in: f32):
    %i = linalg.index 0 : index
    %j = linalg.index 1 : index
    <some operations that use %i, %j>
}: memref<50x100xf32>, memref<50x100xf32>

After a tiling pass with tile sizes 10 and 25:

#strided = (i, j)[s0, s1, s2] -> (i * s1 + s0 + j * s2)

%c1 = arith.constant 1 : index
%c0 = arith.constant 0 : index
%c25 = arith.constant 25 : index
%c10 = arith.constant 10 : index
%operand_dim_0 = dim %operand, 0 : memref<50x100xf32>
%operand_dim_1 = dim %operand, 1 : memref<50x100xf32>
scf.for %k = %c0 to %operand_dim_0 step %c10 {
  scf.for %l = %c0 to %operand_dim_1 step %c25 {
    %4 = memref.subview %operand[%k, %l][%c10, %c25][%c1, %c1]
      : memref<50x100xf32> to memref<?x?xf32, #strided>
    %5 = memref.subview %result[%k, %l][%c10, %c25][%c1, %c1]
      : memref<50x100xf32> to memref<?x?xf32, #strided>
    linalg.generic #pointwise_2d_trait %4, %5 {
    ^bb0(%operand_in: f32, %result_in: f32):
      %i = linalg.index 0 : index
      %j = linalg.index 1 : index
      // Indices %k and %l are implicitly captured in the body.
      %transformed_i = arith.addi %i, %k : index // index %i is offset by %k
      %transformed_j = arith.addi %j, %l : index // index %j is offset by %l
      // Every use of %i, %j is replaced with %transformed_i, %transformed_j
      <some operations that use %transformed_i, %transformed_j>
    }: memref<?x?xf32, #strided>, memref<?x?xf32, #strided>
  }
}
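
The invariance property can be checked with a small sketch: adding the tile loop induction variables to the tile-local indices reproduces exactly the index set of the untiled iteration space. Index2D, untiledIndices, tiledIndices and coversSameIndices are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using Index2D = std::pair<int64_t, int64_t>;

// Indices seen by the untiled op, in lexicographic order.
inline std::vector<Index2D> untiledIndices(int64_t rows, int64_t cols) {
  std::vector<Index2D> out;
  for (int64_t i = 0; i < rows; ++i)
    for (int64_t j = 0; j < cols; ++j)
      out.emplace_back(i, j);
  return out;
}

// Indices seen by the tiled op: the tile-local indices (i, j) are offset by
// the induction variables (k, l) of the enclosing tile loops.
inline std::vector<Index2D> tiledIndices(int64_t rows, int64_t cols,
                                         int64_t tileRows, int64_t tileCols) {
  assert(rows >= 0 && cols >= 0 && tileRows > 0 && tileCols > 0);
  std::vector<Index2D> out;
  for (int64_t k = 0; k < rows; k += tileRows)
    for (int64_t l = 0; l < cols; l += tileCols)
      for (int64_t i = 0; i < std::min(tileRows, rows - k); ++i)
        for (int64_t j = 0; j < std::min(tileCols, cols - l); ++j)
          out.emplace_back(i + k, j + l); // transformed_i, transformed_j
  return out;
}

// True if tiling visits the same set of indices as the untiled loop nest.
inline bool coversSameIndices(int64_t rows, int64_t cols, int64_t tileRows,
                              int64_t tileCols) {
  std::vector<Index2D> tiled = tiledIndices(rows, cols, tileRows, tileCols);
  std::sort(tiled.begin(), tiled.end());
  return tiled == untiledIndices(rows, cols);
}
```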

TODO: Investigate whether mixing implicit and explicit indices does not lead to losing information.

Definition at line 89 of file Tiling.cpp.

◆ unrollIndex()

static SmallVector<Value> mlir::linalg::unrollIndex ( OpBuilder &  b,
Location  loc,
Value  index,
ArrayRef< int64_t >  factors 
)

◆ updateBoundsForCyclicDistribution()

void mlir::linalg::updateBoundsForCyclicDistribution ( OpBuilder &  builder,
Location  loc,
Value  procId,
Value  nprocs,
Value &  lb,
Value &  ub,
Value &  step 
)

Update the lb, ub and step to get per processor lb, ub and step.
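
A common formulation of cyclic distribution gives processor procId the iterations lb + procId * step, lb + (procId + nprocs) * step, and so on. The sketch below shows that arithmetic on plain integers; LoopBounds, cyclicBounds and iterations are hypothetical names, and the exact formulas (new lb = lb + procId * step, new step = nprocs * step, ub unchanged) are stated here as an assumption about the intended semantics rather than a transcription of the implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct LoopBounds {
  int64_t lb;
  int64_t ub;
  int64_t step;
};

// Hypothetical helper, not an MLIR API: per-processor bounds under cyclic
// distribution. Each processor starts `procId` steps after the original
// lower bound and advances by `nprocs` original steps.
inline LoopBounds cyclicBounds(LoopBounds bounds, int64_t procId,
                               int64_t nprocs) {
  assert(nprocs > 0 && procId >= 0 && procId < nprocs && bounds.step > 0);
  return {bounds.lb + procId * bounds.step, bounds.ub, nprocs * bounds.step};
}

// Enumerate the iterations of a loop with the given bounds.
inline std::vector<int64_t> iterations(LoopBounds bounds) {
  assert(bounds.step > 0);
  std::vector<int64_t> out;
  for (int64_t i = bounds.lb; i < bounds.ub; i += bounds.step)
    out.push_back(i);
  return out;
}
```

For a loop from 0 to 10 with step 1 and three processors, the processors receive {0, 3, 6, 9}, {1, 4, 7} and {2, 5, 8}, which together cover the original iteration space without overlap.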

Definition at line 468 of file Utils.cpp.

References mlir::bindDims(), mlir::getAffineSymbolExpr(), mlir::Builder::getContext(), and mlir::affine::makeComposedAffineApply().

Referenced by mlir::linalg::GenerateLoopNest< LoopTy >::doit().

◆ vectorize()

LogicalResult mlir::linalg::vectorize ( RewriterBase &  rewriter,
Operation *  op,
ArrayRef< int64_t >  inputVectorSizes = {},
bool  vectorizeNDExtract = false,
bool  lastVectorSizeScalable = false 
)

Emit a suitable vector form for an operation.

If provided, inputVectorSizes are used to vectorize this operation. inputVectorSizes must match the rank of the iteration space of the operation and the input vector sizes must be greater than or equal to their counterpart iteration space sizes, if static. inputVectorSizes also allows the vectorization of operations with dynamic shapes.
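
The constraints on inputVectorSizes can be written down as a small check, following the "greater than or equal" wording above. This is a sketch over plain integer vectors: kDynamicSize and areValidInputVectorSizes are hypothetical names, the value -1 as a dynamic-size marker is an assumption of this illustration, and the actual precondition logic may check more than this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical marker for a dynamic iteration-space size in this sketch.
constexpr int64_t kDynamicSize = -1;

// Return true if `inputVectorSizes` is empty (nothing to check) or if it
// matches the rank of the iteration space and every entry is at least as
// large as the corresponding static iteration-space size. Dynamic sizes
// impose no lower bound.
inline bool
areValidInputVectorSizes(const std::vector<int64_t> &iterationSpace,
                         const std::vector<int64_t> &inputVectorSizes) {
  if (inputVectorSizes.empty())
    return true;
  if (inputVectorSizes.size() != iterationSpace.size())
    return false;
  for (std::size_t i = 0; i < iterationSpace.size(); ++i) {
    assert(inputVectorSizes[i] > 0 && "expected positive vector sizes");
    if (iterationSpace[i] == kDynamicSize)
      continue;
    if (inputVectorSizes[i] < iterationSpace[i])
      return false;
  }
  return true;
}
```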

Definition at line 1530 of file Vectorization.cpp.

◆ vectorizeCopy()

LogicalResult mlir::linalg::vectorizeCopy ( RewriterBase &  builder,
memref::CopyOp  copyOp 
)

◆ vectorizeOpPrecondition()

LogicalResult mlir::linalg::vectorizeOpPrecondition ( Operation *  op,
ArrayRef< int64_t >  inputVectorSizes = {},
bool  vectorizeNDExtract = false 
)

Return success if the operation can be vectorized.

Definition at line 1495 of file Vectorization.cpp.