MLIR
20.0.0git
Namespaces | |
detail | |
Classes | |
class | LinalgOpToLibraryCallRewrite |
struct | ContractionDimensions |
Positions of a Linalg op's loops that correspond to different kinds of contraction dimensions. More... | |
struct | ConvolutionDimensions |
Positions of a Linalg op's loops that correspond to different kinds of convolution dimensions. More... | |
struct | BufferizeToAllocationOptions |
struct | LinalgTilingOptions |
struct | LinalgTilingAndFusionOptions |
struct | LinalgPaddingOptions |
struct | LinalgPromotionOptions |
struct | SplitReductionOptions |
Split Reduction options. More... | |
struct | ControlDropUnitDims |
Transformation to drop unit-extent dimensions from linalg.generic operations. More... | |
struct | DropUnitDimsResult |
struct | ElementwiseOpFusionResult |
Fuse two linalg.generic operations that have a producer-consumer relationship captured through fusedOperand . More... | |
struct | TiledLinalgOp |
Perform standalone tiling of a single LinalgOp by tileSizes . More... | |
struct | PromotionInfo |
Create a new buffer using the allocationFn provided. More... | |
struct | MultiSizeSpecification |
A description of a multi-size tiling comprising tile sizes and numbers of tiles, expressed as Values which may or may not be constant. More... | |
struct | StaticMultiSizeSpecification |
struct | ContinuousTileSizeSpecification |
struct | StaticContinuousTileSizeSpecification |
struct | ForallReductionTilingResult |
Transformation information returned after reduction tiling. More... | |
struct | SplitReductionResult |
Apply transformation to split the single linalg op reduction into a parallel and reduction dimension. More... | |
struct | CollapseResult |
struct | LowerPackResult |
struct | LowerUnPackOpResult |
struct | PackResult |
Struct to hold the result of a pack call. More... | |
struct | PackTransposeResult |
Struct to hold the result of a packTranspose call. More... | |
struct | BlockPackMatmulOptions |
struct | DownscaleSizeOneWindowed2DConvolution |
Rewrites 2-D convolution ops with size-1 window dimensions into 1-D convolution ops. More... | |
struct | DownscaleDepthwiseConv2DNhwcHwcOp |
Rewrites 2-D depthwise convolution ops with size-1 (w, kw) or (h, kh) dimensions into 1-D depthwise convolution ops. More... | |
struct | DownscaleConv2DOp |
struct | LinalgGeneralizationPattern |
Linalg generalization pattern. More... | |
struct | LinalgSpecializationPattern |
struct | CopyVectorizationPattern |
Vectorization pattern for memref::CopyOp. More... | |
struct | GeneralizePadOpPattern |
Rewrite a tensor::PadOp into a sequence of EmptyOp, FillOp and InsertSliceOp. More... | |
struct | GeneralizeOuterUnitDimsPackOpPattern |
Rewrites a tensor::PackOp into a sequence of tensor.pad + linalg.transpose + tensor.insert_slice ops, where the tensor::PackOp has outer dims being all 1s. More... | |
struct | GeneralizeOuterUnitDimsUnPackOpPattern |
Rewrites a tensor::UnPackOp into a sequence of rank-reduced extract_slice op. More... | |
struct | LinalgCopyVTRForwardingPattern |
Match and rewrite for the pattern: More... | |
struct | LinalgCopyVTWForwardingPattern |
Match and rewrite for the pattern: More... | |
struct | ExtractSliceOfPadTensorSwapPattern |
Rewrite extract_slice(tensor.pad(x)) into tensor.pad(extract_slice(x)). More... | |
struct | SliceParameters |
A struct containing offsets-sizes-strides arguments of the tiled shape. More... | |
struct | FusionInfo |
A struct containing the Linalg producer before and after fusion. More... | |
struct | ProcInfo |
Callback function type used to get processor ID, and number of processors used for distribution for all parallel loops generated. More... | |
struct | LinalgLoopDistributionOptions |
Options that allow distribution of loops generated in Linalg transforms to processors while generating the loops. More... | |
struct | RegionMatcher |
A struct containing common matchers over linalg op's region. More... | |
struct | GenerateLoopNest |
Utility class used to generate nested loops with ranges described by loopRanges and loop type described by the iteratorTypes . More... | |
Typedefs | |
using | TileSizeComputationFunction = std::function< SmallVector< Value, 4 >(OpBuilder &, Operation *)> |
using | AllocBufferCallbackFn = std::function< std::optional< Value >(OpBuilder &b, memref::SubViewOp subView, ArrayRef< Value > boundingSubViewSize, DataLayout &layout)> |
Callback function type used to perform the allocation for the promoted subView . More... | |
using | DeallocBufferCallbackFn = std::function< LogicalResult(OpBuilder &b, Value buffer)> |
Callback function type used to deallocate the buffers used to hold the promoted subview. More... | |
using | CopyCallbackFn = std::function< LogicalResult(OpBuilder &b, Value src, Value dst)> |
Callback function type used to insert a copy from the original subview to the subview of the promoted region (for read operands), or from the subview of the promoted region back to the original subview (for results). More... | |
using | ControlSplitReductionFn = std::function< SplitReductionOptions(LinalgOp op)> |
Function signature to control reduction splitting. More... | |
using | LinalgLoops = SmallVector< Operation *, 4 > |
using | LoopIndexToRangeIndexMap = DenseMap< int, int > |
Creates a number of ranges equal to the number of non-zero entries in tileSizes. More... | |
using | ControlBlockPackMatmulFn = std::function< std::optional< BlockPackMatmulOptions >(linalg::LinalgOp)> |
Function type which is used to control matmul packing. More... | |
using | OptimizeCopyFn = std::function< LogicalResult(RewriterBase &, tensor::PadOp, Value)> |
using | ControlFusionFn = std::function< bool(OpOperand *fusedOperand)> |
Function type which is used to control when to stop fusion. More... | |
using | ControlPropagationFn = std::function< bool(OpOperand *opOperand)> |
Function type which is used to control propagation of tensor.pack/unpack ops. More... | |
using | GetCollapsableDimensionsFn = std::function< SmallVector< ReassociationIndices >(linalg::LinalgOp)> |
Function type to control generic op dimension collapsing. More... | |
using | ProcInfoCallBackFn = std::function< SmallVector< ProcInfo >(OpBuilder &b, Location loc, ArrayRef< Range > parallelLoopRanges)> |
using | MeshAxis = mesh::MeshAxis |
using | ReductionKind = mesh::ReductionKind |
using | MeshSharding = mesh::MeshSharding |
using | ShardingArray = mesh::ShardingArray |
using | MeshOp = mesh::MeshOp |
Enumerations | |
enum class | LinalgTilingLoopType { Loops = 0 , AffineLoops = 1 , ParallelLoops = 2 } |
The type of loops to be generated during tiling. More... | |
enum class | DistributionMethod { Cyclic = 0 , CyclicNumProcsGeNumIters = 1 , CyclicNumProcsEqNumIters = 2 , None = 3 } |
Scheme used to distribute loops to processors. More... | |
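As a rough illustration of what the distribution schemes could mean for a single loop, the sketch below lists the iterations one processor would execute. It is a standalone model in plain C++, not MLIR code: the local `Scheme` enum, the helper name `iterationsFor`, and the exact loop shape assumed for each scheme are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Local stand-in for the distribution schemes; not the MLIR enum itself.
enum class Scheme {
  Cyclic,
  CyclicNumProcsGeNumIters,
  CyclicNumProcsEqNumIters,
  None
};

// Returns the induction-variable values that processor `procId` (out of
// `numProcs`) would execute for `for (iv = lb; iv < ub; iv += step)`.
std::vector<int64_t> iterationsFor(Scheme scheme, int64_t lb, int64_t ub,
                                   int64_t step, int64_t procId,
                                   int64_t numProcs) {
  assert(step > 0 && numProcs > 0 && procId >= 0 && procId < numProcs);
  std::vector<int64_t> ivs;
  switch (scheme) {
  case Scheme::Cyclic:
    // Assumed: each processor starts at its own offset and strides by
    // step * numProcs, so iterations are dealt out round-robin.
    for (int64_t iv = lb + procId * step; iv < ub; iv += step * numProcs)
      ivs.push_back(iv);
    break;
  case Scheme::CyclicNumProcsGeNumIters: {
    // Assumed: at least as many processors as iterations, so each processor
    // runs at most one iteration, guarded by a bounds check.
    int64_t iv = lb + procId * step;
    if (iv < ub)
      ivs.push_back(iv);
    break;
  }
  case Scheme::CyclicNumProcsEqNumIters:
    // Assumed: exactly one iteration per processor, so no guard is needed.
    ivs.push_back(lb + procId * step);
    break;
  case Scheme::None:
    // No distribution: the loop is left as-is and runs every iteration.
    for (int64_t iv = lb; iv < ub; iv += step)
      ivs.push_back(iv);
    break;
  }
  return ivs;
}
```

Under these assumptions, processor 1 of 4 on the loop `[0, 10)` with step 1 and the `Cyclic` scheme would run iterations 1, 5 and 9.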
Functions | |
void | populateLinalgToStandardConversionPatterns (RewritePatternSet &patterns) |
Populate the given list with patterns that convert from Linalg to Standard. More... | |
std::string | generateLibraryCallName (Operation *op) |
Returns the name-mangled library call name to disambiguate between different overloads at the C level. More... | |
SmallVector< AffineExpr, 4 > | makeAffineDimExprs (unsigned num, unsigned &startIdx, MLIRContext *context) |
Returns num AffineDimExpr dimensions at positions [startIdx, startIdx + num) and increments startIdx to startIdx + num . More... | |
AffineMap | extractOrIdentityMap (std::optional< AffineMap > maybeMap, unsigned rank, MLIRContext *context) |
Returns maybeMap.get() if maybeMap is set, otherwise returns the symbol-less identity map of rank . More... | |
SmallVector< AffineExpr, 4 > | concat (ArrayRef< AffineExpr > a, ArrayRef< AffineExpr > b) |
Return the vector that is the concatenation of a and b . More... | |
Value | createOrFoldDimOp (OpBuilder &b, Location loc, Value val, int64_t dim) |
Create one memref::DimOp or tensor::DimOp depending on the type of val . More... | |
OpFoldResult | createFoldedDimOp (OpBuilder &b, Location loc, Value val, int64_t dim) |
Create one memref::DimOp or tensor::DimOp depending on the type of val . More... | |
FailureOr< ContractionDimensions > | inferContractionDims (LinalgOp linalgOp) |
Find at least 2 parallel (m and n) and 1 reduction (k) dimension candidates that form a matmul subcomputation within linalgOp . More... | |
FailureOr< ContractionDimensions > | inferContractionDims (ArrayRef< AffineMap > indexingMaps) |
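The m/n/k classification mentioned for inferContractionDims can be pictured with a small standalone model. The sketch below is not the MLIR implementation: it represents each operand's indexing map simply as the set of loop dimensions it uses, ignores iterator types, and the names `ContractionDimsModel` and `classifyContractionDims` are hypothetical.

```cpp
#include <cassert>
#include <optional>
#include <set>
#include <vector>

// Local stand-in for the result; field names follow the m/n/k wording.
struct ContractionDimsModel {
  std::vector<unsigned> batch, m, n, k;
};

// Classifies each loop dimension by which operands index with it:
//   batch: LHS, RHS and output;  m: LHS and output only;
//   n: RHS and output only;      k: LHS and RHS only (reduced away).
// Returns std::nullopt unless at least one m, one n and one k were found.
std::optional<ContractionDimsModel>
classifyContractionDims(unsigned numLoops, const std::set<unsigned> &lhs,
                        const std::set<unsigned> &rhs,
                        const std::set<unsigned> &out) {
  ContractionDimsModel dims;
  for (unsigned d = 0; d < numLoops; ++d) {
    bool inLhs = lhs.count(d) > 0;
    bool inRhs = rhs.count(d) > 0;
    bool inOut = out.count(d) > 0;
    if (inLhs && inRhs && inOut)
      dims.batch.push_back(d);
    else if (inLhs && inOut)
      dims.m.push_back(d);
    else if (inRhs && inOut)
      dims.n.push_back(d);
    else if (inLhs && inRhs)
      dims.k.push_back(d);
  }
  if (dims.m.empty() || dims.n.empty() || dims.k.empty())
    return std::nullopt;
  return dims;
}
```

For a plain matmul with loops (m, n, k) = (0, 1, 2), the operands use dimensions {0, 2}, {2, 1} and {0, 1}, and the model reports m = [0], n = [1], k = [2].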
bool | isaContractionOpInterface (LinalgOp linalgOp) |
Checks whether linalgOp conforms to ContractionOpInterface. More... | |
FailureOr< ConvolutionDimensions > | inferConvolutionDims (LinalgOp linalgOp) |
Find at least 1 parallel (output_image) and reduction (filter_loop) dimension candidates that form a convolution subcomputation within linalgOp . More... | |
bool | isaConvolutionOpInterface (LinalgOp linalgOp, bool allowEmptyConvolvedDims=false) |
Checks whether linalgOp conforms to ConvolutionOpInterface. More... | |
bool | isaCopyOpInterface (LinalgOp linalgOp) |
Checks whether linalgOp is semantically equivalent to a linalg.copyOp . More... | |
bool | isaElemwiseSingleUnaryOpInterface (GenericOp genericOp) |
Checks whether a given genericOp is semantically equivalent to a single linalg elementwise unary op. More... | |
bool | isaElemwiseSingleBinaryOpInterface (GenericOp genericOp) |
Checks whether genericOp is semantically equivalent to a single linalg elementwise binary op e.g. More... | |
std::optional< Value > | isaFillOpInterface (GenericOp genericOp) |
Checks whether genericOp is semantically equivalent to a linalg.fill . More... | |
void | registerValueBoundsOpInterfaceExternalModels (DialectRegistry &registry) |
void | registerTransformDialectExtension (DialectRegistry &registry) |
void | registerAllDialectInterfaceImplementations (DialectRegistry &registry) |
void | registerBufferizableOpInterfaceExternalModels (DialectRegistry &registry) |
void | hoistRedundantVectorTransfers (Operation *root) |
Hoist vector.transfer_read/vector.transfer_write on buffers pairs out of immediately enclosing scf::ForOp iteratively, if the following conditions are true: More... | |
void | hoistRedundantVectorBroadcasts (RewriterBase &rewriter, Operation *root) |
Hoist vector.extract/vector.broadcast pairs out of immediately enclosing scf::ForOp iteratively, if the following conditions are met: More... | |
void | registerMeshShardingInterfaceExternalModels (DialectRegistry &registry) |
void | registerRuntimeVerifiableOpInterfaceExternalModels (DialectRegistry &registry) |
void | registerSubsetOpInterfaceExternalModels (DialectRegistry &registry) |
void | registerTilingInterfaceExternalModels (DialectRegistry &registry) |
std::optional< vector::CombiningKind > | getCombinerOpKind (Operation *combinerOp) |
Return vector::CombiningKind for the given op. More... | |
Value | bufferizeToAllocation (RewriterBase &rewriter, const BufferizeToAllocationOptions &options, tensor::PadOp padOp, Attribute memorySpace={}, Operation *insertionPoint=nullptr) |
Materialize a buffer allocation for the given tensor.pad op and lower the op to linalg.fill/linalg.generic + bufferization.materialize_in_destination. More... | |
Value | bufferizeToAllocation (RewriterBase &rewriter, const BufferizeToAllocationOptions &options, vector::MaskOp maskOp, Attribute memorySpace={}, Operation *insertionPoint=nullptr) |
Materialize a buffer allocation for the given vector.mask op and bufferize the op, including its region. More... | |
Value | bufferizeToAllocation (RewriterBase &rewriter, const BufferizeToAllocationOptions &options, bufferization::AllocTensorOp allocTensorOp, Attribute memorySpace={}, Operation *insertionPoint=nullptr) |
Materialize a buffer allocation for the given bufferization.alloc_tensor op and lower the op to memref.alloc + memref.tensor_store. More... | |
Value | bufferizeToAllocation (RewriterBase &rewriter, const BufferizeToAllocationOptions &options, Operation *op, Attribute memorySpace={}, Operation *insertionPoint=nullptr) |
Bufferize the given op with tensor semantics and materialize the result in a newly allocated buffer. More... | |
LogicalResult | linalgOpAnchoredEmptyTensorEliminationStep (RewriterBase &rewriter, Operation *op, bufferization::OneShotAnalysisState &state) |
Try to eliminate tensor::EmptyOps inside op that are anchored on a LinalgOp. More... | |
bool | areElementwiseOpsFusable (OpOperand *fusedOperand) |
Return true if two linalg.generic operations with producer/consumer relationship through fusedOperand can be fused using elementwise op fusion. More... | |
LogicalResult | promoteSubviewsPrecondition (Operation *op, LinalgPromotionOptions options) |
Promote memref.subviews feeding linalg-on-buffers operations. More... | |
LogicalResult | vectorizeOpPrecondition (Operation *op, ArrayRef< int64_t > inputVectorSizes={}, ArrayRef< bool > inputScalableVecDims={}, bool vectorizeNDExtract=false, bool flatten1DDepthwiseConv=false) |
Return success if the operation can be vectorized. More... | |
FailureOr< DropUnitDimsResult > | dropUnitDims (RewriterBase &rewriter, GenericOp genericOp, const ControlDropUnitDims &options) |
FailureOr< ElementwiseOpFusionResult > | fuseElementwiseOps (RewriterBase &rewriter, OpOperand *fusedOperand) |
llvm::SmallDenseSet< int > | getPreservedProducerResults (GenericOp producer, GenericOp consumer, OpOperand *fusedOperand) |
Returns a set of indices of the producer's results which would be preserved after the fusion. More... | |
SmallVector< Value > | peelLoop (RewriterBase &rewriter, Operation *op) |
Try to peel and canonicalize loop op and return the new result. More... | |
void | peelLoops (RewriterBase &rewriter, ArrayRef< scf::ForOp > loops) |
Peel 'loops' and apply affine_min/max bounds simplification on the fly where relevant. More... | |
LogicalResult | rewriteAsPaddedOp (RewriterBase &rewriter, LinalgOp opToPad, const LinalgPaddingOptions &options, LinalgOp &paddedOp, SmallVector< Value > &replacements, SmallVector< tensor::PadOp > &padOps) |
Pad the iterator dimensions paddingDimensions of all opToPad operands to a static bounding box. More... | |
FailureOr< Value > | hoistPaddingOnTensors (RewriterBase &rewriter, tensor::PadOp opToHoist, int64_t numLoops, ArrayRef< int64_t > transposeVector, tensor::PadOp &hoistedOp, SmallVectorImpl< GenericOp > &transposeOps) |
Mechanically hoist padding operations on tensors by numLoops into a new, generally larger tensor. More... | |
FailureOr< Value > | hoistPaddingOnTensors (tensor::PadOp opToHoist, int64_t numLoops, ArrayRef< int64_t > transposeVector, tensor::PadOp &hoistedOp, SmallVectorImpl< GenericOp > &transposeOps) |
Calls into hoistPaddingOnTensors with a local IRRewriter. More... | |
FailureOr< LinalgOp > | padAndHoistLinalgOp (RewriterBase &rewriter, LinalgOp linalgOp, const LinalgPaddingOptions &options) |
Apply padding and hoisting to linalgOp according to the configuration specified in options . More... | |
std::pair< TilingInterface, TilingInterface > | splitOp (RewriterBase &rewriter, TilingInterface op, unsigned dimension, OpFoldResult splitPoint) |
Split the given op into two parts along the given iteration space dimension at the specified splitPoint , and return the two parts. More... | |
FailureOr< TiledLinalgOp > | tileLinalgOp (RewriterBase &b, LinalgOp op, const LinalgTilingOptions &options) |
FailureOr< GenericOp > | interchangeGenericOp (RewriterBase &rewriter, GenericOp genericOp, ArrayRef< unsigned > interchangeVector) |
Interchange the iterator_types and iterator_maps dimensions and adapt the index accesses of op. More... | |
FailureOr< GenericOp > | generalizeNamedOp (RewriterBase &rewriter, LinalgOp namedOp) |
Create a GenericOp from the given named operation namedOp and replace namedOp. More... | |
FailureOr< LinalgOp > | specializeGenericOp (RewriterBase &rewriter, GenericOp genericOp) |
Create a namedOp from the given GenericOp and replace the GenericOp. More... | |
FailureOr< PromotionInfo > | promoteSubviewAsNewBuffer (OpBuilder &b, Location loc, memref::SubViewOp subView, const AllocBufferCallbackFn &allocationFn, DataLayout &layout) |
FailureOr< LinalgOp > | promoteSubViews (OpBuilder &b, LinalgOp op, const LinalgPromotionOptions &options) |
Promote the subViews into a new buffer allocated at the insertion point b . More... | |
std::optional< Value > | allocateWorkgroupMemory (OpBuilder &builder, memref::SubViewOp subview, ArrayRef< Value > sizeBounds, DataLayout &) |
Allocate the subview in the GPU workgroup memory. More... | |
LogicalResult | deallocateWorkgroupMemory (OpBuilder &, Value) |
In case of GPU group memory there is no need to deallocate. More... | |
LogicalResult | copyToWorkgroupMemory (OpBuilder &b, Value src, Value dst) |
Create Memref copy operations and add gpu barrier guards before and after the copy operation to ensure data integrity. More... | |
std::optional< Value > | allocateGPUPrivateMemory (OpBuilder &builder, memref::SubViewOp subview, ArrayRef< Value > sizeBounds, DataLayout &) |
Allocate the subview in the GPU private memory. More... | |
LogicalResult | copyToGPUPrivateMemory (OpBuilder &b, Value src, Value dst) |
Normal copy between src and dst. More... | |
LogicalResult | deallocateGPUPrivateMemory (OpBuilder &, Value) |
In case of GPU private memory there is no need to deallocate since the memory is freed when going outside of the scope. More... | |
LogicalResult | vectorize (RewriterBase &rewriter, Operation *op, ArrayRef< int64_t > inputVectorSizes={}, ArrayRef< bool > inputScalableVecDims={}, bool vectorizeNDExtract=false, bool flatten1DDepthwiseConv=false) |
Emit a suitable vector form for an operation. More... | |
LogicalResult | vectorizeCopy (RewriterBase &builder, memref::CopyOp copyOp) |
Emit a suitable vector form for a Copy op with fully static shape. More... | |
FailureOr< LinalgLoops > | linalgOpToLoops (RewriterBase &rewriter, LinalgOp linalgOp) |
Emit a loop nest of scf.for with the proper body for linalgOp . More... | |
FailureOr< LinalgLoops > | linalgOpToParallelLoops (RewriterBase &rewriter, LinalgOp linalgOp) |
Emit a loop nest of scf.parallel with the proper body for linalgOp . More... | |
FailureOr< LinalgLoops > | linalgOpToAffineLoops (RewriterBase &rewriter, LinalgOp linalgOp) |
Emit a loop nest of affine.for with the proper body for linalgOp . More... | |
std::tuple< SmallVector< Range, 4 >, LoopIndexToRangeIndexMap > | makeTiledLoopRanges (RewriterBase &b, Location loc, AffineMap map, ArrayRef< OpFoldResult > allShapeSizes, ArrayRef< OpFoldResult > allTileSizes) |
FailureOr< MultiSizeSpecification > | computeMultiTileSizes (OpBuilder &builder, LinalgOp op, unsigned dimension, OpFoldResult targetSize, OpFoldResult divisor, bool emitAssertions=true) |
Emits the IR computing the multi-sized tiling specification with two tile sizes not exceeding targetSize, each divisible by divisor, such that there exist numbers of tiles with these sizes that fully cover the given iteration space dimension of the structured op. More... | |
FailureOr< StaticMultiSizeSpecification > | computeStaticMultiTileSizes (LinalgOp op, unsigned dimension, int64_t targetSize, int64_t divisor) |
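To make the multi-size constraints concrete, here is one arithmetic scheme that satisfies them for static sizes: two tile sizes that differ by `divisor`, both multiples of `divisor`, with tile counts that cover the dimension exactly. It is a plain C++ sketch under stated assumptions (the trip count is a multiple of `divisor`, and `targetSize` is at least `divisor`); whether it matches the library's exact computation is an assumption, and the names `MultiSizeModel` and `multiTileSizes` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-in for a static multi-size specification.
struct MultiSizeModel {
  int64_t lowTileSize, highTileSize;
  int64_t lowTripCount, highTripCount;
};

MultiSizeModel multiTileSizes(int64_t tripCount, int64_t targetSize,
                              int64_t divisor) {
  assert(tripCount > 0 && divisor > 0 && targetSize >= divisor);
  assert(tripCount % divisor == 0 && "sketch assumes an exact multiple");
  int64_t a = tripCount / divisor;  // dimension size, in units of divisor
  int64_t t = targetSize / divisor; // largest allowed tile, in units
  int64_t d = (a + t - 1) / t;      // total number of tiles
  int64_t s = (a / d) * divisor;    // low tile size
  int64_t v = a % d;                // number of tiles of the high size
  int64_t u = d - v;                // number of tiles of the low size
  // If v is zero, the high size is unused and may exceed targetSize.
  return {s, s + divisor, u, v};
}
```

For example, a dimension of 15 with target size 8 and divisor 1 yields one tile of size 7 and one of size 8; in general `lowTileSize * lowTripCount + highTileSize * highTripCount` equals the trip count.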
FailureOr< StaticContinuousTileSizeSpecification > | computeStaticContinuousTileSizes (LinalgOp op, unsigned dimension, unsigned targetSize) |
FailureOr< ContinuousTileSizeSpecification > | computeContinuousTileSizes (OpBuilder &builder, TilingInterface op, unsigned dimension, OpFoldResult targetSize, bool emitAssertions) |
FailureOr< ForallReductionTilingResult > | tileReductionUsingForall (RewriterBase &b, PartialReductionOpInterface op, ArrayRef< OpFoldResult > numThreads, ArrayRef< OpFoldResult > tileSizes={}, std::optional< ArrayAttr > mapping=std::nullopt) |
Method to tile a reduction to parallel iterations computing partial reductions. More... | |
void | transformIndexOps (RewriterBase &b, LinalgOp op, SmallVectorImpl< Value > &ivs, const LoopIndexToRangeIndexMap &loopIndexToRangeIndex) |
All indices returned by IndexOp should be invariant with respect to tiling. More... | |
FailureOr< SplitReductionResult > | splitReduction (RewriterBase &b, LinalgOp op, const ControlSplitReductionFn &controlSplitReductionFn, bool useAlloc=false) |
FailureOr< SplitReductionResult > | splitReductionByScaling (RewriterBase &b, LinalgOp op, const ControlSplitReductionFn &controlSplitReductionFn, bool useAlloc=false) |
Scaling-based implementation of the split reduction transformation. More... | |
bool | isDimSequencePreserved (AffineMap map, ReassociationIndicesRef dimSequence) |
Return true if a given sequence of dimensions are contiguous in the range of the specified indexing map. More... | |
bool | areDimSequencesPreserved (ArrayRef< AffineMap > maps, ArrayRef< ReassociationIndices > dimSequences) |
Return true if all sequences of dimensions specified in dimSequences are contiguous in all the ranges of the maps . More... | |
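The contiguity check described for isDimSequencePreserved and areDimSequencesPreserved can be sketched without MLIR types. The model below assumes each indexing map is a projected permutation and represents it as the ordered list of loop dimensions in its results; the `Model` suffix marks these helpers as hypothetical, and treating a sequence that is entirely absent from a map as preserved is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// A projected-permutation indexing map, modelled as the loop dimensions
// that appear in its results, in order (each at most once).
using MapResults = std::vector<unsigned>;

bool isDimSequencePreservedModel(const MapResults &results,
                                 const std::vector<unsigned> &dimSequence) {
  assert(!dimSequence.empty());
  std::set<unsigned> members(dimSequence.begin(), dimSequence.end());
  for (std::size_t i = 0; i < results.size(); ++i) {
    if (results[i] == dimSequence.front()) {
      // The sequence starts here: the following results must match exactly.
      if (i + dimSequence.size() > results.size())
        return false;
      for (std::size_t j = 0; j < dimSequence.size(); ++j)
        if (results[i + j] != dimSequence[j])
          return false;
      return true;
    }
    // A later member appearing before the start means the sequence is broken.
    if (members.count(results[i]) > 0)
      return false;
  }
  // No member of the sequence appears in this map at all.
  return true;
}

bool areDimSequencesPreservedModel(
    const std::vector<MapResults> &maps,
    const std::vector<std::vector<unsigned>> &dimSequences) {
  for (const MapResults &results : maps)
    for (const std::vector<unsigned> &seq : dimSequences)
      if (!isDimSequencePreservedModel(results, seq))
        return false;
  return true;
}
```

With results (d0, d1, d2) the sequence [1, 2] is preserved, whereas with results (d0, d2, d1) it is not, because d2 appears before d1.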
FailureOr< CollapseResult > | collapseOpIterationDims (LinalgOp op, ArrayRef< ReassociationIndices > foldedIterationDims, RewriterBase &rewriter) |
Collapses dimensions of linalg.generic/linalg.copy operation. More... | |
FailureOr< LowerPackResult > | lowerPack (RewriterBase &rewriter, tensor::PackOp packOp) |
Rewrite pack as pad + reshape + transpose. More... | |
FailureOr< LowerUnPackOpResult > | lowerUnPack (RewriterBase &rewriter, tensor::UnPackOp unPackOp) |
Rewrite unpack as empty + transpose + reshape + extract_slice. More... | |
FailureOr< PackResult > | pack (RewriterBase &rewriter, linalg::LinalgOp linalgOp, ArrayRef< OpFoldResult > packedSizes) |
Implement packing of a single LinalgOp by packedSizes . More... | |
FailureOr< PackTransposeResult > | packTranspose (RewriterBase &rewriter, tensor::PackOp packOp, linalg::LinalgOp linalgOp, tensor::UnPackOp maybeUnPackOp, ArrayRef< int64_t > outerPerm, ArrayRef< int64_t > innerPerm) |
Transpose a single PackOp -> LinalgOp -> UnPackOp chain and return the transposed PackOp -> LinalgOp -> UnPackOp chain after replacements. More... | |
FailureOr< PackResult > | packMatmulGreedily (RewriterBase &rewriter, LinalgOp linalgOp, ArrayRef< OpFoldResult > mnkPackedSizes, ArrayRef< int64_t > mnkPaddedSizesNextMultipleOf, ArrayRef< int64_t > mnkOrder) |
Pack a LinalgOp by greedily inferring matmul dimensions (m, n, k) where m and n are proper parallel dimensions and k is a proper reduction dimension. More... | |
FailureOr< PackResult > | blockPackMatmul (RewriterBase &rewriter, linalg::LinalgOp linalgOp, const ControlBlockPackMatmulFn &controlPackMatmul) |
Pack a matmul operation into blocked 4D layout. More... | |
FailureOr< Operation * > | rewriteInDestinationPassingStyle (RewriterBase &rewriter, tensor::FromElementsOp fromElementsOp) |
Rewrite tensor.from_elements to linalg.generic. More... | |
FailureOr< Operation * > | rewriteInDestinationPassingStyle (RewriterBase &rewriter, tensor::GenerateOp generateOp) |
Rewrite tensor.generate to linalg.generic. More... | |
FailureOr< Operation * > | rewriteInDestinationPassingStyle (RewriterBase &rewriter, tensor::PadOp padOp) |
Rewrite tensor.pad to linalg.generic + tensor.insert_slice. More... | |
FailureOr< std::pair< Operation *, Operation * > > | rewriteInIm2Col (RewriterBase &rewriter, linalg::Conv2DNhwcHwcfOp convOp) |
Convert linalg.conv_2d_nhwc_hwcf into linalg.generic (for img2col packing) and linalg.matmul. More... | |
FailureOr< std::pair< Operation *, Operation * > > | rewriteInIm2Col (RewriterBase &rewriter, linalg::Conv2DNhwcFhwcOp convOp) |
Same as the above but for Fhwc channel orderings in the filter. More... | |
FailureOr< std::pair< Operation *, Operation * > > | rewriteInIm2Col (RewriterBase &rewriter, linalg::DepthwiseConv2DNhwcHwcOp convOp) |
Similar to rewriteInIm2Col with linalg::Conv2DNhwcHwcfOp, except that there is no reduction among the input channels, so each convolution can be a matrix-vector product; by transposing both input and filter so that channels are outermost, the computation becomes a batched matrix-vector product. More... | |
FailureOr< std::pair< Operation *, Operation * > > | rewriteInIm2Col (RewriterBase &rewriter, linalg::Conv2DNchwFchwOp convOp) |
Similar to rewriteInIm2Col with linalg::Conv2DNhwcHwcfOp except because the channels are to the left of the image shape dimensions, the position of the contraction dimension in the resulting matmul is reversed. More... | |
FailureOr< Operation * > | transposeConv2D (RewriterBase &rewriter, linalg::Conv2DNhwcFhwcOp op) |
Convert linalg.conv_2d_nhwc_fhwc(_q) to linalg.conv_2d_nhwc_hwcf(_q) by materializing transpose. More... | |
FailureOr< Operation * > | transposeConv2D (RewriterBase &rewriter, linalg::Conv2DNhwcFhwcQOp op) |
FailureOr< Operation * > | transposeMatmul (RewriterBase &rewriter, linalg::MatmulOp op, bool transposeLHS=true) |
Convert Linalg matmul ops to transposed variants. More... | |
FailureOr< Operation * > | transposeBatchMatmul (RewriterBase &rewriter, linalg::BatchMatmulOp op, bool transposeLHS=true) |
Pattern to replace. More... | |
FailureOr< Operation * > | winogradConv2D (RewriterBase &rewriter, linalg::Conv2DNhwcFhwcOp op, int64_t m, int64_t r) |
Convert linalg.conv_2d_nhwc_fhwc to Winograd Conv2D algorithm F(m x m, r x r). More... | |
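For readers unfamiliar with the F(m x m, r x r) notation, the following sketch works out the standard Winograd minimal-filtering arithmetic for a given output tile size m and filter size r. It is textbook arithmetic in plain C++ rather than anything taken from the MLIR implementation; the names `WinogradCost` and `winogradCost` are hypothetical, and the counts are per output tile and per input/output channel pair, ignoring the cost of the transforms themselves.

```cpp
#include <cassert>
#include <cstdint>

struct WinogradCost {
  int64_t inputTile;    // side length of the input tile: m + r - 1
  int64_t winogradMuls; // element-wise multiplications per output tile
  int64_t directMuls;   // multiplications for direct convolution
};

// Computes tile size and multiplication counts for F(m x m, r x r).
WinogradCost winogradCost(int64_t m, int64_t r) {
  assert(m > 0 && r > 0);
  int64_t alpha = m + r - 1;
  return {alpha, alpha * alpha, m * m * r * r};
}
```

For F(2 x 2, 3 x 3) this gives a 4 x 4 input tile and 16 multiplications instead of 36; for F(4 x 4, 3 x 3) it gives a 6 x 6 tile and 36 multiplications instead of 144.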
FailureOr< Operation * > | decomposeWinogradFilterTransformOp (RewriterBase &rewriter, linalg::WinogradFilterTransformOp op) |
Rewrite linalg.winograd_filter_transform. More... | |
FailureOr< Operation * > | decomposeWinogradInputTransformOp (RewriterBase &rewriter, linalg::WinogradInputTransformOp op) |
Rewrite linalg.winograd_input_transform. More... | |
FailureOr< Operation * > | decomposeWinogradOutputTransformOp (RewriterBase &rewriter, linalg::WinogradOutputTransformOp op) |
Rewrite linalg.winograd_output_transform. More... | |
RewritePatternSet | getLinalgTilingCanonicalizationPatterns (MLIRContext *ctx) |
Canonicalization patterns relevant to apply after tiling patterns. More... | |
void | populateLinalgTilingCanonicalizationPatterns (RewritePatternSet &patterns) |
void | populateLinalgNamedOpsGeneralizationPatterns (RewritePatternSet &patterns) |
Linalg generalization patterns. More... | |
void | populateLinalgGenericOpsSpecializationPatterns (RewritePatternSet &patterns) |
Populates patterns with patterns to convert linalg.generic ops to named ops where possible. More... | |
void | populateDecomposeConvolutionPatterns (RewritePatternSet &patterns, PatternBenefit benefit=1) |
Linalg decompose convolutions patterns. More... | |
void | populateConvertConv2DToImg2ColPatterns (RewritePatternSet &patterns) |
Populates patterns to transform linalg.conv_2d_xxx operations into linalg.generic (for img2col packing) and linalg.matmul. More... | |
void | populatePadOpVectorizationPatterns (RewritePatternSet &patterns, PatternBenefit baseBenefit=1) |
Populates patterns with patterns that vectorize tensor.pad. More... | |
void | populateDecomposeLinalgOpsPattern (RewritePatternSet &patterns, bool removeDeadArgsAndResults=true) |
Populate patterns for splitting a LinalgOp with multiple statements within its payload into multiple GenericOp that have a single statement. More... | |
void | populateConvertToDestinationStylePatterns (RewritePatternSet &patterns) |
Populate patterns that convert non-destination-style ops to destination style ops. More... | |
void | populateConvolutionVectorizationPatterns (RewritePatternSet &patterns, PatternBenefit benefit=1) |
Populate patterns for vectorizing low-D convolution ops. More... | |
void | populateElementwiseToLinalgConversionPatterns (RewritePatternSet &patterns) |
Populate patterns that convert ElementwiseMappable ops to linalg parallel loops. More... | |
void | populateSparseTensorRewriting (RewritePatternSet &patterns) |
Populate patterns that are only useful in the context of sparse tensors. More... | |
void | populateElementwiseOpsFusionPatterns (RewritePatternSet &patterns, const ControlFusionFn &controlElementwiseOpFusion) |
Patterns for fusing linalg operations on tensors. More... | |
void | populateDataLayoutPropagationPatterns (RewritePatternSet &patterns, const ControlPropagationFn &controlPackUnPackPropagation) |
Patterns to bubble up or down data layout ops across other operations. More... | |
void | populateEraseUnusedOperandsAndResultsPatterns (RewritePatternSet &patterns) |
Pattern to remove dead operands and results of linalg.generic operations. More... | |
void | populateEraseUnnecessaryInputsPatterns (RewritePatternSet &patterns) |
Patterns to promote inputs to outputs and remove unused inputs of linalg.generic ops. More... | |
void | populateCollapseDimensions (RewritePatternSet &patterns, const GetCollapsableDimensionsFn &controlCollapseDimensions) |
Pattern to collapse dimensions in a linalg.generic op. More... | |
void | populateFoldReshapeOpsByExpansionPatterns (RewritePatternSet &patterns, const ControlFusionFn &controlFoldingReshapes) |
Patterns to fold an expanding (collapsing) tensor_reshape operation with its producer (consumer) generic operation by expanding the dimensionality of the loop in the generic op. More... | |
void | populateFoldReshapeOpsByCollapsingPatterns (RewritePatternSet &patterns, const ControlFusionFn &controlFoldingReshapes) |
Patterns to fold an expanding tensor.expand_shape operation with its producer generic operation by collapsing the dimensions of the generic op. More... | |
void | populateConstantFoldLinalgOperations (RewritePatternSet &patterns, const ControlFusionFn &controlFn) |
Patterns to constant fold Linalg operations. More... | |
void | populateFuseTensorPadWithProducerLinalgOpPatterns (RewritePatternSet &patterns) |
Pattern to fuse a tensor.pad operation with the producer of its source, if the producer is a linalg operation with all parallel iterator types. More... | |
void | populateLinalgNamedOpConversionPatterns (RewritePatternSet &patterns) |
Patterns to convert from one named op to another. More... | |
void | populateFoldUnitExtentDimsPatterns (RewritePatternSet &patterns, ControlDropUnitDims &options) |
Patterns to fold unit-extent dimensions in operands/results of linalg ops on tensors via reassociative reshape ops. More... | |
void | populateMoveInitOperandsToInputPattern (RewritePatternSet &patterns) |
A pattern that converts init operands to input operands. More... | |
void | populateInlineConstantOperandsPatterns (RewritePatternSet &patterns) |
Patterns that are used to inline constant operands into linalg generic ops. More... | |
void | populateBubbleUpExtractSliceOpPatterns (RewritePatternSet &patterns) |
Patterns that are used to bubble up extract slice op above linalg op. More... | |
void | populateSwapExtractSliceWithFillPatterns (RewritePatternSet &patterns) |
Adds patterns that swap tensor.extract_slice(linalg.fill(cst, init)) into linalg.fill(cst, tensor.extract_slice(init)). More... | |
void | populateSplitReductionPattern (RewritePatternSet &patterns, const ControlSplitReductionFn &controlSplitReductionFn, bool useAlloc=false) |
Patterns to apply splitReduction below. More... | |
void | populateTransposeMatmulPatterns (RewritePatternSet &patterns, bool transposeLHS=true) |
Patterns to convert Linalg matmul ops to transposed variants. More... | |
void | populateBlockPackMatmulPatterns (RewritePatternSet &patterns, const ControlBlockPackMatmulFn &controlFn) |
Patterns to block pack Linalg matmul ops. More... | |
void | populateWinogradConv2DPatterns (RewritePatternSet &patterns, int64_t m, int64_t r) |
Patterns to apply Winograd Conv2D algorithm F(m x m, r x r). More... | |
void | populateDecomposeWinogradOpsPatterns (RewritePatternSet &patterns) |
Patterns to decompose Winograd operators. More... | |
void | populateContractionOpRankReducingPatterns (RewritePatternSet &patterns) |
Adds patterns that reduce the rank of named contraction ops that have unit dimensions in the operand(s) by converting to a sequence of collapse_shape , <corresponding linalg named op> , expand_shape (if on tensors). More... | |
bool | allIndexingsAreProjectedPermutation (LinalgOp op) |
Check if all indexing maps are projected permutations. More... | |
bool | hasOnlyScalarElementwiseOp (Region &r) |
Detect whether r has only ConstantOp, ElementwiseMappable and YieldOp. More... | |
bool | isElementwise (LinalgOp op) |
Check if a LinalgOp is an element-wise operation. More... | |
bool | isParallelIterator (utils::IteratorType iteratorType) |
Check if iterator type has "parallel" semantics. More... | |
bool | isReductionIterator (utils::IteratorType iteratorType) |
Check if iterator type has "reduction" semantics. More... | |
Value | makeComposedPadHighOp (OpBuilder &b, Location loc, RankedTensorType type, Value source, Value pad, bool nofold) |
Create a tensor::PadOp that pads source to the size of the statically sized type whose static sizes are assumed to be greater than the dynamic source size. More... | |
GenericOp | makeTransposeOp (OpBuilder &b, Location loc, Value inputTensor, Value outputTensor, ArrayRef< int64_t > transposeVector) |
Returns a GenericOp that transposes inputTensor into outputTensor using transposeVector to permute the inputTensor dimensions. More... | |
GenericOp | makeMemRefCopyOp (OpBuilder &b, Location loc, Value from, Value to) |
Returns GenericOp that copies an n-D memref. More... | |
std::optional< SmallVector< ReassociationIndices > > | getReassociationMapForFoldingUnitDims (ArrayRef< OpFoldResult > mixedSizes) |
Get the reassociation maps to fold the result of an extract_slice (or source of an insert_slice) operation with the given offsets and sizes to its rank-reduced version. More... | |
SmallVector< OpFoldResult > | computeTileOffsets (OpBuilder &b, Location loc, ArrayRef< OpFoldResult > ivs, ArrayRef< OpFoldResult > tileSizes) |
Computes tile offsets, given a list of loop ivs and tileSizes . More... | |
SmallVector< OpFoldResult > | computeTileSizes (OpBuilder &b, Location loc, ArrayRef< OpFoldResult > tileSizes, ArrayRef< OpFoldResult > sizeBounds) |
Computes tile sizes, given a list of tileSizes and dimension sizes (sizeBounds ). More... | |
SmallVector< Type > | getTensorOutputTypes (LinalgOp op, ValueRange operands) |
Returns the list of tensor output types produced when the given structured operation op is applied to the given operands . More... | |
SmallVector< Value > | insertSlicesBack (OpBuilder &builder, Location loc, LinalgOp op, ValueRange operands, ValueRange results) |
Creates insert_slice ops that insert results back into larger tensors they were originally extracted from with extract_slice before being passed as operands to the given structured operation op or its clone. More... | |
SliceParameters | computeSliceParameters (OpBuilder &builder, Location loc, Value valueToTile, ArrayRef< OpFoldResult > tileSizes, AffineMap map, ArrayRef< OpFoldResult > lbs, ArrayRef< OpFoldResult > ubs, ArrayRef< OpFoldResult > subShapeSizes, bool omitPartialTileCheck) |
Computes SliceParameters for a single valueToTile assuming that its user is being tiled with the given loop bounds lbs and ubs and the tile sizes tileSizes . More... | |
SmallVector< std::optional< SliceParameters > > | computeAllSliceParameters (OpBuilder &builder, Location loc, LinalgOp linalgOp, ValueRange valuesToTile, ArrayRef< OpFoldResult > ivs, ArrayRef< OpFoldResult > tileSizes, ArrayRef< OpFoldResult > sizeBounds, bool omitPartialTileCheck) |
Computes SliceParameters for all valuesToTile of the given linalgOp , assuming linalgOp is being fused into a loop nest. More... | |
Operation * | makeTiledShape (OpBuilder &builder, Location loc, Value valueToTile, ArrayRef< OpFoldResult > tileSizes, AffineMap map, ArrayRef< OpFoldResult > lbs, ArrayRef< OpFoldResult > ubs, ArrayRef< OpFoldResult > subShapeSizes, bool omitPartialTileCheck) |
Creates an extract_slice/subview op for a single valueToTile with builder . More... | |
SmallVector< Value > | makeTiledShapes (OpBuilder &builder, Location loc, LinalgOp linalgOp, ValueRange valuesToTile, ArrayRef< OpFoldResult > ivs, ArrayRef< OpFoldResult > tileSizes, ArrayRef< OpFoldResult > sizeBounds, bool omitPartialTileCheck) |
Creates extract_slice/subview ops for all valuesToTile of the given linalgOp with builder , assuming linalgOp is being fused into a loop nest for tiling with the given induction variables ivs and tile sizes tileSizes . More... | |
void | offsetIndices (OpBuilder &b, LinalgOp linalgOp, ArrayRef< OpFoldResult > offests) |
Add the specified offsets to any linalg.index ops contained in the given linalgOp . More... | |
void | offsetIndices (RewriterBase &b, LinalgOp linalgOp, ArrayRef< OpFoldResult > offests) |
FailureOr< FusionInfo > | fuseProducerOfTensor (OpBuilder &b, OpOperand &consumerOpOperand) |
This implements the fusion part of the "tileAndFuse on tensors" transformation and thus requires the consumerOpOperand to be an extract_slice op (generally obtained by applying the tiling transformation). More... | |
FailureOr< FusionInfo > | fuseProducerOfTensor (OpBuilder &b, OpResult producerOpResult, OpOperand &consumerOpOperand) |
This implements the fusion part of the "tileAndFuse on tensors" transformation and thus requires the consumerOpOperand to be an extract_slice op (generally obtained by applying the tiling transformation). More... | |
void | updateBoundsForCyclicDistribution (OpBuilder &builder, Location loc, Value procId, Value nprocs, Value &lb, Value &ub, Value &step) |
Update the lb , ub and step to get per processor lb , ub and step . More... | |
template<typename OpTy > | |
SmallVector< NamedAttribute > | getPrunedAttributeList (OpTy op) |
Returns an attribute list that excludes pre-defined attributes. More... | |
static bool | hasAllOneValues (DenseIntElementsAttr attr) |
static Value | createAdd (Location loc, Value x, Value y, OpBuilder &builder) |
static Value | createMul (Location loc, Value x, Value y, Type accType, OpBuilder &builder) |
static SmallVector< Value > | unrollIndex (OpBuilder &b, Location loc, Value index, ArrayRef< int64_t > factors) |
static Value | getConvolvedIndex (OpBuilder &b, Location loc, Value oIndex, Value fIndex, int64_t stride) |
static ReductionKind | getReductionKind (Operation *op) |
static std::optional< Operation * > | getCombinerOp (LinalgOp op) |
static ReductionKind | getReductionKindOfLinalgOp (LinalgOp op) |
static MeshOp | getMesh (Operation *op, ArrayRef< MeshSharding > operandShardings, ArrayRef< MeshSharding > resultShardings, SymbolTableCollection &symbolTable) |
static Value | createDestinationPassingStyleInitOperand (LinalgOp op, Value spmdizedOperand, ArrayRef< MeshAxis > reductionMeshAxes, MeshOp meshOp, ImplicitLocOpBuilder &builder) |
static SmallVector< Value > | createDestinationPassingStyleInitOperands (LinalgOp op, MeshOp meshOp, ArrayRef< Value > spmdizedOperands, ArrayRef< MeshAxis > reductionMeshAxes, IRMapping &spmdizationMap, ImplicitLocOpBuilder &builder) |
static void | createAllReduceForResultWithoutPartialSharding (Value unshardedLinalgOpResult, ArrayRef< MeshAxis > opReductionMeshAxes, MeshSharding resultSharding, ReductionKind reductionKind, IRMapping &spmdizationMap, ImplicitLocOpBuilder &builder) |
static void | createAllReduceForResultsWithoutPartialShardings (LinalgOp unshardedOp, ArrayRef< MeshAxis > opReductionMeshAxes, ArrayRef< MeshSharding > resultShardings, IRMapping &spmdizationMap, ImplicitLocOpBuilder &builder) |
static void | spmdizeLinalgOpWithShardedReduction (LinalgOp op, ArrayRef< Value > spmdizedOperands, ArrayRef< MeshSharding > operandShardings, ArrayRef< MeshSharding > resultShardings, ArrayRef< utils::IteratorType > loopIteratorTypes, ArrayRef< SmallVector< MeshAxis >> meshAxisAssignmentForLoopIterators, IRMapping &spmdizationMap, SymbolTableCollection &symbolTable, ImplicitLocOpBuilder &builder) |
template<typename OpType > | |
static void | registerOne (MLIRContext *ctx) |
template<typename... OpTypes> | |
static void | registerAll (MLIRContext *ctx) |
Variadic helper function. More... | |
void | populateTranposeConv2DPatterns (RewritePatternSet &patterns) |
static void | generateParallelLoopNest (OpBuilder &b, Location loc, ValueRange lbs, ValueRange ubs, ValueRange steps, ArrayRef< utils::IteratorType > iteratorTypes, ArrayRef< linalg::ProcInfo > procInfo, function_ref< void(OpBuilder &, Location, ValueRange)> bodyBuilderFn, SmallVectorImpl< Value > &ivStorage) |
Generates a loop nest consisting of scf.parallel and scf.for, depending on the iteratorTypes. More... | |
static Operation * | materializeTiledShape (OpBuilder &builder, Location loc, Value valueToTile, const SliceParameters &sliceParams) |
using mlir::linalg::AllocBufferCallbackFn = typedef std::function<std::optional<Value>( OpBuilder &b, memref::SubViewOp subView, ArrayRef<Value> boundingSubViewSize, DataLayout &layout)> |
Callback function type used to perform the allocation for the promoted subView.
In boundingSubViewSize, a best attempt is made to find the smallest constant value for the size of the buffer needed for each dimension. If that is not possible, it contains the dynamic size of the subview. The callback should return the buffer to use.
Definition at line 337 of file Transforms.h.
using mlir::linalg::ControlBlockPackMatmulFn = typedef std::function<std::optional<BlockPackMatmulOptions>(linalg::LinalgOp)> |
Function type which is used to control matmul packing.
It is expected to return valid packing configuration for each operation. Lack of packing options indicates that no valid configuration could be assigned and the operation will not be packed.
Definition at line 1206 of file Transforms.h.
using mlir::linalg::ControlFusionFn = typedef std::function<bool(OpOperand *fusedOperand)> |
Function type which is used to control when to stop fusion.
It is expected that OpOperand is not modified in the callback. The OpOperand is not marked as const to allow callers to use non-const methods.
Definition at line 1696 of file Transforms.h.
using mlir::linalg::ControlPropagationFn = typedef std::function<bool(OpOperand *opOperand)> |
Function type which is used to control propagation of tensor.pack/unpack ops.
Definition at line 1708 of file Transforms.h.
using mlir::linalg::ControlSplitReductionFn = typedef std::function<SplitReductionOptions(LinalgOp op)> |
Function signature to control reduction splitting.
This returns SplitReductionOptions.
Definition at line 441 of file Transforms.h.
using mlir::linalg::CopyCallbackFn = typedef std::function<LogicalResult(OpBuilder &b, Value src, Value dst)> |
Callback function type used to insert a copy from the original subview to the subview of the promoted region for read operands, and from the subview of the promoted region back to the original subview for results.
The copy has to happen from src to dst.
Definition at line 350 of file Transforms.h.
using mlir::linalg::DeallocBufferCallbackFn = typedef std::function<LogicalResult(OpBuilder &b, Value buffer)> |
Callback function type used to deallocate the buffers used to hold the promoted subview.
Definition at line 343 of file Transforms.h.
using mlir::linalg::GetCollapsableDimensionsFn = typedef std::function<SmallVector<ReassociationIndices>(linalg::LinalgOp)> |
Function type to control generic op dimension collapsing.
It is expected to return an array of ReassociationIndices representing dimensions that should be merged.
Definition at line 1726 of file Transforms.h.
using mlir::linalg::LinalgLoops = typedef SmallVector<Operation *, 4> |
Definition at line 469 of file Transforms.h.
using mlir::linalg::LoopIndexToRangeIndexMap = typedef DenseMap<int, int> |
Creates a number of ranges equal to the number of non-zero entries in tileSizes.
One for each loop of the LinalgOp that is tiled. The tileSizes argument has one entry per surrounding loop. It uses zero as the convention that a particular loop is not tiled. This convention simplifies implementations by avoiding affine map manipulations. The returned ranges correspond to the loop ranges, in the proper order, that are tiled and for which new loops will be created. The function also returns a map from loop indices of the LinalgOp to the corresponding non-empty range indices of newly created loops.
Definition at line 801 of file Transforms.h.
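The zero-means-untiled convention and the resulting loop-to-range map can be illustrated with a small standalone sketch. It uses plain integers and std::map instead of MLIR types; the names LoopIndexToRangeIndexMapModel and mapTiledLoops are hypothetical and are not part of the MLIR API.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for LoopIndexToRangeIndexMap (a DenseMap<int, int>).
using LoopIndexToRangeIndexMapModel = std::map<int, int>;

// For every loop with a non-zero tile size, record the position it takes
// among the tiled loops. Loops with tile size 0 are untiled and get no entry.
LoopIndexToRangeIndexMapModel
mapTiledLoops(const std::vector<int64_t> &tileSizes) {
  LoopIndexToRangeIndexMapModel loopToRange;
  int rangeIndex = 0;
  for (int loop = 0; loop < static_cast<int>(tileSizes.size()); ++loop) {
    assert(tileSizes[loop] >= 0 && "tile sizes are expected to be non-negative");
    if (tileSizes[loop] == 0)
      continue; // zero means "do not tile this loop"
    loopToRange[loop] = rangeIndex++;
  }
  return loopToRange;
}
```

With tile sizes {8, 0, 4}, loops 0 and 2 are tiled and map to range indices 0 and 1, while loop 1 has no entry.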
using mlir::linalg::MeshAxis = typedef mesh::MeshAxis |
Definition at line 44 of file MeshShardingInterfaceImpl.cpp.
using mlir::linalg::MeshOp = typedef mesh::MeshOp |
Definition at line 48 of file MeshShardingInterfaceImpl.cpp.
using mlir::linalg::MeshSharding = typedef mesh::MeshSharding |
Definition at line 46 of file MeshShardingInterfaceImpl.cpp.
using mlir::linalg::OptimizeCopyFn = typedef std::function<LogicalResult(RewriterBase &, tensor::PadOp, Value)> |
Definition at line 1493 of file Transforms.h.
using mlir::linalg::ProcInfoCallBackFn = typedef std::function<SmallVector<ProcInfo>( OpBuilder &b, Location loc, ArrayRef<Range> parallelLoopRanges)> |
using mlir::linalg::ReductionKind = typedef mesh::ReductionKind |
Definition at line 45 of file MeshShardingInterfaceImpl.cpp.
using mlir::linalg::ShardingArray = typedef mesh::ShardingArray |
Definition at line 47 of file MeshShardingInterfaceImpl.cpp.
using mlir::linalg::TileSizeComputationFunction = typedef std::function<SmallVector<Value, 4>(OpBuilder &, Operation *)> |
Definition at line 185 of file Transforms.h.
Scheme used to distribute loops to processors.
bool mlir::linalg::allIndexingsAreProjectedPermutation | ( | LinalgOp | op | ) |
Check if all indexing maps are projected permutations.
Definition at line 149 of file Utils.cpp.
Referenced by vectorizeLinalgOpPrecondition().
std::optional< Value > mlir::linalg::allocateGPUPrivateMemory | ( | OpBuilder & | builder, |
memref::SubViewOp | subview, | ||
ArrayRef< Value > | sizeBounds, | ||
DataLayout & | |||
) |
Allocate the subview in the GPU private memory.
Definition at line 495 of file Promotion.cpp.
References allocateSubviewGPUMemoryInAddressSpace().
std::optional< Value > mlir::linalg::allocateWorkgroupMemory | ( | OpBuilder & | builder, |
memref::SubViewOp | subview, | ||
ArrayRef< Value > | sizeBounds, | ||
DataLayout & | |||
) |
Allocate the subview in the GPU workgroup memory.
Definition at line 470 of file Promotion.cpp.
References allocateSubviewGPUMemoryInAddressSpace().
bool mlir::linalg::areDimSequencesPreserved | ( | ArrayRef< AffineMap > | maps, |
ArrayRef< ReassociationIndices > | dimSequences | ||
) |
Return true if all sequences of dimensions specified in dimSequences are contiguous in all the ranges of the maps.
Definition at line 1198 of file ElementwiseOpFusion.cpp.
References isDimSequencePreserved().
bool mlir::linalg::areElementwiseOpsFusable | ( | OpOperand * | fusedOperand | ) |
Return true if two linalg.generic operations with a producer/consumer relationship through fusedOperand can be fused using elementwise op fusion.
Conditions for elementwise fusion of generic operations.
Definition at line 131 of file ElementwiseOpFusion.cpp.
References mlir::IROperand< DerivedT, IRValueT >::get(), mlir::Value::getDefiningOp(), getIndexingMapOfProducerOperandsInCoordinatesOfFusedOp(), mlir::AffineMap::getNumResults(), mlir::detail::IROperandBase::getOwner(), mlir::Value::getType(), and mlir::AffineMap::isPermutation().
Referenced by fuseElementwiseOps().
FailureOr< PackResult > mlir::linalg::blockPackMatmul | ( | RewriterBase & | rewriter, |
linalg::LinalgOp | linalgOp, | ||
const ControlBlockPackMatmulFn & | controlPackMatmul | ||
) |
Pack a matmul operation into blocked 4D layout.
Relayout a matmul operation into blocked layout with two levels of subdivision:
A 2D matmul MxNxK gets reshaped into blocked 4D representation as: [MB][NB][mb][nb] += [MB][KB][mb][kb] * [NB][KB][nb][kb] where the (MB, NB, KB) dimensions represent the major blocks, and the (mb, nb, kb) are the minor blocks of their respective original 2D dimensions (M, N, K).
Depending on the initial operands' data layout and the specified packing options, the major blocks dimensions might get transposed e.g., [MB][KB] -> [KB][MB]. The minor blocks can also be transposed e.g., [mb][kb] -> [kb][mb]. Any present batch dimensions remain unchanged. The final result is unpacked back to the original shape.
Return failure if no valid packing options are provided.
Definition at line 139 of file BlockPackMatmul.cpp.
References mlir::getAsOpFoldResult(), mlir::Builder::getI64ArrayAttr(), inferContractionDims(), mlir::RewriterBase::notifyMatchFailure(), options, packMatmulGreedily(), mlir::OpBuilder::setInsertionPointAfter(), transposePackedMatmul(), and validateFullTilesOnDims().
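The blocked layout [MB][NB][mb][nb] += [MB][KB][mb][kb] * [NB][KB][nb][kb] can be checked against a plain 2D matmul with a standalone sketch. This is only a numeric model on flat integer buffers, not the MLIR rewrite itself: Buffer, matmul2D, pack and matmulBlocked are hypothetical names, and the sketch assumes that all dimensions divide evenly into full tiles.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Buffer = std::vector<int>;

// Reference: C[M][N] += A[M][K] * B[K][N], all row-major.
Buffer matmul2D(const Buffer &a, const Buffer &b, std::size_t M, std::size_t N,
                std::size_t K) {
  Buffer c(M * N, 0);
  for (std::size_t m = 0; m < M; ++m)
    for (std::size_t n = 0; n < N; ++n)
      for (std::size_t k = 0; k < K; ++k)
        c[m * N + n] += a[m * K + k] * b[k * N + n];
  return c;
}

// Packs a logical [rows][cols] matrix into [RB][CB][rb][cb] blocks.
// If transpose is set, src is stored row-major as [cols][rows], so both the
// major and the minor blocks end up transposed relative to the source.
Buffer pack(const Buffer &src, std::size_t rows, std::size_t cols,
            std::size_t rb, std::size_t cb, bool transpose) {
  assert(rows % rb == 0 && cols % cb == 0 && "only full tiles are handled");
  const std::size_t CB = cols / cb;
  Buffer dst(rows * cols);
  for (std::size_t r = 0; r < rows; ++r)
    for (std::size_t c = 0; c < cols; ++c) {
      std::size_t idx = (((r / rb) * CB + c / cb) * rb + r % rb) * cb + c % cb;
      dst[idx] = transpose ? src[c * rows + r] : src[r * cols + c];
    }
  return dst;
}

// Computes the matmul on packed operands and unpacks the result.
Buffer matmulBlocked(const Buffer &a, const Buffer &b, std::size_t M,
                     std::size_t N, std::size_t K, std::size_t mb,
                     std::size_t nb, std::size_t kb) {
  const std::size_t MB = M / mb, NB = N / nb, KB = K / kb;
  Buffer pa = pack(a, M, K, mb, kb, /*transpose=*/false); // [MB][KB][mb][kb]
  Buffer pb = pack(b, N, K, nb, kb, /*transpose=*/true);  // [NB][KB][nb][kb]
  Buffer pc(M * N, 0);                                    // [MB][NB][mb][nb]
  for (std::size_t bm = 0; bm < MB; ++bm)
    for (std::size_t bn = 0; bn < NB; ++bn)
      for (std::size_t bk = 0; bk < KB; ++bk)
        for (std::size_t m = 0; m < mb; ++m)
          for (std::size_t n = 0; n < nb; ++n)
            for (std::size_t k = 0; k < kb; ++k)
              pc[((bm * NB + bn) * mb + m) * nb + n] +=
                  pa[((bm * KB + bk) * mb + m) * kb + k] *
                  pb[((bn * KB + bk) * nb + n) * kb + k];
  // Unpack [MB][NB][mb][nb] back to the original row-major [M][N] shape.
  Buffer c(M * N);
  for (std::size_t m = 0; m < M; ++m)
    for (std::size_t n = 0; n < N; ++n)
      c[m * N + n] = pc[(((m / mb) * NB + n / nb) * mb + m % mb) * nb + n % nb];
  return c;
}
```

Calling matmulBlocked with any block sizes that evenly divide M, N and K should give the same buffer as matmul2D, which mirrors the statement that the final result is unpacked back to the original shape.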
Value mlir::linalg::bufferizeToAllocation | ( | RewriterBase & | rewriter, |
const BufferizeToAllocationOptions & | options, | ||
bufferization::AllocTensorOp | allocTensorOp, | ||
Attribute | memorySpace = {} , |
||
Operation * | insertionPoint = nullptr |
||
) |
Materialize a buffer allocation for the given bufferization.alloc_tensor op and lower the op to memref.alloc + memref.tensor_store.
In addition to rewriting the IR, this function returns the newly allocated buffer. The insertionPoint parameter can be used to specify a custom insertion point for the buffer allocation.
Definition at line 323 of file ConvertToDestinationStyle.cpp.
References mlir::OpBuilder::create(), createAllocationForTensor(), options, mlir::RewriterBase::replaceOp(), and mlir::OpBuilder::setInsertionPoint().
Value mlir::linalg::bufferizeToAllocation | ( | RewriterBase & | rewriter, |
const BufferizeToAllocationOptions & | options, | ||
Operation * | op, | ||
Attribute | memorySpace = {} , |
||
Operation * | insertionPoint = nullptr |
||
) |
Bufferize the given op with tensor semantics and materialize the result in a newly allocated buffer.
Only bufferizable ops that bufferize to a memory write or have an aliasing OpOperand (and do not themselves bufferize to an allocation) are supported. They are bufferized using their BufferizableOpInterface implementation.
Selected ops that bufferize to an allocation (or need special handling) are also supported:
This function returns the newly allocated buffer. The insertionPoint parameter can be used to specify a custom insertion point for the buffer allocation.
Definition at line 471 of file ConvertToDestinationStyle.cpp.
Value mlir::linalg::bufferizeToAllocation | ( | RewriterBase & | rewriter, |
const BufferizeToAllocationOptions & | options, | ||
tensor::PadOp | padOp, | ||
Attribute | memorySpace = {} , |
||
Operation * | insertionPoint = nullptr |
||
) |
Materialize a buffer allocation for the given tensor.pad op and lower the op to linalg.fill/linalg.generic + bufferization.materialize_in_destination.
E.g.:
  %0 = tensor.pad low[%l] high[%h] %t ...
is lowered to:
  %alloc = memref.alloc
  linalg.fill ... outs(%alloc)
  %subview = memref.subview %alloc [%l] [...] [1]
  bufferization.materialize_in_destination %t in %subview
  %0 = bufferization.to_tensor %alloc restrict writable
In addition to rewriting the IR as shown above, this function returns the newly allocated buffer. The insertionPoint parameter can be used to specify a custom insertion point for the buffer allocation.
Referenced by bufferizeToAllocation().
Value mlir::linalg::bufferizeToAllocation | ( | RewriterBase & | rewriter, |
const BufferizeToAllocationOptions & | options, | ||
vector::MaskOp | maskOp, | ||
Attribute | memorySpace = {} , |
||
Operation * | insertionPoint = nullptr |
||
) |
Materialize a buffer allocation for the given vector.mask op and bufferize the op, including its region.
E.g.:
  %0 = vector.mask {
    vector.transfer_write %v, %t : vector<16xf32>, tensor<?xf32>
  } : vector<16xi1> -> tensor<?xf32>
is lowered to:
  %alloc = memref.alloc
  bufferization.materialize_in_destination %t in %subview
  vector.mask {
    vector.transfer_write %arg0, %alloc : vector<16xf32>, memref<?xf32>
  } : vector<16xi1>
  %0 = bufferization.to_tensor %alloc restrict writable
In addition to rewriting the IR as shown above, this function returns the newly allocated buffer. The insertionPoint parameter can be used to specify a custom insertion point for the buffer allocation.
Definition at line 261 of file ConvertToDestinationStyle.cpp.
References bufferizeToAllocation(), options, and mlir::OpBuilder::setInsertionPoint().
FailureOr< CollapseResult > mlir::linalg::collapseOpIterationDims | ( | LinalgOp | op, |
ArrayRef< ReassociationIndices > | foldedIterationDims, | ||
RewriterBase & | rewriter | ||
) |
Collapses dimensions of linalg.generic/linalg.copy operation.
Implementation of fusion with reshape operation by collapsing dimensions.
A precondition to calling this method is that for each list in foldedIterationDims, the sequence of dimensions is contiguous in the domains of all indexing_maps of the linalgOp. This can be checked using the areDimSequencesPreserved method. When valid, the method also collapses the operands of the op. Returns replacement values of the results of the original linalgOp by inserting reshapes to get back values of compatible types.
Definition at line 1668 of file ElementwiseOpFusion.cpp.
SmallVector< std::optional< SliceParameters > > mlir::linalg::computeAllSliceParameters | ( | OpBuilder & | builder, |
Location | loc, | ||
LinalgOp | linalgOp, | ||
ValueRange | valuesToTile, | ||
ArrayRef< OpFoldResult > | ivs, | ||
ArrayRef< OpFoldResult > | tileSizes, | ||
ArrayRef< OpFoldResult > | sizeBounds, | ||
bool | omitPartialTileCheck | ||
) |
Computes SliceParameters for all valuesToTile of the given linalgOp, assuming linalgOp is being fused into a loop nest.
Calls computeSliceParameters for every individual value.
Note that a constant zero in tileSizes means no tiling at that implicit loop. The number of non-zero values in tileSizes should be equal to the number of values in ivs.
Some of the valuesToTile won't be affected by tiling. For these values, std::nullopt will be returned.
Definition at line 778 of file Utils.cpp.
References computeSliceParameters(), computeTileOffsets(), computeTileSizes(), and isTiled().
Referenced by makeTiledShapes().
FailureOr< ContinuousTileSizeSpecification > mlir::linalg::computeContinuousTileSizes | ( | OpBuilder & | builder, |
TilingInterface | op, | ||
unsigned | dimension, | ||
OpFoldResult | targetSize, | ||
bool | emitAssertions | ||
) |
Definition at line 162 of file Tiling.cpp.
FailureOr< MultiSizeSpecification > mlir::linalg::computeMultiTileSizes | ( | OpBuilder & | builder, |
LinalgOp | op, | ||
unsigned | dimension, | ||
OpFoldResult | targetSize, | ||
OpFoldResult | divisor, | ||
bool | emitAssertions = true |
||
) |
Emits the IR computing the multi-sized tiling specification with two tile sizes not exceeding targetSize, each divisible by sizeDivisor (the divisor parameter), such that there exist numbers of tiles with these sizes that fully cover the given iteration space dimension of the structured op.
The computation is as follows:
  b = originalTripCount floordiv sizeDivisor
  t = (targetSize + sizeDivisor - 1) floordiv sizeDivisor
  d = (b + t - 1) floordiv t
  s = (b floordiv d) * sizeDivisor
  v = b % d
  u = d - v
where the tile sizes are s and s + sizeDivisor, and the numbers of the corresponding tiles are u and v, respectively. Alternatively,
  s * u + (s + sizeDivisor) * v == original size, where s mod sizeDivisor = 0.
Expects all values to be positive. In some cases with the target tile size sufficiently close to the dimension shape and non-unit divisor, it is impossible to compute such sizes. If emitAssertions is set, also emit the assertion that size computation succeeded.
Returns the specification consisting of both tile values and the number of tiles of each size.
Definition at line 268 of file Tiling.cpp.
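The arithmetic above can be worked through on plain integers. The following is a minimal sketch under stated assumptions, not the MLIR implementation: StaticMultiSizeModel and computeMultiTileSizesModel are hypothetical names, all inputs are static and positive, and the sketch reports failure with std::nullopt whenever the coverage identity does not hold.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Plain-integer counterpart of the specification described above
// (hypothetical type, for illustration only).
struct StaticMultiSizeModel {
  int64_t lowTileSize;   // s
  int64_t highTileSize;  // s + sizeDivisor
  int64_t lowTripCount;  // u, number of tiles of size s
  int64_t highTripCount; // v, number of tiles of size s + sizeDivisor
};

std::optional<StaticMultiSizeModel>
computeMultiTileSizesModel(int64_t tripCount, int64_t targetSize,
                           int64_t sizeDivisor) {
  assert(tripCount > 0 && targetSize > 0 && sizeDivisor > 0 &&
         "all values are expected to be positive");
  int64_t b = tripCount / sizeDivisor;
  int64_t t = (targetSize + sizeDivisor - 1) / sizeDivisor;
  int64_t d = (b + t - 1) / t;
  if (d == 0)
    return std::nullopt; // tripCount < sizeDivisor: no tile fits
  int64_t s = (b / d) * sizeDivisor;
  int64_t v = b % d;
  int64_t u = d - v;
  // The tiles must cover the dimension exactly; otherwise no valid
  // specification exists for these inputs.
  if (s * u + (s + sizeDivisor) * v != tripCount)
    return std::nullopt;
  return StaticMultiSizeModel{s, s + sizeDivisor, u, v};
}
```

For a trip count of 100, a target size of 32 and a divisor of 4, the sketch yields three tiles of size 24 and one tile of size 28, which together cover 100 iterations. A trip count of 7 with a divisor of 2 cannot be covered, so the sketch returns std::nullopt.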
SliceParameters mlir::linalg::computeSliceParameters | ( | OpBuilder & | builder, |
Location | loc, | ||
Value | valueToTile, | ||
ArrayRef< OpFoldResult > | tileSizes, | ||
AffineMap | map, | ||
ArrayRef< OpFoldResult > | lbs, | ||
ArrayRef< OpFoldResult > | ubs, | ||
ArrayRef< OpFoldResult > | subShapeSizes, | ||
bool | omitPartialTileCheck | ||
) |
Computes SliceParameters for a single valueToTile assuming that its user is being tiled with the given loop bounds lbs and ubs and the tile sizes tileSizes.
omitPartialTileCheck controls whether to omit the partial/boundary tile condition check in cases where we statically know that it is unnecessary.
Definition at line 602 of file Utils.cpp.
References mlir::Value::getType(), mlir::linalg::SliceParameters::offsets, mlir::linalg::SliceParameters::sizes, and mlir::linalg::SliceParameters::strides.
Referenced by computeAllSliceParameters(), and makeTiledShape().
FailureOr< StaticContinuousTileSizeSpecification > mlir::linalg::computeStaticContinuousTileSizes | ( | LinalgOp | op, |
unsigned | dimension, | ||
unsigned | targetSize | ||
) |
Definition at line 111 of file Tiling.cpp.
FailureOr< StaticMultiSizeSpecification > mlir::linalg::computeStaticMultiTileSizes | ( | LinalgOp | op, |
unsigned | dimension, | ||
int64_t | targetSize, | ||
int64_t | divisor | ||
) |
Definition at line 242 of file Tiling.cpp.
SmallVector< OpFoldResult > mlir::linalg::computeTileOffsets | ( | OpBuilder & | b, |
Location | loc, | ||
ArrayRef< OpFoldResult > | ivs, | ||
ArrayRef< OpFoldResult > | tileSizes | ||
) |
Computes tile offsets, given a list of loop ivs and tileSizes.
In case a tile size is zero (i.e., no tiling), the corresponding offset is also zero.
Definition at line 710 of file Utils.cpp.
References mlir::Builder::getIndexAttr(), isTiled(), and mlir::isZeroIndex().
Referenced by computeAllSliceParameters().
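The relationship between ivs, tileSizes and the resulting offsets can be sketched on plain integers. This model assumes that the offset of a tiled loop is simply its induction variable and that ivs holds one entry per non-zero tile size; computeTileOffsetsModel is a hypothetical name and does not build any IR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Walks tileSizes and consumes one induction variable per tiled (non-zero)
// loop. Untiled loops (tile size 0) get offset 0.
std::vector<int64_t>
computeTileOffsetsModel(const std::vector<int64_t> &ivs,
                        const std::vector<int64_t> &tileSizes) {
  std::vector<int64_t> offsets;
  offsets.reserve(tileSizes.size());
  std::size_t nextIv = 0;
  for (int64_t tileSize : tileSizes) {
    if (tileSize == 0) {
      offsets.push_back(0); // no tiling along this loop
      continue;
    }
    assert(nextIv < ivs.size() && "one induction variable per tiled loop");
    offsets.push_back(ivs[nextIv++]);
  }
  assert(nextIv == ivs.size() && "every induction variable must be used");
  return offsets;
}
```

With ivs {16, 5} and tileSizes {8, 0, 4}, the offsets are {16, 0, 5}: the middle loop is untiled and contributes a zero offset.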
SmallVector< OpFoldResult > mlir::linalg::computeTileSizes | ( | OpBuilder & | b, |
Location | loc, | ||
ArrayRef< OpFoldResult > | tileSizes, | ||
ArrayRef< OpFoldResult > | sizeBounds | ||
) |
Computes tile sizes, given a list of tileSizes and dimension sizes (sizeBounds).
In case a tile size is zero (i.e., no tiling), the corresponding result size is the corresponding value from sizeBounds. Note: The returned tile sizes are closed intervals.
Definition at line 724 of file Utils.cpp.
References mlir::getAffineDimExpr(), mlir::Builder::getContext(), isTiled(), mlir::isZeroIndex(), and mlir::affine::makeComposedFoldedAffineApply().
Referenced by computeAllSliceParameters().
SmallVector< AffineExpr, 4 > mlir::linalg::concat | ( | ArrayRef< AffineExpr > | a, |
ArrayRef< AffineExpr > | b | ||
) |
Return the vector that is the concatenation of a and b.
Definition at line 2276 of file LinalgOps.cpp.
Referenced by mlir::presburger::Simplex::makeProduct().
Normal copy between src and dst.
Definition at line 503 of file Promotion.cpp.
References mlir::OpBuilder::create(), and mlir::Value::getLoc().
Create Memref copy operations and add gpu barrier guards before and after the copy operation to ensure data integrity.
Definition at line 486 of file Promotion.cpp.
References mlir::OpBuilder::create(), and mlir::Value::getLoc().
Definition at line 32 of file ConvertConv2DToImg2Col.cpp.
References mlir::OpBuilder::create(), and mlir::Value::getType().
Referenced by rewriteInIm2Col().
Definition at line 206 of file MeshShardingInterfaceImpl.cpp.
References createAllReduceForResultWithoutPartialSharding(), and getReductionKindOfLinalgOp().
Definition at line 185 of file MeshShardingInterfaceImpl.cpp.
References mlir::ImplicitLocOpBuilder::create(), mlir::mesh::MeshSharding::getMesh(), mlir::mesh::MeshSharding::getPartialAxes(), mlir::IRMapping::lookup(), and mlir::IRMapping::map().
Referenced by createAllReduceForResultsWithoutPartialShardings().
Definition at line 131 of file MeshShardingInterfaceImpl.cpp.
Definition at line 169 of file MeshShardingInterfaceImpl.cpp.
OpFoldResult mlir::linalg::createFoldedDimOp | ( | OpBuilder & | b, |
Location | loc, | ||
Value | val, | ||
int64_t | dim | ||
) |
Create one memref::DimOp or tensor::DimOp depending on the type of val.
This is a polymorphic convenience function to abstract away the rank and concrete type of val. Asserts that val is a memref or tensor type.
Definition at line 99 of file LinalgOps.cpp.
References createOrFoldDimOp(), mlir::Builder::getIndexAttr(), and mlir::Value::getType().
Referenced by fuse().
Definition at line 40 of file ConvertConv2DToImg2Col.cpp.
References mlir::convertScalarToDtype(), and mlir::OpBuilder::create().
Referenced by rewriteInIm2Col().
Create one memref::DimOp or tensor::DimOp depending on the type of val.
This is a polymorphic convenience function to abstract away the rank and concrete type of val. Asserts that val is a memref or tensor type.
Definition at line 90 of file LinalgOps.cpp.
References mlir::OpBuilder::createOrFold(), and mlir::Value::getType().
Referenced by concatSizesFromInputs(), createFoldedDimOp(), and mlir::sparse_tensor::sizesFromSrc().
In case of GPU private memory there is no need to deallocate since the memory is freed when going outside of the scope.
Definition at line 511 of file Promotion.cpp.
In case of GPU group memory there is no need to deallocate.
Definition at line 479 of file Promotion.cpp.
FailureOr< Operation * > mlir::linalg::decomposeWinogradFilterTransformOp | ( | RewriterBase & | rewriter, |
linalg::WinogradFilterTransformOp | op | ||
) |
Rewrite linalg.winograd_filter_transform.
The data layout of the filter is FHWC. The transformation matrix is 2-dimensional. We need to extract H x W from FHWC first. We generate 2 levels of loops to iterate on F and C. After the rewriting, we get
  scf.for f = lo_f to hi_f step 1
    scf.for c = lo_c to hi_c step 1
      extracted = extract filter<h x w> from filter<f x h x w x c>
      ret = linalg.matmul G, extracted
      ret = linalg.matmul ret, GT
      inserted = insert ret into filter<h x w x c x f>
Definition at line 1195 of file WinogradConv2D.cpp.
FailureOr< Operation * > mlir::linalg::decomposeWinogradInputTransformOp | ( | RewriterBase & | rewriter, |
linalg::WinogradInputTransformOp | op | ||
) |
Rewrite linalg.winograd_input_transform.
The data layout of the input is NHWC. The transformation matrix is 2-dimensional. We need to extract H x W from NHWC first. We generate 4 levels of loops to iterate on N, C, tileH, and tileW. After the rewriting, we get
  scf.for h = 0 to tileH step 1
    scf.for w = 0 to tileW step 1
      scf.for n = 0 to N step 1
        scf.for c = 0 to C step 1
          extracted = extract extracted<alphaH x alphaW> from input<N x H x W x C> at [n, (h x m), (w x m), c]
          ret = linalg.matmul BT, extracted
          ret = linalg.matmul ret, B
          inserted = insert ret<alphaH x alphaW> into output<alphaH x alphaW x tileH x tileW x N x C> at [0, 0, h, w, n, c]
Definition at line 1201 of file WinogradConv2D.cpp.
FailureOr< Operation * > mlir::linalg::decomposeWinogradOutputTransformOp | ( | RewriterBase & | rewriter, |
linalg::WinogradOutputTransformOp | op | ||
) |
Rewrite linalg.winograd_output_transform.
The data layout of the output is HWNF. The transformation matrix is 2-dimensional. We need to extract H x W from HWNF first. We generate 4 levels of loops to iterate on N, F, tileH, and tileW. After the transformation, we get
  scf.for h = 0 to tileH step 1
    scf.for w = 0 to tileW step 1
      scf.for n = 0 to N step 1
        scf.for f = 0 to F step 1
          extracted = extract extracted<alphaH x alphaW> from input<alphaH x alphaW x tileH x tileW x N x F> at [0, 0, h, w, n, f]
          ret = linalg.matmul AT, extracted
          ret = linalg.matmul ret, A
          inserted = insert ret<alphaH x alphaW> into output<N x H x W x F> at [n, (h x m), (w x m), f]
Definition at line 1207 of file WinogradConv2D.cpp.
FailureOr< DropUnitDimsResult > mlir::linalg::dropUnitDims | ( | RewriterBase & | rewriter, |
GenericOp | genericOp, | ||
const ControlDropUnitDims & | options | ||
) |
Definition at line 390 of file DropUnitDims.cpp.
References collapseValue(), mlir::concatAffineMaps(), mlir::OpBuilder::create(), dropUnitExtentFromOperandMetadata(), mlir::detail::enumerate(), expandValue(), mlir::getAffineConstantExpr(), mlir::getAffineDimExpr(), mlir::Builder::getContext(), mlir::AffineMap::getResults(), mlir::getType(), mlir::RewriterBase::inlineRegionBefore(), mlir::inversePermutation(), mlir::RewriterBase::notifyMatchFailure(), options, mlir::AffineMap::replaceDimsAndSymbols(), and replaceUnitDimIndexOps().
AffineMap mlir::linalg::extractOrIdentityMap | ( | std::optional< AffineMap > | maybeMap, |
unsigned | rank, | ||
MLIRContext * | context | ||
) |
Returns maybeMap.get() if maybeMap is set, otherwise returns the symbol-less identity map of rank.
Definition at line 2256 of file LinalgOps.cpp.
References mlir::AffineMap::get(), and mlir::AffineMap::getMultiDimIdentityMap().
FailureOr< mlir::linalg::ElementwiseOpFusionResult > mlir::linalg::fuseElementwiseOps | ( | RewriterBase & | rewriter, |
OpOperand * | fusedOperand | ||
) |
Find the results of the producer that have uses outside of the consumer, after the fusion.
Definition at line 331 of file ElementwiseOpFusion.cpp.
References areElementwiseOpsFusable(), mlir::AffineMap::compose(), mlir::OpBuilder::create(), mlir::detail::enumerate(), mlir::RewriterBase::eraseOp(), mlir::linalg::ElementwiseOpFusionResult::fusedOp, generateFusedElementwiseOpRegion(), mlir::IROperand< DerivedT, IRValueT >::get(), mlir::Builder::getAffineMapArrayAttr(), getIndexingMapOfProducerOperandsInCoordinatesOfFusedOp(), mlir::detail::IROperandBase::getOwner(), getPreservedProducerResults(), mlir::inversePermutation(), mlir::RewriterBase::notifyMatchFailure(), and mlir::linalg::ElementwiseOpFusionResult::replacements.
FailureOr< FusionInfo > mlir::linalg::fuseProducerOfTensor | ( | OpBuilder & | b, |
OpOperand & | consumerOpOperand | ||
) |
This implements the fusion part of the "tileAndFuse on tensors" transformation and thus requires the consumerOpOperand to be an extract_slice op (generally obtained by applying the tiling transformation).
Definition at line 227 of file Fusion.cpp.
References mlir::IROperand< DerivedT, IRValueT >::get(), and getProducerOfTensor().
FailureOr< FusionInfo > mlir::linalg::fuseProducerOfTensor | ( | OpBuilder & | b, |
OpResult | producerOpResult, | ||
OpOperand & | consumerOpOperand | ||
) |
This implements the fusion part of the "tileAndFuse on tensors" transformation and thus requires the consumerOpOperand to be an extract_slice op (generally obtained by applying the tiling transformation).
Assumes producerOfTensor is a Linalg op that produces consumerOpOperand.
Definition at line 239 of file Fusion.cpp.
References mlir::OpBuilder::create(), fuse(), mlir::IROperand< DerivedT, IRValueT >::get(), mlir::Value::getDefiningOp(), mlir::detail::IROperandBase::getOwner(), mlir::OpResult::getOwner(), mlir::Value::getParentBlock(), mlir::OpResult::getResultNumber(), mlir::Value::getType(), mlir::IROperand< DerivedT, IRValueT >::set(), and mlir::OpBuilder::setInsertionPoint().
FailureOr< GenericOp > mlir::linalg::generalizeNamedOp | ( | RewriterBase & | rewriter, |
LinalgOp | namedOp | ||
) |
Create a GenericOp from the given named operation namedOp and replace namedOp.
Return failure if namedOp is a GenericOp or lacks a region builder.
Definition at line 53 of file Generalization.cpp.
References mlir::OpBuilder::create(), generalizeNamedOpPrecondition(), mlir::RewriterBase::inlineRegionBefore(), mlir::RewriterBase::notifyMatchFailure(), and mlir::RewriterBase::replaceOp().
Referenced by packMatmulGreedily().
std::string mlir::linalg::generateLibraryCallName | ( | Operation * | op | ) |
Returns the name-mangled library call name to disambiguate between different overloads at the C level.
The name mangling scheme is basic and uses MLIR type names: drop the linalg. prefix, and the <, >, ? symbols from the type. Assumes op is a LinalgOp.
Examples:
linalg_fill_f32_viewf32
linalg_dot_viewxf32_viewxf32_viewf32
linalg_matmul_viewxxf32_viewxxf32_viewxxf32
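The example names above can be reproduced with a small standalone sketch. This is not the MLIR implementation: `mangleTypeName` and `mangleLibraryCallName` are hypothetical helpers, types are passed as plain strings, and two rules are assumptions read off the listed names (the `.` in the op name becomes `_`, and `memref` is abbreviated to `view`).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: abbreviates "memref" to "view" (an assumption taken
// from the example names) and strips '<', '>' and '?' from a textual type.
std::string mangleTypeName(std::string type) {
  const std::string memref = "memref";
  const std::string view = "view";
  for (std::size_t pos = type.find(memref); pos != std::string::npos;
       pos = type.find(memref, pos + view.size()))
    type.replace(pos, memref.size(), view);
  std::string out;
  for (char c : type)
    if (c != '<' && c != '>' && c != '?')
      out.push_back(c);
  return out;
}

// Hypothetical helper: builds a call name from an op name such as
// "linalg.fill" and the textual types of its operands, joined by '_'.
std::string mangleLibraryCallName(std::string opName,
                                  const std::vector<std::string> &operandTypes) {
  for (char &c : opName)
    if (c == '.')
      c = '_'; // assumption: the listed names keep "linalg_" as a prefix
  for (const std::string &type : operandTypes)
    opName += "_" + mangleTypeName(type);
  return opName;
}
```

For instance, calling `mangleLibraryCallName("linalg.dot", {"memref<?xf32>", "memref<?xf32>", "memref<f32>"})` yields the second name in the list above.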
Definition at line 2317 of file LinalgOps.cpp.
|
static |
Generates a loop nest consisting of scf.parallel and scf.for, depending on the iteratorTypes.
Consecutive parallel loops create a single scf.parallel operation; each sequential loop creates a new scf.for operation. The body of the innermost loop is populated by bodyBuilderFn that accepts a range of induction variables for all loops. ivStorage is used to store the partial list of induction variables.
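The grouping rule described above can be illustrated without any MLIR dependency. The sketch below only plans which loop operations would be created; `IteratorKind` and `planLoopNest` are hypothetical names, and distribution and body construction are deliberately ignored.

```cpp
#include <string>
#include <vector>

enum class IteratorKind { Parallel, Reduction };

// Hypothetical model: returns one entry per loop operation. A run of
// consecutive parallel iterators becomes a single "scf.parallel" entry;
// every other iterator becomes its own "scf.for" entry.
std::vector<std::string> planLoopNest(const std::vector<IteratorKind> &iterators) {
  std::vector<std::string> ops;
  bool inParallelRun = false;
  for (IteratorKind kind : iterators) {
    if (kind == IteratorKind::Parallel) {
      if (!inParallelRun)
        ops.push_back("scf.parallel"); // start a new parallel band
      inParallelRun = true;
    } else {
      ops.push_back("scf.for"); // each sequential loop gets its own op
      inParallelRun = false;
    }
  }
  return ops;
}
```

With iterator types (parallel, parallel, reduction, parallel), this plan contains three operations: one scf.parallel covering the first two loops, one scf.for, and a second scf.parallel.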
Definition at line 407 of file Utils.cpp.
References mlir::ArithBuilder::_and(), mlir::scf::buildLoopNest(), mlir::OpBuilder::create(), mlir::linalg::ProcInfo::distributionMethod, isParallelIterator(), None, and mlir::ArithBuilder::slt().
Referenced by mlir::linalg::GenerateLoopNest< LoopTy >::doit().
|
static |
Definition at line 81 of file MeshShardingInterfaceImpl.cpp.
std::optional< vector::CombiningKind > mlir::linalg::getCombinerOpKind | ( | Operation * | combinerOp | ) |
Return vector::CombiningKind for the given op.
Definition at line 513 of file Vectorization.cpp.
Referenced by buildMultiDimReduce().
|
static |
Definition at line 70 of file ConvertConv2DToImg2Col.cpp.
References mlir::bindSymbols(), mlir::AffineMap::get(), mlir::Builder::getContext(), and mlir::affine::makeComposedAffineApply().
Referenced by rewriteInIm2Col().
RewritePatternSet mlir::linalg::getLinalgTilingCanonicalizationPatterns | ( | MLIRContext * | ctx | ) |
Canonicalization patterns relevant to apply after tiling patterns.
These are applied automatically by the tiling pass but need to be applied manually when tiling is called programmatically.
Definition at line 858 of file Tiling.cpp.
References populateLinalgTilingCanonicalizationPatterns().
|
static |
Definition at line 105 of file MeshShardingInterfaceImpl.cpp.
llvm::SmallDenseSet< int > mlir::linalg::getPreservedProducerResults | ( | GenericOp | producer, |
GenericOp | consumer, | ||
OpOperand * | fusedOperand | ||
) |
Returns a set of indices of the producer's results which would be preserved after the fusion.
Definition at line 104 of file ElementwiseOpFusion.cpp.
References mlir::detail::enumerate(), and isOpOperandCanBeDroppedAfterFusedLinalgs().
Referenced by fuseElementwiseOps().
SmallVector<NamedAttribute> mlir::linalg::getPrunedAttributeList | ( | OpTy | op | ) |
std::optional< SmallVector< ReassociationIndices > > mlir::linalg::getReassociationMapForFoldingUnitDims | ( | ArrayRef< OpFoldResult > | mixedSizes | ) |
Get the reassociation maps to fold the result of an extract_slice (or source of an insert_slice) operation with given offsets and sizes to its rank-reduced version.
This is only done for the cases where the size is 1 and offset is 0. Strictly speaking the offset 0 is not required in general, but non-zero offsets are not handled by the SPIR-V backend at this point (and potentially cannot be handled).
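As a rough illustration of what such a reassociation looks like, the sketch below groups each unit-sized dimension with the next non-unit dimension and appends trailing unit dimensions to the last group. The grouping rule, the `foldUnitDims` name, and the use of `std::optional` for dynamic sizes are assumptions made for this example, not a statement of the actual implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

using Group = std::vector<std::size_t>;

// Hypothetical model: each size is a known constant or std::nullopt
// (dynamic, treated as non-unit). Returns groups of source dimensions that
// fold into one result dimension each.
std::vector<Group> foldUnitDims(
    const std::vector<std::optional<std::int64_t>> &sizes) {
  std::vector<Group> groups;
  Group current;
  for (std::size_t dim = 0; dim < sizes.size(); ++dim) {
    current.push_back(dim);
    if (sizes[dim] && *sizes[dim] == 1)
      continue; // keep accumulating unit dimensions
    groups.push_back(current);
    current.clear();
  }
  // Trailing unit dimensions join the last group; if there is no group at
  // all, the result stays empty (everything folds to rank 0).
  if (!current.empty() && !groups.empty())
    groups.back().insert(groups.back().end(), current.begin(), current.end());
  return groups;
}
```

Under these assumptions, sizes (1, 4, 1, 8, 1) produce the groups [0, 1] and [2, 3, 4], so a rank-5 slice folds to rank 2.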
Definition at line 887 of file Utils.cpp.
References mlir::detail::enumerate().
|
static |
Definition at line 51 of file MeshShardingInterfaceImpl.cpp.
|
static |
Definition at line 91 of file MeshShardingInterfaceImpl.cpp.
Referenced by createAllReduceForResultsWithoutPartialShardings().
SmallVector< Type > mlir::linalg::getTensorOutputTypes | ( | LinalgOp | op, |
ValueRange | operands | ||
) |
|
static |
Definition at line 27 of file ConvertConv2DToImg2Col.cpp.
Referenced by rewriteInIm2Col().
bool mlir::linalg::hasOnlyScalarElementwiseOp | ( | Region & | r | ) |
FailureOr< Value > mlir::linalg::hoistPaddingOnTensors | ( | RewriterBase & | rewriter, |
tensor::PadOp | opToHoist, | ||
int64_t | numLoops, | ||
ArrayRef< int64_t > | transposeVector, | ||
tensor::PadOp & | hoistedOp, | ||
SmallVectorImpl< GenericOp > & | transposeOps | ||
) |
Mechanically hoist padding operations on tensors by numLoops into a new, generally larger tensor.
This achieves packing of multiple padding ops into a larger tensor. On success, opToHoist is replaced by the cloned version in the packing loop so the caller can continue reasoning about the padding operation. If transposeVector is non-empty, hoist padding introduces a GenericOp to transpose the padded tensor before inserting it into the packed tensor. A transposeVector can change the storage order of the padded tensor but does not change the order of the pack or compute loops.
TODO: In the future, we should consider rewriting as a tensor.pack after hoisting since this abstraction is now available.
If hoistPaddingOnTensors is called with numLoops = 2 on the following IR.
IR resembling the following is produced:
Construct the packing loop nest.
Definition at line 938 of file HoistPadding.cpp.
References buildPackingLoopNestImpl(), mlir::tensor::computeTransposedType(), mlir::OpBuilder::create(), DBGS, mlir::Value::getDefiningOp(), mlir::Operation::getParentOfType(), makeTransposeOp(), replaceByPackingResult(), and mlir::OpBuilder::setInsertionPointAfter().
Referenced by hoistPaddingOnTensors(), and padAndHoistLinalgOp().
FailureOr< Value > mlir::linalg::hoistPaddingOnTensors | ( | tensor::PadOp | opToHoist, |
int64_t | numLoops, | ||
ArrayRef< int64_t > | transposeVector, | ||
tensor::PadOp & | hoistedOp, | ||
SmallVectorImpl< GenericOp > & | transposeOps | ||
) |
Calls into hoistPaddingOnTensors with a local IRRewriter.
Definition at line 1003 of file HoistPadding.cpp.
References hoistPaddingOnTensors().
void mlir::linalg::hoistRedundantVectorBroadcasts | ( | RewriterBase & | rewriter, |
Operation * | root | ||
) |
Hoist vector.extract/vector.broadcast pairs out of immediately enclosing scf::ForOp iteratively, if the following conditions are met:
To improve hoisting opportunities, call the moveLoopInvariantCode helper function on the candidate loop above which to hoist.
Definition at line 97 of file Hoisting.cpp.
References mlir::WalkResult::advance(), broadcast(), DBGS, mlir::IROperand< DerivedT, IRValueT >::get(), mlir::Value::hasOneUse(), mlir::WalkResult::interrupt(), mlir::RewriterBase::modifyOpInPlace(), mlir::moveLoopInvariantCode(), mlir::RewriterBase::moveOpAfter(), mlir::RewriterBase::replaceAllUsesWith(), replaceWithDifferentYield(), and mlir::Operation::walk().
void mlir::linalg::hoistRedundantVectorTransfers | ( | Operation * | root | ) |
Hoist vector.transfer_read/vector.transfer_write on buffers pairs out of immediately enclosing scf::ForOp iteratively, if the following conditions are true:
To improve hoisting opportunities, call the moveLoopInvariantCode helper function on the candidate loop above which to hoist. Hoisting the transfers results in scf::ForOp yielding the value that originally transited through memory.
TODO: To further improve hoisting opportunities, fold aliasing memref operations into respective vector.transfer_{read|write} operations and avoid using ops implementing ViewLikeOpInterface as the source for transfer ops.
WARNING: This hoisting does not model parallelism and is generally incorrect when used on distributed loops with memref semantics!
Definition at line 202 of file Hoisting.cpp.
References mlir::WalkResult::advance(), DBGS, mlir::getForwardSlice(), mlir::WalkResult::interrupt(), mlir::vector::isDisjointTransferSet(), mlir::moveLoopInvariantCode(), noAliasingUseInLoop(), mlir::DominanceInfo::properlyDominates(), and mlir::Operation::walk().
FailureOr< ContractionDimensions > mlir::linalg::inferContractionDims | ( | ArrayRef< AffineMap > | indexingMaps | ) |
Definition at line 380 of file LinalgInterfaces.cpp.
References inferContractionDimsImpl(), and inferIteratorsFromOutMap().
FailureOr< ContractionDimensions > mlir::linalg::inferContractionDims | ( | LinalgOp | linalgOp | ) |
Find at least 2 parallel (m and n) and 1 reduction (k) dimension candidates that form a matmul subcomputation within linalgOp.
These dimensions are such that:
This allows e.g. detecting that some contraction is embedded within linalgOp with some orthogonal heuristic. When multiple dimension occurrences exist that match batch, m, n, or k, indices are returned in sorted order. Returns a failure if any of m, n or k is empty.
Definition at line 372 of file LinalgInterfaces.cpp.
References inferContractionDimsImpl().
Referenced by blockPackMatmul(), packMatmulGreedily(), and validateFullTilesOnDims().
FailureOr< ConvolutionDimensions > mlir::linalg::inferConvolutionDims | ( | LinalgOp | linalgOp | ) |
Find at least 1 parallel (output_image) and reduction (filter_loop) dimension candidates that form a convolution subcomputation within linalgOp.
The LHS is assumed to be the convolution input while the RHS is assumed to be the filter. These dimensions are such that:
This allows e.g. detecting that some convolution is embedded within linalgOp with some orthogonal heuristic. When multiple dimension occurrences exist that match any classification, indices are returned in sorted order. Returns a failure if output_image (and implicitly filter_loop) is empty.
Definition at line 740 of file LinalgInterfaces.cpp.
References inferConvolutionDimsImpl().
SmallVector< Value > mlir::linalg::insertSlicesBack | ( | OpBuilder & | builder, |
Location | loc, | ||
LinalgOp | op, | ||
ValueRange | operands, | ||
ValueRange | results | ||
) |
Creates insert_slice ops that insert results back into larger tensors they were originally extracted from with extract_slice before being passed as operands to the given structured operation op or its clone.
Note that operands are not necessarily the actual operands of op; the operation serves only as a metadata container for operand types and positions.
FailureOr< GenericOp > mlir::linalg::interchangeGenericOp | ( | RewriterBase & | rewriter, |
GenericOp | genericOp, | ||
ArrayRef< unsigned > | interchangeVector | ||
) |
Interchanges the iterator_types and iterator_maps dimensions and adapts the index accesses of op.
This is an in-place transformation controlled by interchangeVector. An empty vector is interpreted as the identity permutation and the transformation returns early.
E.g. the permutation (i,j,k) -> (j,k,i) is expressed with interchangeVector = [1,2,0]. All values in interchangeVector must be integers in the range 0..op.rank without duplications (e.g. [1,1,2] is an invalid permutation).
Return failure if the permutation is not valid.
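The validity rule and the [1,2,0] example can be checked with a short standalone sketch. `isValidInterchange` and `applyInterchange` are hypothetical helpers that model loops as characters; the range is read here as [0, rank), which is an assumption about the wording above.

```cpp
#include <cstddef>
#include <vector>

// Returns true if `interchange` is empty (identity) or holds each value in
// [0, rank) exactly once.
bool isValidInterchange(const std::vector<unsigned> &interchange,
                        unsigned rank) {
  if (interchange.empty())
    return true; // interpreted as the identity permutation
  if (interchange.size() != rank)
    return false;
  std::vector<bool> seen(rank, false);
  for (unsigned idx : interchange) {
    if (idx >= rank || seen[idx])
      return false; // out of range or duplicated
    seen[idx] = true;
  }
  return true;
}

// Applies the permutation: result[i] = loops[interchange[i]].
std::vector<char> applyInterchange(const std::vector<char> &loops,
                                   const std::vector<unsigned> &interchange) {
  if (interchange.empty())
    return loops;
  std::vector<char> result;
  for (unsigned idx : interchange)
    result.push_back(loops[idx]);
  return result;
}
```

Applying [1,2,0] to the loops (i, j, k) gives (j, k, i), while [1,1,2] is rejected because the value 1 appears twice.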
Definition at line 50 of file Interchange.cpp.
References mlir::RewriterBase::finalizeOpModification(), mlir::AffineMap::getPermutationMap(), interchangeGenericOpPrecondition(), mlir::inversePermutation(), mlir::RewriterBase::notifyMatchFailure(), and mlir::RewriterBase::startOpModification().
Referenced by packMatmulGreedily().
bool mlir::linalg::isaContractionOpInterface | ( | LinalgOp | linalgOp | ) |
Checks whether linalgOp conforms to ContractionOpInterface.
Definition at line 453 of file LinalgInterfaces.cpp.
bool mlir::linalg::isaConvolutionOpInterface | ( | LinalgOp | linalgOp, |
bool | allowEmptyConvolvedDims = false |
||
) |
Checks whether linalgOp conforms to ConvolutionOpInterface.
By default, we require the linalgOp to have non-empty convolved dims (implicitly non-empty output_image and filter_loop). Users can loosen the constraint by setting allowEmptyConvolvedDims to true.
Definition at line 929 of file LinalgInterfaces.cpp.
References mlir::linalg::detail::isConvolutionInterfaceImpl(), and mlir::linalg::detail::Success.
bool mlir::linalg::isaCopyOpInterface | ( | LinalgOp | linalgOp | ) |
Checks whether linalgOp is semantically equivalent to a linalg.copyOp.
Definition at line 56 of file LinalgInterfaces.cpp.
Referenced by specializeGenericOp().
bool mlir::linalg::isaElemwiseSingleBinaryOpInterface | ( | GenericOp | genericOp | ) |
Checks whether genericOp is semantically equivalent to a single linalg elementwise binary op, e.g. linalg.sub.
bool mlir::linalg::isaElemwiseSingleUnaryOpInterface | ( | GenericOp | genericOp | ) |
Checks whether a given genericOp is semantically equivalent to a single linalg elementwise unary op, e.g. linalg.exp.
A linalg.generic body could be a series of unary elementwise ops, e.g. exp(neg(x)), such as formed by linalg op fusion. Here we restrict it to detecting cases where the body is a single computation op.
Referenced by specializeGenericOp().
std::optional< Value > mlir::linalg::isaFillOpInterface | ( | GenericOp | genericOp | ) |
Checks whether genericOp is semantically equivalent to a linalg.fill.
Returns the scalar fill value if true.
Definition at line 76 of file LinalgInterfaces.cpp.
References mlir::Block::back(), mlir::IROperand< DerivedT, IRValueT >::get(), mlir::Block::getArgument(), and mlir::Block::getOperations().
Referenced by specializeGenericOp().
bool mlir::linalg::isDimSequencePreserved | ( | AffineMap | indexingMap, |
ReassociationIndicesRef | dimSequence | ||
) |
Return true if a given sequence of dimensions is contiguous in the range of the specified indexing map.
For a given dimSequence, check if the sequence is conserved in the indexingMap.
indexingMap is expected to be a projected permutation. Non-existence of the sequence returns true as well.
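A minimal sketch of this kind of check, assuming the projected permutation is given simply as the list of dimension positions of its results (so (d0, d1, d2) -> (d2, d0, d1) is written {2, 0, 1}). The name `isSequencePreserved` and this encoding are hypothetical, and the sketch should not be read as the actual implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical model: `mapResults[i]` is the loop dimension referenced by
// result i of the indexing map. `dimSequence` must be non-empty.
bool isSequencePreserved(const std::vector<unsigned> &mapResults,
                         const std::vector<unsigned> &dimSequence) {
  for (std::size_t i = 0; i < mapResults.size(); ++i) {
    if (mapResults[i] == dimSequence.front()) {
      // Start of the sequence found: the rest must follow contiguously.
      if (i + dimSequence.size() > mapResults.size())
        return false;
      return std::equal(dimSequence.begin(), dimSequence.end(),
                        mapResults.begin() + i);
    }
    // Another element of the sequence appears before the sequence start.
    if (std::find(dimSequence.begin(), dimSequence.end(), mapResults[i]) !=
        dimSequence.end())
      return false;
  }
  // No element of the sequence appears in the map at all.
  return true;
}
```

With the sequence {0, 1}, the map {2, 0, 1} preserves it, {0, 2, 1} does not, and {2} trivially returns true because the sequence does not occur.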
Definition at line 1157 of file ElementwiseOpFusion.cpp.
References mlir::detail::enumerate(), mlir::AffineMap::getNumResults(), mlir::AffineMap::getResult(), mlir::AffineMap::getResults(), and mlir::AffineMap::isProjectedPermutation().
Referenced by areDimSequencesPreserved().
bool mlir::linalg::isElementwise | ( | LinalgOp | op | ) |
Check if a LinalgOp is an element-wise operation.
Definition at line 169 of file Utils.cpp.
Referenced by vectorizeLinalgOpPrecondition().
bool mlir::linalg::isParallelIterator | ( | utils::IteratorType | iteratorType | ) |
Check if iterator type has "parallel" semantics.
Definition at line 184 of file Utils.cpp.
Referenced by generateParallelLoopNest(), and getCollapsableIterationSpaceDims().
bool mlir::linalg::isReductionIterator | ( | utils::IteratorType | iteratorType | ) |
Check if iterator type has "reduction" semantics.
Definition at line 188 of file Utils.cpp.
Referenced by getCollapsableIterationSpaceDims(), getDimsToReduce(), and mlir::sparse_tensor::CodegenEnv::isAdmissibleTensorExp().
LogicalResult mlir::linalg::linalgOpAnchoredEmptyTensorEliminationStep | ( | RewriterBase & | rewriter, |
Operation * | op, | ||
bufferization::OneShotAnalysisState & | state | ||
) |
Try to eliminate tensor::EmptyOps inside op that are anchored on a LinalgOp.
This transform looks for LinalgOps that have an unused output operand and an input operand that is rooted in a tensor::EmptyOp. The tensor::EmptyOp uses are replaced with the output operand and the two operands of the LinalgOp are swapped.
Example:
  %0 = tensor.empty()
  %1 = linalg.matmul ins(...) outs(%0)
  %2 = linalg.generic ins(%1) outs(dest) {
    ^bb0(in: f32, out: f32):
    // out not used
  }
The IR is transformed as follows:
  %0 = tensor.empty()
  %1 = linalg.matmul ins(...) outs(dest)
  %2 = linalg.generic ins(%0) outs(%1) {
    ^bb0(in: f32, out: f32):
    // Use out instead of in
  }
The "ins" operand has no uses inside the body of the LinalgOp and can be folded away with existing cleanup patterns. Afterwards, the tensor::EmptyOp can also fold away.
Definition at line 40 of file EliminateEmptyTensors.cpp.
FailureOr< LinalgLoops > mlir::linalg::linalgOpToAffineLoops | ( | RewriterBase & | rewriter, |
LinalgOp | linalgOp | ||
) |
FailureOr< LinalgLoops > mlir::linalg::linalgOpToLoops | ( | RewriterBase & | rewriter, |
LinalgOp | linalgOp | ||
) |
FailureOr< LinalgLoops > mlir::linalg::linalgOpToParallelLoops | ( | RewriterBase & | rewriter, |
LinalgOp | linalgOp | ||
) |
FailureOr< LowerPackResult > mlir::linalg::lowerPack | ( | RewriterBase & | rewriter, |
tensor::PackOp | packOp | ||
) |
Rewrite pack as pad + reshape + transpose.
Definition at line 219 of file Transforms.cpp.
References mlir::applyPermutationToVector(), mlir::bindDims(), mlir::bindSymbols(), mlir::OpBuilder::create(), DBGS, DBGSNL, mlir::AffineMap::get(), mlir::Builder::getContext(), mlir::getElementTypeOrSelf(), mlir::Builder::getIndexAttr(), mlir::tensor::getMixedSize(), mlir::tensor::getMixedSizes(), mlir::tensor::getPackInverseDestPerm(), mlir::Builder::getZeroAttr(), mlir::invertPermutationVector(), mlir::isRankReducedType(), mlir::affine::makeComposedFoldedAffineApply(), mlir::RewriterBase::notifyMatchFailure(), mlir::RewriterBase::replaceOp(), mlir::OpBuilder::setInsertionPoint(), mlir::RankedTensorType::Builder::setShape(), and mlir::Success.
FailureOr< LowerUnPackOpResult > mlir::linalg::lowerUnPack | ( | RewriterBase & | rewriter, |
tensor::UnPackOp | unPackOp | ||
) |
Rewrite unpack as empty + transpose + reshape + extract_slice.
Definition at line 354 of file Transforms.cpp.
References mlir::applyPermutationToVector(), mlir::OpBuilder::create(), DBGS, DBGSNL, mlir::Builder::getIndexAttr(), mlir::tensor::getMixedSizes(), mlir::tensor::getUnPackInverseSrcPerm(), mlir::RewriterBase::replaceOp(), mlir::OpBuilder::setInsertionPoint(), and mlir::RankedTensorType::Builder::setShape().
SmallVector< AffineExpr, 4 > mlir::linalg::makeAffineDimExprs | ( | unsigned | num, |
unsigned & | startIdx, | ||
MLIRContext * | context | ||
) |
Returns num AffineDimExpr dimensions at positions [startIdx, startIdx + num) and increments startIdx to startIdx + num.
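The by-reference update of startIdx is the part that is easy to miss, so here is a small standalone model of it. `makeDimPositions` is a hypothetical stand-in that returns plain dimension positions instead of AffineDimExpr values.

```cpp
#include <vector>

// Hypothetical stand-in for AffineDimExpr creation: only the dimension
// position is kept. `startIdx` is advanced by `num` as a side effect.
std::vector<unsigned> makeDimPositions(unsigned num, unsigned &startIdx) {
  std::vector<unsigned> positions;
  for (unsigned i = 0; i < num; ++i)
    positions.push_back(startIdx++);
  return positions;
}
```

Starting from startIdx = 2, a first call with num = 3 returns positions 2, 3, 4 and leaves startIdx at 5, so a second call continues with fresh, non-overlapping dimensions.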
Definition at line 2267 of file LinalgOps.cpp.
References mlir::getAffineDimExpr().
Value mlir::linalg::makeComposedPadHighOp | ( | OpBuilder & | b, |
Location | loc, | ||
RankedTensorType | type, | ||
Value | source, | ||
Value | pad, | ||
bool | nofold | ||
) |
Create a tensor::PadOp that pads source to the size of the statically sized type whose static sizes are assumed to be greater than the dynamic source size.
The padding introduces trailing pad values until the target size is met. If source is defined by one or more LinalgOps that have been padded with the same value and sizes, return their padded result instead of creating a tensor::PadOp.
Example:
makeComposedPadHighOp(source=%3, pad=cst) returns %2
makeComposedPadHighOp(source=%3, pad=other_cst) returns %4
Definition at line 192 of file Utils.cpp.
References mlir::tensor::createPadHighOp(), mlir::Value::getDefiningOp(), mlir::OpResult::getResultNumber(), mlir::m_Constant(), and mlir::matchPattern().
Referenced by padOperandToSmallestStaticBoundingBox().
Returns a GenericOp that copies an n-D memref.
Unlike the current implementation of memref::CopyOp, this op can be further tiled, lowered to loops, or vectorized.
Definition at line 287 of file Utils.cpp.
References mlir::OpBuilder::create(), mlir::Builder::getContext(), mlir::AffineMap::getMultiDimIdentityMap(), and mlir::Value::getType().
std::tuple< SmallVector< Range, 4 >, LoopIndexToRangeIndexMap > mlir::linalg::makeTiledLoopRanges | ( | RewriterBase & | b, |
Location | loc, | ||
AffineMap | map, | ||
ArrayRef< OpFoldResult > | allShapeSizes, | ||
ArrayRef< OpFoldResult > | allTileSizes | ||
) |
Definition at line 49 of file Tiling.cpp.
References mlir::getConstantIntValue(), mlir::Builder::getIndexAttr(), mlir::AffineMap::getNumResults(), and mlir::affine::makeComposedFoldedMultiResultAffineApply().
Operation * mlir::linalg::makeTiledShape | ( | OpBuilder & | builder, |
Location | loc, | ||
Value | valueToTile, | ||
ArrayRef< OpFoldResult > | tileSizes, | ||
AffineMap | map, | ||
ArrayRef< OpFoldResult > | lbs, | ||
ArrayRef< OpFoldResult > | ubs, | ||
ArrayRef< OpFoldResult > | subShapeSizes, | ||
bool | omitPartialTileCheck | ||
) |
Creates an extract_slice/subview op for a single valueToTile with builder.
This new operation extracts a tile of valueToTile, starting at offsets lbs and with sizes subShapeSizes. omitPartialTileCheck controls whether to omit the partial/boundary tile condition check in cases where we statically know that it is unnecessary.
Definition at line 589 of file Utils.cpp.
References computeSliceParameters(), and materializeTiledShape().
SmallVector< Value > mlir::linalg::makeTiledShapes | ( | OpBuilder & | builder, |
Location | loc, | ||
LinalgOp | linalgOp, | ||
ValueRange | valuesToTile, | ||
ArrayRef< OpFoldResult > | ivs, | ||
ArrayRef< OpFoldResult > | tileSizes, | ||
ArrayRef< OpFoldResult > | sizeBounds, | ||
bool | omitPartialTileCheck | ||
) |
Creates extract_slice/subview ops for all valuesToTile of the given linalgOp with builder, assuming linalgOp is being fused into a loop nest for tiling with the given induction variables ivs and tile sizes tileSizes.
sizeBounds are the iteration space bounds for all the implicit loops in linalgOp. omitPartialTileCheck controls whether to omit the partial/boundary tile condition check in cases where we statically know that it is unnecessary.
Note that a constant zero in tileSizes means no tiling at that implicit loop. The number of non-zero values in tileSizes should be equal to the number of values in ivs.
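The relationship between tileSizes and ivs can be expressed as a simple precondition. The sketch below models tile sizes as plain integers; `numTiledLoops` and `ivsMatchTileSizes` are hypothetical helpers, not part of the Linalg API.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Counts the loops that are actually tiled: a tile size of 0 means "do not
// tile this loop", so it contributes no induction variable.
std::size_t numTiledLoops(const std::vector<std::int64_t> &tileSizes) {
  return static_cast<std::size_t>(
      std::count_if(tileSizes.begin(), tileSizes.end(),
                    [](std::int64_t size) { return size != 0; }));
}

// Hypothetical precondition check mirroring the note above.
bool ivsMatchTileSizes(const std::vector<std::int64_t> &tileSizes,
                       std::size_t numIvs) {
  return numTiledLoops(tileSizes) == numIvs;
}
```

For tile sizes (8, 0, 4) only the first and third loops are tiled, so exactly two induction variables are expected.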
Definition at line 829 of file Utils.cpp.
References computeAllSliceParameters(), and materializeTiledShape().
Referenced by fuse().
GenericOp mlir::linalg::makeTransposeOp | ( | OpBuilder & | b, |
Location | loc, | ||
Value | inputTensor, | ||
Value | outputTensor, | ||
ArrayRef< int64_t > | transposeVector | ||
) |
Returns a GenericOp that transposes inputTensor into outputTensor using transposeVector to permute the inputTensor dimensions.
Definition at line 252 of file Utils.cpp.
References mlir::OpBuilder::create(), mlir::OpBuilder::createBlock(), mlir::Block::getArgument(), mlir::Builder::getContext(), mlir::AffineMap::getMultiDimIdentityMap(), mlir::AffineMap::getPermutationMap(), mlir::Value::getType(), mlir::inversePermutation(), and mlir::isPermutationVector().
Referenced by hoistPaddingOnTensors().
|
static |
Definition at line 568 of file Utils.cpp.
References mlir::OpBuilder::create(), mlir::Value::getType(), mlir::linalg::SliceParameters::offsets, mlir::linalg::SliceParameters::sizes, and mlir::linalg::SliceParameters::strides.
Referenced by makeTiledShape(), and makeTiledShapes().
void mlir::linalg::offsetIndices | ( | OpBuilder & | b, |
LinalgOp | linalgOp, | ||
ArrayRef< OpFoldResult > | offests | ||
) |
void mlir::linalg::offsetIndices | ( | RewriterBase & | b, |
LinalgOp | linalgOp, | ||
ArrayRef< OpFoldResult > | offests | ||
) |
Definition at line 857 of file Utils.cpp.
References mlir::bindDims(), mlir::Builder::getContext(), mlir::Value::getDefiningOp(), mlir::detail::IROperandBase::getOwner(), mlir::getValueOrCreateConstantIndexOp(), mlir::affine::makeComposedFoldedAffineApply(), mlir::RewriterBase::replaceUsesWithIf(), and mlir::OpBuilder::setInsertionPointAfter().
FailureOr< PackResult > mlir::linalg::pack | ( | RewriterBase & | rewriter, |
linalg::LinalgOp | linalgOp, | ||
ArrayRef< OpFoldResult > | packedSizes | ||
) |
Implement packing of a single LinalgOp by packedSizes.
There must be one packedSizes entry per linalgOp iterator. Return the packed Linalg op on success, failure otherwise.
Definition at line 477 of file Transforms.cpp.
References mlir::OpBuilder::create(), DBGS, DBGSNL, mlir::getConstantIntValue(), mlir::getElementTypeOrSelf(), mlir::Operation::getRegion(), mlir::Value::getType(), mlir::ValueRange::getTypes(), mlir::Builder::getZeroAttr(), mlir::RewriterBase::notifyMatchFailure(), packLinalgMetadataOnce(), mlir::RewriterBase::replaceOp(), mlir::Region::takeBody(), and mlir::tile().
FailureOr< PackResult > mlir::linalg::packMatmulGreedily | ( | RewriterBase & | rewriter, |
LinalgOp | linalgOp, | ||
ArrayRef< OpFoldResult > | mnkPackedSizes, | ||
ArrayRef< int64_t > | mnkPaddedSizesNextMultipleOf, | ||
ArrayRef< int64_t > | mnkOrder | ||
) |
Pack a LinalgOp by greedily inferring matmul dimensions (m, n, k) where m and n are proper parallel dimensions and k is a proper reduction dimension.
Packing occurs by rewriting the op as a linalg.generic and calling linalg::pack by mnkPackedSizes. The order of the packed dimensions is customizable: the mnkOrder is a permutation of {0, 1, 2} to reorder {m, n, k} into one of the 6 possible forms. The outer dimensions of the operands are not permuted at this time; this is left for future work.
Definition at line 766 of file Transforms.cpp.
References mlir::computePermutationVector(), DBGS, DBGSNL, generalizeNamedOp(), inferContractionDims(), interchangeGenericOp(), mlir::isPermutationVector(), and mlir::RewriterBase::notifyMatchFailure().
Referenced by blockPackMatmul().
FailureOr< PackTransposeResult > mlir::linalg::packTranspose | ( | RewriterBase & | rewriter, |
tensor::PackOp | packOp, | ||
linalg::LinalgOp | linalgOp, | ||
tensor::UnPackOp | maybeUnPackOp, | ||
ArrayRef< int64_t > | outerPerm, | ||
ArrayRef< int64_t > | innerPerm | ||
) |
Transpose a single PackOp -> LinalgOp -> UnPackOp chain and return the transposed PackOp -> LinalgOp -> UnPackOp chain after replacements.
Return failure unless all of the following hold:
1. packOp has the linalgOp as its unique use.
2. maybeUnPackOp, if specified, is a consumer of the result tied to the unique packOp use.
3. outerPerm (resp. innerPerm) is a valid permutation of packOp.getOuterDimsPerm (resp. packOp.getInnerDimsPerm) or empty.
Definition at line 675 of file Transforms.cpp.
References mlir::OpOperand::getOperandNumber(), mlir::detail::IROperandBase::getOwner(), mlir::isPermutationVector(), mlir::RewriterBase::notifyMatchFailure(), mlir::RewriterBase::replaceOp(), mlir::OpBuilder::setInsertionPoint(), and transposeOneLinalgOperandAndReplace().
Referenced by transposePackedMatmul().
FailureOr< LinalgOp > mlir::linalg::padAndHoistLinalgOp | ( | RewriterBase & | rewriter, |
LinalgOp | linalgOp, | ||
const LinalgPaddingOptions & | options | ||
) |
Apply padding and hoisting to linalgOp according to the configuration specified in options.
Definition at line 265 of file Padding.cpp.
References mlir::detail::enumerate(), mlir::IROperand< DerivedT, IRValueT >::get(), mlir::Value::getDefiningOp(), hoistPaddingOnTensors(), mlir::linalg::LinalgPaddingOptions::None, mlir::RewriterBase::notifyMatchFailure(), options, mlir::RewriterBase::replaceOp(), and rewriteAsPaddedOp().
SmallVector< Value > mlir::linalg::peelLoop | ( | RewriterBase & | rewriter, |
Operation * | op | ||
) |
Try to peel and canonicalize loop op and return the new result.
Also applies affine_min/max bounds simplification on the fly where relevant.
Definition at line 59 of file Transforms.cpp.
Referenced by peelLoops().
void mlir::linalg::peelLoops | ( | RewriterBase & | rewriter, |
ArrayRef< scf::ForOp > | loops | ||
) |
Peels 'loops' and applies affine_min/max bounds simplification on the fly where relevant.
Definition at line 75 of file Transforms.cpp.
References peelLoop().
void mlir::linalg::populateBlockPackMatmulPatterns | ( | RewritePatternSet & | patterns, |
const ControlBlockPackMatmulFn & | controlFn | ||
) |
Patterns to block pack Linalg matmul ops.
Definition at line 310 of file BlockPackMatmul.cpp.
References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().
void mlir::linalg::populateBubbleUpExtractSliceOpPatterns | ( | RewritePatternSet & | patterns | ) |
Patterns that are used to bubble up extract slice op above linalg op.
Definition at line 134 of file BubbleUpExtractSlice.cpp.
References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().
void mlir::linalg::populateCollapseDimensions | ( | RewritePatternSet & | patterns, |
const GetCollapsableDimensionsFn & | controlCollapseDimensions | ||
) |
Pattern to collapse dimensions in a linalg.generic op.
This will collapse tensor operands when needed and expand back the result tensors.
Definition at line 2147 of file ElementwiseOpFusion.cpp.
References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().
void mlir::linalg::populateConstantFoldLinalgOperations | ( | RewritePatternSet & | patterns, |
const ControlFusionFn & | controlFn | ||
) |
Patterns to constant fold Linalg operations.
Definition at line 306 of file ConstantFold.cpp.
References mlir::RewritePatternSet::getContext(), and mlir::RewritePatternSet::insert().
void mlir::linalg::populateContractionOpRankReducingPatterns | ( | RewritePatternSet & | patterns | ) |
Adds patterns that reduce the rank of named contraction ops that have unit dimensions in the operand(s) by converting to a sequence of collapse_shape, <corresponding linalg named op>, expand_shape (if on tensors).
For example a linalg.batch_matmul with unit batch size will convert to linalg.matmul and a linalg.matvec with unit spatial dim in lhs will convert to a linalg.dot.
Definition at line 1070 of file DropUnitDims.cpp.
References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().
void mlir::linalg::populateConvertConv2DToImg2ColPatterns | ( | RewritePatternSet & | patterns | ) |
Populates patterns to transform linalg.conv_2d_xxx operations into linalg.generic (for img2col packing) and linalg.matmul.
Definition at line 687 of file ConvertConv2DToImg2Col.cpp.
References mlir::RewritePatternSet::getContext(), and mlir::RewritePatternSet::insert().
void mlir::linalg::populateConvertToDestinationStylePatterns | ( | RewritePatternSet & | patterns | ) |
Populate patterns that convert non-destination-style ops to destination style ops.
Definition at line 605 of file ConvertToDestinationStyle.cpp.
References mlir::RewritePatternSet::add().
void mlir::linalg::populateConvolutionVectorizationPatterns | ( | RewritePatternSet & | patterns, |
PatternBenefit | benefit = 1 |
||
) |
Populate patterns for vectorizing low-D convolution ops.
This is a step in progressive lowering for convolution ops; it assumes high-D convolution ops were decomposed previously.
Definition at line 3756 of file Vectorization.cpp.
References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().
void mlir::linalg::populateDataLayoutPropagationPatterns | ( | RewritePatternSet & | patterns, |
const ControlPropagationFn & | controlPackUnPackPropagation | ||
) |
Patterns to bubble up or down data layout ops across other operations.
Definition at line 1206 of file DataLayoutPropagation.cpp.
References mlir::RewritePatternSet::getContext(), and mlir::RewritePatternSet::insert().
void mlir::linalg::populateDecomposeConvolutionPatterns | ( | RewritePatternSet & | patterns, |
PatternBenefit | benefit = 1 |
||
) |
Linalg decompose convolutions patterns.
Populates patterns to decompose high-D convolution ops into low-D ones. This is a step in progressive lowering for convolution ops; afterwards the low-D convolution ops can be vectorized.
Definition at line 1565 of file Transforms.cpp.
References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().
void mlir::linalg::populateDecomposeLinalgOpsPattern | ( | RewritePatternSet & | patterns, |
bool | removeDeadArgsAndResults = true |
||
) |
Populate patterns for splitting a LinalgOp with multiple statements within its payload into multiple GenericOp ops that each have a single statement.
The option removeDeadArgsAndResults adds patterns to remove dead arguments and results from the generated decomposed ops. It defaults to true since the core decomposition patterns rely on these cleanup patterns. It is set to false only for testing purposes.
Definition at line 384 of file DecomposeLinalgOps.cpp.
References mlir::RewritePatternSet::getContext(), mlir::RewritePatternSet::insert(), and populateEraseUnusedOperandsAndResultsPatterns().
void mlir::linalg::populateDecomposeWinogradOpsPatterns | ( | RewritePatternSet & | patterns | ) |
Patterns to decompose Winograd operators.
Definition at line 1219 of file WinogradConv2D.cpp.
References mlir::RewritePatternSet::getContext(), and mlir::RewritePatternSet::insert().
void mlir::linalg::populateElementwiseOpsFusionPatterns | ( | RewritePatternSet & | patterns, |
const ControlFusionFn & | controlElementwiseOpFusion | ||
) |
Patterns for fusing linalg operations on tensors.
Pattern to fuse linalg.generic
-> linalg.generic
operations when both operations are fusable elementwise operations.
Definition at line 2136 of file ElementwiseOpFusion.cpp.
References mlir::RewritePatternSet::add(), mlir::RewritePatternSet::getContext(), and populateEraseUnusedOperandsAndResultsPatterns().
void mlir::linalg::populateElementwiseToLinalgConversionPatterns | ( | RewritePatternSet & | patterns | ) |
Populate patterns that convert ElementwiseMappable
ops to linalg parallel loops.
Definition at line 115 of file ElementwiseToLinalg.cpp.
References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().
void mlir::linalg::populateEraseUnnecessaryInputsPatterns | ( | RewritePatternSet & | patterns | ) |
Patterns to promote inputs to outputs and remove unused inputs of linalg.generic
ops.
Definition at line 428 of file EraseUnusedOperandsAndResults.cpp.
References mlir::RewritePatternSet::getContext(), and mlir::RewritePatternSet::insert().
void mlir::linalg::populateEraseUnusedOperandsAndResultsPatterns | ( | RewritePatternSet & | patterns | ) |
Pattern to remove dead operands and results of linalg.generic
operations.
This is effectively DCE for a linalg op.
Definition at line 421 of file EraseUnusedOperandsAndResults.cpp.
References mlir::RewritePatternSet::getContext(), and mlir::RewritePatternSet::insert().
Referenced by populateDecomposeLinalgOpsPattern(), and populateElementwiseOpsFusionPatterns().
void mlir::linalg::populateFoldReshapeOpsByCollapsingPatterns | ( | RewritePatternSet & | patterns, |
const ControlFusionFn & | controlFoldingReshapes | ||
) |
Patterns to fold an expanding tensor.expand_shape operation with its producer generic operation by collapsing the dimensions of the generic op.
Definition at line 2127 of file ElementwiseOpFusion.cpp.
References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().
void mlir::linalg::populateFoldReshapeOpsByExpansionPatterns | ( | RewritePatternSet & | patterns, |
const ControlFusionFn & | controlFoldingReshapes | ||
) |
Patterns to fold an expanding (collapsing) tensor_reshape operation with its producer (consumer) generic operation by expanding the dimensionality of the loop in the generic op.
Definition at line 2116 of file ElementwiseOpFusion.cpp.
References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().
void mlir::linalg::populateFoldUnitExtentDimsPatterns | ( | RewritePatternSet & | patterns, |
linalg::ControlDropUnitDims & | options | ||
) |
Patterns to fold unit-extent dimensions in operands/results of linalg ops on tensors via reassociative reshape ops.
Definition at line 801 of file DropUnitDims.cpp.
References mlir::linalg::ControlDropUnitDims::ExtractInsertSlice, options, populateFoldUnitExtentDimsViaReshapesPatterns(), and populateFoldUnitExtentDimsViaSlicesPatterns().
void mlir::linalg::populateFuseTensorPadWithProducerLinalgOpPatterns | ( | RewritePatternSet & | patterns | ) |
Pattern to fuse a tensor.pad
operation with the producer of its source, if the producer is a linalg
operation with all parallel iterator types.
Definition at line 121 of file FusePadOpWithLinalgProducer.cpp.
References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().
void mlir::linalg::populateInlineConstantOperandsPatterns | ( | RewritePatternSet & | patterns | ) |
Patterns that are used to inline constant operands into linalg generic ops.
Definition at line 95 of file InlineScalarOperands.cpp.
References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().
void mlir::linalg::populateLinalgGenericOpsSpecializationPatterns | ( | RewritePatternSet & | patterns | ) |
Populates patterns
with patterns to convert linalg.generic ops to named ops where possible.
A linalg.generic can represent a wide range of complex computations for which an equivalent linalg named op may not exist, e.g. a linalg.generic that takes a tensor and computes a polynomial such as p(x) = an*x^n + ... + a1*x + a0. There is no equivalent named op to convert to. Many such cases exist.
Definition at line 328 of file Specialize.cpp.
References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().
void mlir::linalg::populateLinalgNamedOpConversionPatterns | ( | RewritePatternSet & | patterns | ) |
Patterns to convert from one named op to another.
These can be seen as canonicalizations of named ops into another named op.
Definition at line 161 of file NamedOpConversions.cpp.
References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().
void mlir::linalg::populateLinalgNamedOpsGeneralizationPatterns | ( | RewritePatternSet & | patterns | ) |
Linalg generalization patterns.
Populates patterns
with patterns to convert spec-generated named ops to linalg.generic ops.
Definition at line 95 of file Generalization.cpp.
References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().
void mlir::linalg::populateLinalgTilingCanonicalizationPatterns | ( | RewritePatternSet & | patterns | ) |
Definition at line 864 of file Tiling.cpp.
References mlir::RewritePatternSet::getContext().
Referenced by getLinalgTilingCanonicalizationPatterns().
void mlir::linalg::populateLinalgToStandardConversionPatterns | ( | RewritePatternSet & | patterns | ) |
Populate the given list with patterns that convert from Linalg to Standard.
Definition at line 127 of file LinalgToStandard.cpp.
References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().
void mlir::linalg::populateMoveInitOperandsToInputPattern | ( | RewritePatternSet & | patterns | ) |
A pattern that converts init operands to input operands.
Definition at line 813 of file DropUnitDims.cpp.
References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().
void mlir::linalg::populatePadOpVectorizationPatterns | ( | RewritePatternSet & | patterns, |
PatternBenefit | baseBenefit = 1 |
||
) |
Populates patterns
with patterns that vectorize tensor.pad.
These patterns are meant to apply in a complementary fashion. Benefits are used to encode a certain ordering of pattern application. To avoid scattering magic constants throughout the code base, the patterns must be added with this function. baseBenefit
can be used to offset the benefit of all tensor::PadOp vectorization patterns by a certain value.
Definition at line 2659 of file Vectorization.cpp.
References mlir::RewritePatternSet::add(), mlir::PatternBenefit::getBenefit(), and mlir::RewritePatternSet::getContext().
void mlir::linalg::populateSparseTensorRewriting | ( | RewritePatternSet & | patterns | ) |
Populate patterns that are only useful in the context of sparse tensors.
void mlir::linalg::populateSplitReductionPattern | ( | RewritePatternSet & | patterns, |
const ControlSplitReductionFn & | controlSplitReductionFn, | ||
bool | useAlloc = false |
||
) |
Patterns to apply splitReduction
below.
Definition at line 448 of file SplitReduction.cpp.
References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().
void mlir::linalg::populateSwapExtractSliceWithFillPatterns | ( | RewritePatternSet & | patterns | ) |
Adds patterns that swap tensor.extract_slice(linalg.fill(cst, init)) into linalg.fill(cst, tensor.extract_slice(init)).
Definition at line 38 of file SwapExtractSliceWithFillPatterns.cpp.
References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().
void mlir::linalg::populateTranposeConv2DPatterns | ( | RewritePatternSet & | patterns | ) |
Definition at line 141 of file TransposeConv2D.cpp.
References mlir::RewritePatternSet::getContext(), and mlir::RewritePatternSet::insert().
void mlir::linalg::populateTransposeMatmulPatterns | ( | RewritePatternSet & | patterns, |
bool | transposeLHS = true |
||
) |
Patterns to convert Linalg matmul ops to transposed variants.
Definition at line 157 of file TransposeMatmul.cpp.
References mlir::RewritePatternSet::add(), and mlir::RewritePatternSet::getContext().
void mlir::linalg::populateWinogradConv2DPatterns | ( | RewritePatternSet & | patterns, |
int64_t | m, | ||
int64_t | r | ||
) |
Patterns to apply Winograd Conv2D algorithm F(m x m, r x r).
Definition at line 1212 of file WinogradConv2D.cpp.
FailureOr< PromotionInfo > mlir::linalg::promoteSubviewAsNewBuffer | ( | OpBuilder & | b, |
Location | loc, | ||
memref::SubViewOp | subView, | ||
const AllocBufferCallbackFn & | allocationFn, | ||
DataLayout & | layout | ||
) |
Definition at line 238 of file Promotion.cpp.
References mlir::OpBuilder::create(), mlir::OpBuilder::createOrFold(), mlir::detail::enumerate(), mlir::Builder::getIndexAttr(), and mlir::getValueOrCreateConstantIndexOp().
Referenced by promoteSubViews().
FailureOr< LinalgOp > mlir::linalg::promoteSubViews | ( | OpBuilder & | b, |
LinalgOp | op, | ||
const LinalgPromotionOptions & | options | ||
) |
Promote the subViews
into a new buffer allocated at the insertion point b
.
Promotion occurs in 3 steps:
Return the modified linalg op (the modification happens in place) as well as all the copy ops created.
Definition at line 421 of file Promotion.cpp.
References options, and promoteSubViews().
LogicalResult mlir::linalg::promoteSubviewsPrecondition | ( | Operation * | op, |
LinalgPromotionOptions | options | ||
) |
Promote memref.subviews feeding linalg-on-buffers operations.
Definition at line 399 of file Promotion.cpp.
|
static |
Variadic helper function.
Definition at line 344 of file MeshShardingInterfaceImpl.cpp.
Referenced by registerMeshShardingInterfaceExternalModels().
void mlir::linalg::registerAllDialectInterfaceImplementations | ( | DialectRegistry & | registry | ) |
Definition at line 17 of file AllInterfaces.cpp.
References mlir::arith::registerBufferizableOpInterfaceExternalModels(), registerMeshShardingInterfaceExternalModels(), registerSubsetOpInterfaceExternalModels(), registerTilingInterfaceExternalModels(), and mlir::affine::registerValueBoundsOpInterfaceExternalModels().
Referenced by mlir::registerAllDialects().
void mlir::linalg::registerBufferizableOpInterfaceExternalModels | ( | DialectRegistry & | registry | ) |
Definition at line 196 of file BufferizableOpInterfaceImpl.cpp.
References mlir::DialectRegistry::addExtension().
void mlir::linalg::registerMeshShardingInterfaceExternalModels | ( | DialectRegistry & | registry | ) |
Definition at line 348 of file MeshShardingInterfaceImpl.cpp.
References mlir::DialectRegistry::addExtension(), mlir::MLIRContext::appendDialectRegistry(), mlir::DialectRegistry::getDialectNames(), mlir::MLIRContext::getOrLoadDialect(), mlir::DialectRegistry::insert(), and registerAll().
Referenced by registerAllDialectInterfaceImplementations().
|
static |
Definition at line 338 of file MeshShardingInterfaceImpl.cpp.
void mlir::linalg::registerRuntimeVerifiableOpInterfaceExternalModels | ( | DialectRegistry & | registry | ) |
Definition at line 122 of file RuntimeOpVerification.cpp.
References mlir::DialectRegistry::addExtension(), and mlir::MLIRContext::loadDialect().
Referenced by mlir::registerAllDialects().
void mlir::linalg::registerSubsetOpInterfaceExternalModels | ( | DialectRegistry & | registry | ) |
Definition at line 75 of file SubsetInsertionOpInterfaceImpl.cpp.
References mlir::DialectRegistry::addExtension().
Referenced by registerAllDialectInterfaceImplementations().
void mlir::linalg::registerTilingInterfaceExternalModels | ( | DialectRegistry & | registry | ) |
Definition at line 522 of file TilingInterfaceImpl.cpp.
References mlir::DialectRegistry::addExtension(), and registerAll().
Referenced by registerAllDialectInterfaceImplementations().
void mlir::nvgpu::registerTransformDialectExtension | ( | DialectRegistry & | registry | ) |
Definition at line 60 of file DialectExtension.cpp.
References mlir::DialectRegistry::addExtensions().
Referenced by mlir::registerAllExtensions().
void mlir::linalg::registerValueBoundsOpInterfaceExternalModels | ( | DialectRegistry & | registry | ) |
Definition at line 54 of file ValueBoundsOpInterfaceImpl.cpp.
References mlir::DialectRegistry::addExtension().
LogicalResult mlir::linalg::rewriteAsPaddedOp | ( | RewriterBase & | rewriter, |
LinalgOp | opToPad, | ||
const LinalgPaddingOptions & | options, | ||
LinalgOp & | paddedOp, | ||
SmallVector< Value > & | replacements, | ||
SmallVector< tensor::PadOp > & | padOps | ||
) |
Pad the iterator dimensions paddingDimensions
of all opToPad
operands to a static bounding box.
The original opToPad
is cloned and operates on the padded tensors.
The unpadded results are returned via replacements, and the created tensor::PadOp ops are returned via padOps.
Definition at line 153 of file Padding.cpp.
References mlir::clone(), mlir::OpBuilder::create(), DBGS, mlir::detail::enumerate(), mlir::get(), mlir::getElementTypeOrSelf(), mlir::Builder::getIndexAttr(), mlir::Value::getType(), mlir::ValueRange::getTypes(), mlir::Builder::getZeroAttr(), mlir::linalg::LinalgPaddingOptions::LinalgCopy, mlir::linalg::LinalgPaddingOptions::None, mlir::RewriterBase::notifyMatchFailure(), options, padOperandToSmallestStaticBoundingBox(), mlir::reifyResultShapes(), and mlir::OpBuilder::setInsertionPointAfter().
Referenced by padAndHoistLinalgOp().
FailureOr< Operation * > mlir::linalg::rewriteInDestinationPassingStyle | ( | RewriterBase & | rewriter, |
tensor::FromElementsOp | fromElementsOp | ||
) |
Rewrite tensor.from_elements in destination-passing style.
Lower tensor.from_elements to a sequence of chained tensor.insert.
Definition at line 345 of file ConvertToDestinationStyle.cpp.
References mlir::OpBuilder::create(), createInserts(), mlir::Value::getDefiningOp(), mlir::RewriterBase::replaceOp(), and mlir::RewriterBase::replaceOpWithNewOp().
FailureOr< Operation * > mlir::linalg::rewriteInDestinationPassingStyle | ( | RewriterBase & | rewriter, |
tensor::GenerateOp | generateOp | ||
) |
Rewrite tensor.generate to linalg.generic.
Definition at line 383 of file ConvertToDestinationStyle.cpp.
References mlir::OpBuilder::create(), mlir::OpBuilder::createBlock(), mlir::Builder::getMultiDimIdentityMap(), mlir::RewriterBase::mergeBlocks(), mlir::RewriterBase::replaceOp(), mlir::RewriterBase::replaceOpWithNewOp(), and mlir::OpBuilder::setInsertionPointToStart().
FailureOr< Operation * > mlir::linalg::rewriteInDestinationPassingStyle | ( | RewriterBase & | rewriter, |
tensor::PadOp | padOp | ||
) |
Rewrite tensor.pad to linalg.generic + tensor.insert_slice.
Definition at line 424 of file ConvertToDestinationStyle.cpp.
References mlir::OpBuilder::create(), mlir::Builder::getIndexAttr(), mlir::tensor::getMixedSizes(), mlir::isZeroIndex(), movePaddingToFillOrGenericOp(), mlir::RewriterBase::notifyMatchFailure(), mlir::reifyResultShapes(), mlir::RewriterBase::replaceOpWithNewOp(), and mlir::OpBuilder::setInsertionPointAfter().
FailureOr< std::pair< Operation *, Operation * > > mlir::linalg::rewriteInIm2Col | ( | RewriterBase & | rewriter, |
linalg::Conv2DNchwFchwOp | convOp | ||
) |
Similar to rewriteInIm2Col with linalg::Conv2DNhwcHwcfOp, except that, because the channels are to the left of the image shape dimensions, the position of the contraction dimension in the resulting matmul is reversed.
This swaps the LHS and RHS of the matmul when compared with nhwc (i.e. (D, C x Kh x Kw) * (C x Kh x Kw, Ho x Wo))
Definition at line 365 of file ConvertConv2DToImg2Col.cpp.
References mlir::bindDims(), mlir::OpBuilder::create(), createAdd(), createMul(), mlir::AffineMap::get(), mlir::get(), mlir::Builder::getContext(), getConvolvedIndex(), mlir::Value::getLoc(), mlir::AffineMap::getMultiDimIdentityMap(), mlir::Value::getType(), mlir::getType(), hasAllOneValues(), mlir::RewriterBase::notifyMatchFailure(), mlir::RewriterBase::replaceOp(), and unrollIndex().
FailureOr< std::pair< Operation *, Operation * > > mlir::linalg::rewriteInIm2Col | ( | RewriterBase & | rewriter, |
linalg::Conv2DNhwcFhwcOp | convOp | ||
) |
Same as the above but for Fhwc channel orderings in the filter.
In this case the matrix multiplication is actually a row-wise dot-product rather than a row-column dot-product. This is to avoid transposing the filter matrix which would be required for a regular matrix multiplication to produce the correct output dimensions.
Definition at line 498 of file ConvertConv2DToImg2Col.cpp.
References mlir::bindDims(), mlir::OpBuilder::create(), createAdd(), createMul(), mlir::AffineMap::get(), mlir::get(), mlir::Builder::getContext(), getConvolvedIndex(), mlir::AffineMap::getMultiDimIdentityMap(), mlir::Value::getType(), mlir::getType(), hasAllOneValues(), mlir::RewriterBase::notifyMatchFailure(), mlir::RewriterBase::replaceOp(), and unrollIndex().
FailureOr< std::pair< Operation *, Operation * > > mlir::linalg::rewriteInIm2Col | ( | RewriterBase & | rewriter, |
linalg::Conv2DNhwcHwcfOp | convOp | ||
) |
Convert linalg.conv_2d_nhwc_hwcf into linalg.generic (for img2col packing) and linalg.matmul.
A convolution operation can be written as a matrix-matrix multiplication by unfolding the cross-correlation between input and filter and explicitly copying overlapped sliding window inputs.
Consider 2D input X with single channel input and output and 2x2 filter W:
[x(0, 0)  , x(0, 1)  , ..., x(0, n)    ]
[x(1, 0)  , x(1, 1)  , ..., x(1, n)    ]
[.        , .        , .  , .          ]           [w(0, 0), w(0, 1)]
[.        , .        , .  , .          ]   (conv)  [w(1, 0), w(1, 1)]
[.        , .        , .  , .          ]
[x(n-1, 0), x(n-1, 1), ..., x(n-1, n-1)]
The packed input data (img2col) is a matrix with |rows| = output spatial size, |columns| = filter spatial size. To compute the output Y(i, j) we need to calculate the dot product between the filter window at input X(x, y) and the filter, which looks like the following, where the left-hand side is the img2col matrix and the right-hand side is the flattened filter:
[x(0,0), x(0,1), x(1,0), x(1,1)]
[x(0,1), x(1,1), x(0,2), x(1,2)] (matmul) [w(0,0), w(0,1), w(1,0), w(1,1)]
[x(0,1), x(1,1), x(0,2), x(1,2)]
[   .  ,    .  ,    .  ,    .  ]
In general, for the 2D case with (N, H, W, C) input, (Kh, Kw, C, D) filter, and (N, Ho, Wo, D) output, the convolution is the matrix-matrix multiplication (Ho x Wo, Kh x Kw x C) * (Kh x Kw x C, D) for each of the N inputs. For the case where N > 1 it is a batched matrix-matrix multiplication.
On success, return both the operation that produces the img2col tensor and the final operation of the sequence that replaces the original convolution.
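The matrix formulation described above can be checked on a small case. The following is a minimal standalone C++ sketch, not the MLIR implementation: it assumes a single batch, a single input and output channel, unit strides, and no padding, and the helper names convDirect, im2col, and convViaIm2Col are hypothetical.

```cpp
#include <cassert>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Direct "valid" cross-correlation of a single-channel image x with filter w.
Matrix convDirect(const Matrix &x, const Matrix &w) {
  const int kh = static_cast<int>(w.size());
  const int kw = static_cast<int>(w[0].size());
  const int ho = static_cast<int>(x.size()) - kh + 1;
  const int wo = static_cast<int>(x[0].size()) - kw + 1;
  assert(ho > 0 && wo > 0 && "filter must fit inside the input");
  Matrix y(ho, std::vector<int>(wo, 0));
  for (int i = 0; i < ho; ++i)
    for (int j = 0; j < wo; ++j)
      for (int a = 0; a < kh; ++a)
        for (int b = 0; b < kw; ++b)
          y[i][j] += x[i + a][j + b] * w[a][b];
  return y;
}

// Pack every sliding window into one row: (Ho * Wo) rows, (Kh * Kw) columns.
Matrix im2col(const Matrix &x, int kh, int kw) {
  const int ho = static_cast<int>(x.size()) - kh + 1;
  const int wo = static_cast<int>(x[0].size()) - kw + 1;
  Matrix cols;
  for (int i = 0; i < ho; ++i)
    for (int j = 0; j < wo; ++j) {
      std::vector<int> row;
      for (int a = 0; a < kh; ++a)
        for (int b = 0; b < kw; ++b)
          row.push_back(x[i + a][j + b]);
      cols.push_back(row);
    }
  return cols;
}

// Multiply the packed matrix by the flattened filter, then view the
// resulting (Ho * Wo) vector as an Ho x Wo output.
Matrix convViaIm2Col(const Matrix &x, const Matrix &w) {
  const int kh = static_cast<int>(w.size());
  const int kw = static_cast<int>(w[0].size());
  const int wo = static_cast<int>(x[0].size()) - kw + 1;
  std::vector<int> flatW;
  for (const auto &row : w)
    flatW.insert(flatW.end(), row.begin(), row.end());
  const Matrix cols = im2col(x, kh, kw);
  const int rows = static_cast<int>(cols.size());
  Matrix y(rows / wo, std::vector<int>(wo, 0));
  for (int r = 0; r < rows; ++r)
    for (int c = 0; c < kh * kw; ++c)
      y[r / wo][r % wo] += cols[r][c] * flatW[c];
  return y;
}
```

For a 3x3 input and a 2x2 filter, im2col yields a 4x4 packed matrix, and convViaIm2Col is expected to agree with convDirect element by element.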
Definition at line 79 of file ConvertConv2DToImg2Col.cpp.
References mlir::bindDims(), mlir::OpBuilder::create(), createAdd(), createMul(), mlir::AffineMap::get(), mlir::get(), mlir::Builder::getContext(), getConvolvedIndex(), mlir::AffineMap::getMultiDimIdentityMap(), mlir::Value::getType(), mlir::getType(), hasAllOneValues(), mlir::RewriterBase::notifyMatchFailure(), mlir::RewriterBase::replaceOp(), and unrollIndex().
FailureOr< std::pair< Operation *, Operation * > > mlir::linalg::rewriteInIm2Col | ( | RewriterBase & | rewriter, |
linalg::DepthwiseConv2DNhwcHwcOp | convOp | ||
) |
Similar to rewriteInIm2Col with linalg::Conv2DNhwcHwcfOp, except there is no reduction among the input channels, so each convolution can be a matrix-vector product, and by transposing both input and filter so that channels are outermost, the computation is a batched matrix-vector product.
Definition at line 214 of file ConvertConv2DToImg2Col.cpp.
References mlir::bindDims(), mlir::OpBuilder::create(), mlir::AffineMap::get(), mlir::get(), mlir::Builder::getAffineConstantExpr(), mlir::Builder::getAffineDimExpr(), mlir::Builder::getContext(), mlir::AffineMap::getMultiDimIdentityMap(), mlir::Operation::getResult(), mlir::Value::getType(), hasAllOneValues(), mlir::inversePermutation(), mlir::RewriterBase::notifyMatchFailure(), and mlir::RewriterBase::replaceOp().
FailureOr< LinalgOp > mlir::linalg::specializeGenericOp | ( | RewriterBase & | rewriter, |
GenericOp | genericOp | ||
) |
Create a namedOp from the given GenericOp and replace the GenericOp.
Currently we can specialize only trivial linalg copy operations.
Definition at line 260 of file Specialize.cpp.
References isaCopyOpInterface(), isaElemwiseSingleUnaryOpInterface(), isaFillOpInterface(), and mlir::RewriterBase::replaceOpWithNewOp().
std::pair< TilingInterface, TilingInterface > mlir::linalg::splitOp | ( | RewriterBase & | rewriter, |
TilingInterface | op, | ||
unsigned | dimension, | ||
OpFoldResult | splitPoint | ||
) |
Split the given op
into two parts along the given iteration space dimension
at the specified splitPoint
, and return the two parts.
If the second part is statically known to be empty, do not create it and return nullptr instead. Error state is signalled by returning a pair of nullptrs.
For example, the following op:
linalg.matmul ins(%0, %1 : tensor<128x32xf32>, tensor<32x64xf32>) outs(%2 : tensor<128x64xf32>)
split along the first dimension at position 42 will result in:
%3 = tensor.extract_slice %0[0, 0][42, 32][1, 1]
%4 = tensor.extract_slice %2[0, 0][42, 64][1, 1]
%5 = linalg.matmul ins(%3, %1 : tensor<42x32xf32>, tensor<32x64xf32>)
                   outs(%4 : tensor<42x64xf32>)
%6 = tensor.insert_slice %5 into %2[0, 0][42, 64][1, 1]

%7 = tensor.extract_slice %0[42, 0][86, 32][1, 1]
%8 = tensor.extract_slice %6[42, 0][86, 64][1, 1]
%9 = linalg.matmul ins(%7, %1 : tensor<86x32xf32>, tensor<32x64xf32>)
                   outs(%8 : tensor<86x64xf32>)
tensor.insert_slice %9 into %6[42, 0][86, 64][1, 1]
Note that there is no simplification other than constant propagation applied to slice extraction and insertion.
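The offset and size arithmetic behind the example above can be sketched in standalone C++. This is only an illustration under stated assumptions: Slice and splitExtent are hypothetical names, the split point is assumed to be a static non-negative integer, and clamping it to the extent is an assumption of this sketch rather than a documented guarantee.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>

// Half-open slice [offset, offset + size) along the split dimension.
struct Slice {
  int64_t offset;
  int64_t size;
};

// Hypothetical helper: computes the two slices produced by splitting a
// dimension of the given extent at splitPoint. The second slice is omitted
// when it would be empty, mirroring the "return nullptr" case in the text.
std::pair<Slice, std::optional<Slice>> splitExtent(int64_t extent,
                                                   int64_t splitPoint) {
  assert(extent >= 0 && splitPoint >= 0);
  const int64_t firstSize = std::min(splitPoint, extent);
  const Slice first{0, firstSize};
  if (firstSize == extent)
    return {first, std::nullopt};
  return {first, Slice{firstSize, extent - firstSize}};
}
```

With extent 128 and split point 42, this gives offset 0 with size 42 and offset 42 with size 86, matching the extract_slice sizes in the example.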
FailureOr< SplitReductionResult > mlir::linalg::splitReduction | ( | RewriterBase & | b, |
LinalgOp | op, | ||
const ControlSplitReductionFn & | controlSplitReductionFn, | ||
bool | useAlloc = false |
||
) |
Definition at line 30 of file SplitReduction.cpp.
FailureOr< SplitReductionResult > mlir::linalg::splitReductionByScaling | ( | RewriterBase & | b, |
LinalgOp | op, | ||
const ControlSplitReductionFn & | controlSplitReductionFn, | ||
bool | useAlloc = false |
||
) |
Scaling-based implementation of the split reduction transformation.
Core rewrite implementation.
Instead of introducing an ExpandShapeOp, this rewrites a reduction dimension k
into k * scale + kk
.
Example:
```
%0 = linalg.matmul ins(%A, %B: tensor<16x256xf32>, tensor<256x32xf32>)
  outs(%C: tensor<16x32xf32>) -> tensor<16x32xf32>
```
Is transformed to:
```
#map0 = affine_map<(d0, d1, d2, d3) -> (d0, d2 * 4 + d3)>
#map1 = affine_map<(d0, d1, d2, d3) -> (d2 * 4 + d3, d1)>
#map2 = affine_map<(d0, d1, d2, d3) -> (d2, d3)>
#map3 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2)>
#map4 = affine_map<(d0, d1, d2) -> (d0, d1, d2)>
#map5 = affine_map<(d0, d1, d2) -> (d0, d1)>
%0 = tensor.empty [16, 32, 64] : tensor<16x32x64xf32>
%cst = arith.constant 0.000000e+00 : f32
%1 = linalg.fill ins(%cst : f32) outs(%0 : tensor<16x32x64xf32>)
  -> tensor<16x32x64xf32>
%2 = tensor.empty [64, 4] : tensor<64x4xi1>

%3 = linalg.generic {indexing_maps = [#map0, #map1, #map2, #map3],
  iterator_types = ["parallel", "parallel", "parallel", "reduction"]}
  ins(%A, %B, %2 : tensor<16x256xf32>, tensor<256x32xf32>, tensor<64x4xi1>)
  outs(%1 : tensor<16x32x64xf32>) {
    ^bb0(%arg3: f32, %arg4: f32, %arg5: i1, %arg6: f32):
      %5 = arith.mulf %arg3, %arg4 : f32
      %6 = arith.addf %arg6, %5 : f32
      linalg.yield %6 : f32
} -> tensor<16x32x64xf32>

%4 = linalg.generic {indexing_maps = [#map4, #map5],
  iterator_types = ["parallel", "parallel", "reduction"]}
  ins(%3 : tensor<16x32x64xf32>)
  outs(%C : tensor<16x32xf32>) {
    ^bb0(%arg3: f32, %arg4: f32):
      %5 = arith.addf %arg3, %arg4 : f32
      linalg.yield %5 : f32
} -> tensor<16x32xf32>

return %4 : tensor<16x32xf32>
```
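The index rewriting k -> outer * scale + kk can be illustrated on a plain dot product. The following standalone C++ sketch is not the MLIR rewrite itself; dotDirect and dotSplitByScaling are hypothetical names, and the sketch assumes that the reduction size is divisible by scale.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference: reduce all K products in a single loop.
int64_t dotDirect(const std::vector<int64_t> &a,
                  const std::vector<int64_t> &b) {
  assert(a.size() == b.size());
  int64_t acc = 0;
  for (std::size_t k = 0; k < a.size(); ++k)
    acc += a[k] * b[k];
  return acc;
}

// Split form: the reduction index k is rewritten as outer * scale + kk.
// The outer index is independent across iterations (the new parallel
// dimension); kk stays a reduction. A second reduction merges the partials.
int64_t dotSplitByScaling(const std::vector<int64_t> &a,
                          const std::vector<int64_t> &b, int64_t scale) {
  assert(a.size() == b.size());
  const int64_t size = static_cast<int64_t>(a.size());
  assert(scale > 0 && size % scale == 0);
  const int64_t outerSize = size / scale;
  std::vector<int64_t> partial(outerSize, 0);
  for (int64_t outer = 0; outer < outerSize; ++outer)
    for (int64_t kk = 0; kk < scale; ++kk) {
      const int64_t k = outer * scale + kk;
      partial[outer] += a[k] * b[k];
    }
  int64_t acc = 0;
  for (int64_t p : partial)
    acc += p;
  return acc;
}
```

In the matmul example above, the reduction size is 256 and the scale is 4, so the outer dimension has extent 64, which corresponds to the trailing dimension of tensor<16x32x64xf32>.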
Definition at line 241 of file SplitReduction.cpp.
|
static |
Definition at line 219 of file MeshShardingInterfaceImpl.cpp.
FailureOr< TiledLinalgOp > mlir::linalg::tileLinalgOp | ( | RewriterBase & | b, |
LinalgOp | op, | ||
const LinalgTilingOptions & | options | ||
) |
Definition at line 824 of file Tiling.cpp.
FailureOr< linalg::ForallReductionTilingResult > mlir::linalg::tileReductionUsingForall | ( | RewriterBase & | b, |
PartialReductionOpInterface | op, | ||
ArrayRef< OpFoldResult > | numThreads, | ||
ArrayRef< OpFoldResult > | tileSizes = {} , |
||
std::optional< ArrayAttr > | mapping = std::nullopt |
||
) |
Method to tile a reduction to parallel iterations computing partial reductions.
After the loop, all the partial reductions are merged into a final reduction. For example for the following sequence
into:
Definition at line 596 of file Tiling.cpp.
void mlir::linalg::transformIndexOps | ( | RewriterBase & | b, |
LinalgOp | op, | ||
SmallVectorImpl< Value > & | ivs, | ||
const LoopIndexToRangeIndexMap & | loopIndexToRangeIndex | ||
) |
All indices returned by IndexOp should be invariant with respect to tiling.
Therefore, if an operation is tiled, we have to transform the indices accordingly, i.e. offset them by the values of the corresponding induction variables that are captured implicitly in the body of the op.
Example. linalg.generic before tiling:
#id_2d = (i, j) -> (i, j)
#pointwise_2d_trait = {
  indexing_maps = [#id_2d, #id_2d],
  iterator_types = ["parallel", "parallel"]
}
linalg.generic #pointwise_2d_trait %operand, %result {
  ^bb0(%operand_in: f32, %result_in: f32):
    %i = linalg.index 0 : index
    %j = linalg.index 1 : index
    <some operations that use %i, %j>
}: memref<50x100xf32>, memref<50x100xf32>
After tiling pass with tile sizes 10 and 25:
#strided = (i, j)[s0, s1, s2] -> (i * s1 + s0 + j * s2)
%c1 = arith.constant 1 : index
%c0 = arith.constant 0 : index
%c25 = arith.constant 25 : index
%c10 = arith.constant 10 : index
%operand_dim_0 = dim %operand, 0 : memref<50x100xf32>
%operand_dim_1 = dim %operand, 1 : memref<50x100xf32>
scf.for %k = %c0 to %operand_dim_0 step %c10 {
  scf.for %l = %c0 to %operand_dim_1 step %c25 {
    %4 = memref.subview %operand[%k, %l][%c10, %c25][%c1, %c1]
      : memref<50x100xf32> to memref<?x?xf32, #strided>
    %5 = memref.subview %result[%k, %l][%c10, %c25][%c1, %c1]
      : memref<50x100xf32> to memref<?x?xf32, #strided>
    linalg.generic pointwise_2d_trait %4, %5 {
    ^bb0(%operand_in: f32, %result_in: f32):
      %i = linalg.index 0 : index
      %j = linalg.index 1 : index
      // Indices k and l are implicitly captured in the body.
      %transformed_i = arith.addi %i, %k : index // index i is offset by %k
      %transformed_j = arith.addi %j, %l : index // index j is offset by %l
      // Every use of %i, %j is replaced with %transformed_i, %transformed_j
      <some operations that use %transformed_i, %transformed_j>
    }: memref<?x?xf32, #strided>, memref<?x?xf32, #strided>
  }
}
TODO: Investigate whether mixing implicit and explicit indices does not lead to losing information.
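The offsetting described above can be checked with a standalone C++ sketch, assuming plain integer loops in place of scf.for and linalg.index. The function name countVisits is hypothetical. If local indices are offset by the enclosing induction variables, every point of the original iteration space should be produced exactly once.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Walk a rows x cols iteration space tile by tile and count how often each
// global point is produced as "tile-local index + tile offset".
std::vector<int> countVisits(int rows, int cols, int tileRows, int tileCols) {
  assert(rows > 0 && cols > 0 && tileRows > 0 && tileCols > 0);
  std::vector<int> visits(rows * cols, 0);
  for (int k = 0; k < rows; k += tileRows) {    // plays the role of %k
    for (int l = 0; l < cols; l += tileCols) {  // plays the role of %l
      // Clip the tile at the boundary of the iteration space.
      const int tr = std::min(tileRows, rows - k);
      const int tc = std::min(tileCols, cols - l);
      for (int i = 0; i < tr; ++i) {
        for (int j = 0; j < tc; ++j) {
          const int transformedI = i + k;  // index i is offset by k
          const int transformedJ = j + l;  // index j is offset by l
          ++visits[transformedI * cols + transformedJ];
        }
      }
    }
  }
  return visits;
}
```

For the 50x100 example with tile sizes 10 and 25, all 5000 entries of the returned vector should be 1.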
Definition at line 78 of file Tiling.cpp.
FailureOr< Operation * > mlir::linalg::transposeBatchMatmul | ( | RewriterBase & | rewriter, |
linalg::BatchMatmulOp | batchMatmulOp, | ||
bool | transposeLHS = true |
||
) |
Pattern to replace.
linalg.batch_matmul(a, b)
with
linalg.batch_matmul_transpose_a(linalg.transpose(a), b)
Only the non-batch dimensions are transposed. By default the LHS is transposed. Set transposeLHS=false
to transpose RHS instead.
Definition at line 81 of file TransposeMatmul.cpp.
References mlir::OpBuilder::create(), mlir::Value::getType(), mlir::bufferization::hasTensorSemantics(), mlir::RewriterBase::notifyMatchFailure(), and mlir::RewriterBase::replaceOp().
FailureOr< Operation * > mlir::linalg::transposeConv2D | ( | RewriterBase & | rewriter, |
linalg::Conv2DNhwcFhwcOp | op | ||
) |
Convert linalg.conv_2d_nhwc_fhwc(_q) to linalg.conv_2d_nhwc_hwcf(_q) by materializing transpose.
Definition at line 127 of file TransposeConv2D.cpp.
FailureOr< Operation * > mlir::linalg::transposeConv2D | ( | RewriterBase & | rewriter, |
linalg::Conv2DNhwcFhwcQOp | op | ||
) |
Definition at line 134 of file TransposeConv2D.cpp.
FailureOr< Operation * > mlir::linalg::transposeMatmul | ( | RewriterBase & | rewriter, |
linalg::MatmulOp | matmulOp, | ||
bool | transposeLHS = true |
||
) |
Convert Linalg matmul ops to transposed variants.
Pattern to replace.
linalg.matmul(a, b)
with
linalg.matmul_transpose_a(linalg.transpose(a), b)
By default the LHS is transposed. Set transposeLHS=false
to transpose RHS instead.
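The equivalence that this rewrite relies on can be checked numerically. The following standalone C++ sketch uses hypothetical helpers transpose, matmul, and matmulTransposeA on small integer matrices; it illustrates only the arithmetic identity, not the MLIR ops.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Returns the transpose of a non-empty rectangular matrix.
Matrix transpose(const Matrix &m) {
  Matrix t(m[0].size(), std::vector<int>(m.size(), 0));
  for (std::size_t i = 0; i < m.size(); ++i)
    for (std::size_t j = 0; j < m[0].size(); ++j)
      t[j][i] = m[i][j];
  return t;
}

// C[i][j] = sum_k a[i][k] * b[k][j]
Matrix matmul(const Matrix &a, const Matrix &b) {
  assert(a[0].size() == b.size());
  Matrix c(a.size(), std::vector<int>(b[0].size(), 0));
  for (std::size_t i = 0; i < a.size(); ++i)
    for (std::size_t j = 0; j < b[0].size(); ++j)
      for (std::size_t k = 0; k < b.size(); ++k)
        c[i][j] += a[i][k] * b[k][j];
  return c;
}

// Same contraction, but the first operand is read in transposed layout:
// C[i][j] = sum_k at[k][i] * b[k][j]
Matrix matmulTransposeA(const Matrix &at, const Matrix &b) {
  assert(at.size() == b.size());
  Matrix c(at[0].size(), std::vector<int>(b[0].size(), 0));
  for (std::size_t i = 0; i < at[0].size(); ++i)
    for (std::size_t j = 0; j < b[0].size(); ++j)
      for (std::size_t k = 0; k < b.size(); ++k)
        c[i][j] += at[k][i] * b[k][j];
  return c;
}
```

For any compatible a and b, matmulTransposeA(transpose(a), b) is expected to equal matmul(a, b).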
Definition at line 31 of file TransposeMatmul.cpp.
References mlir::OpBuilder::create(), mlir::Value::getType(), mlir::bufferization::hasTensorSemantics(), mlir::RewriterBase::notifyMatchFailure(), and mlir::RewriterBase::replaceOp().
|
static |
Definition at line 55 of file ConvertConv2DToImg2Col.cpp.
References mlir::OpBuilder::create(), mlir::affine::delinearizeIndex(), and mlir::Builder::getIndexAttr().
Referenced by rewriteInIm2Col().
void mlir::linalg::updateBoundsForCyclicDistribution | ( | OpBuilder & | builder, |
Location | loc, | ||
Value | procId, | ||
Value | nprocs, | ||
Value & | lb, | ||
Value & | ub, | ||
Value & | step | ||
) |
Update the lb, ub and step to get per-processor lb, ub and step.
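A standalone C++ sketch of cyclic distribution follows. It assumes the usual cyclic scheme, in which processor procId starts at lb + procId * step, advances by step * nprocs, and keeps ub unchanged. The names LoopBounds, cyclicBounds, and iterations are hypothetical, and the sketch works on plain integers rather than on Values.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct LoopBounds {
  int64_t lb;
  int64_t ub;
  int64_t step;
};

// Assumed cyclic scheme: shift the lower bound by procId steps and stride
// by nprocs steps; the upper bound is left unchanged.
LoopBounds cyclicBounds(const LoopBounds &b, int64_t procId, int64_t nprocs) {
  assert(b.step > 0 && nprocs > 0 && procId >= 0 && procId < nprocs);
  return LoopBounds{b.lb + procId * b.step, b.ub, b.step * nprocs};
}

// Enumerates the induction variable values of a loop with the given bounds.
std::vector<int64_t> iterations(const LoopBounds &b) {
  std::vector<int64_t> ivs;
  for (int64_t iv = b.lb; iv < b.ub; iv += b.step)
    ivs.push_back(iv);
  return ivs;
}
```

With bounds [0, 10) and step 1 distributed over 3 processors, processor 0 visits 0, 3, 6, 9, processor 1 visits 1, 4, 7, and processor 2 visits 2, 5, 8, so the iterations are partitioned without overlap.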
Definition at line 386 of file Utils.cpp.
References mlir::bindDims(), mlir::getAffineSymbolExpr(), mlir::Builder::getContext(), and mlir::affine::makeComposedAffineApply().
Referenced by mlir::linalg::GenerateLoopNest< LoopTy >::doit().
LogicalResult mlir::linalg::vectorize | ( | RewriterBase & | rewriter, |
Operation * | op, | ||
ArrayRef< int64_t > | inputVectorSizes = {} , |
||
ArrayRef< bool > | inputScalableVecDims = {} , |
||
bool | vectorizeNDExtract = false , |
||
bool | flatten1DDepthwiseConv = false |
||
) |
Emit a suitable vector form for an operation.
If provided, inputVectorSizes are used to vectorize this operation. inputVectorSizes must match the rank of the iteration space of the operation and the input vector sizes must be greater than or equal to their counterpart iteration space sizes, if static. inputVectorSizes also allows the vectorization of operations with dynamic shapes.
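The constraint on inputVectorSizes can be restated as a small standalone check. This is a sketch of the rule as described here, not the precondition code in Vectorization.cpp; the function name and the kDynamicSize sentinel are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sentinel marking a dynamic iteration-space dimension in this
// sketch. MLIR uses its own constant for dynamic sizes.
constexpr int64_t kDynamicSize = -1;

// Returns true if `vectorSizes` is acceptable for `iterationSpace` under the
// documented rule: either no sizes are provided, or the ranks match and every
// vector size is at least the corresponding static iteration-space size.
// Dynamic dimensions place no bound on the vector size.
bool areValidInputVectorSizes(const std::vector<int64_t> &iterationSpace,
                              const std::vector<int64_t> &vectorSizes) {
  if (vectorSizes.empty())
    return true; // Sizes are optional.
  if (vectorSizes.size() != iterationSpace.size())
    return false; // Rank mismatch.
  for (std::size_t i = 0; i < vectorSizes.size(); ++i) {
    if (iterationSpace[i] == kDynamicSize)
      continue; // Dynamic dimension: nothing to compare against.
    if (vectorSizes[i] < iterationSpace[i])
      return false; // Vector would not cover the static dimension.
  }
  return true;
}
```

For example, sizes {8, 8} are accepted for a static 4 x 8 iteration space, while {2, 8} are rejected because 2 is smaller than 4.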
Definition at line 2103 of file Vectorization.cpp.
LogicalResult mlir::linalg::vectorizeCopy | ( | RewriterBase & | builder, |
memref::CopyOp | copyOp | ||
) |
Emit a suitable vector form for a Copy op with fully static shape.
Definition at line 2193 of file Vectorization.cpp.
References mlir::OpBuilder::create(), mlir::get(), mlir::getElementTypeOrSelf(), mlir::Builder::getMultiDimIdentityMap(), mlir::Operation::getResults(), mlir::sparse_tensor::detail::readValue(), and mlir::RewriterBase::replaceOp().
Referenced by mlir::linalg::CopyVectorizationPattern::matchAndRewrite().
LogicalResult mlir::linalg::vectorizeOpPrecondition | ( | Operation * | op, |
ArrayRef< int64_t > | inputVectorSizes = {} , |
||
ArrayRef< bool > | inputScalableVecDims = {} , |
||
bool | vectorizeNDExtract = false , |
||
bool | flatten1DDepthwiseConv = false |
||
) |
Return success if the operation can be vectorized.
Definition at line 2056 of file Vectorization.cpp.
FailureOr< Operation * > mlir::linalg::winogradConv2D | ( | RewriterBase & | rewriter, |
linalg::Conv2DNhwcFhwcOp | op, | ||
int64_t | m, | ||
int64_t | r | ||
) |
Convert linalg.conv_2d_nhwc_fhwc to the Winograd Conv2D algorithm F(m x m, r x r).
m is the dimension size of the output and r is the dimension size of the filter.
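To make the F(m x m, r x r) notation concrete, the sketch below computes the standard Winograd tile quantities: each m x m output tile is produced from an (m + r - 1) x (m + r - 1) input tile, using one element-wise multiplication per input-tile element instead of m * m * r * r multiplications for direct convolution. The struct and function names are hypothetical and the counts are per channel pair; this is arithmetic about the algorithm, not code from WinogradConv2D.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of one Winograd tile for F(m x m, r x r).
struct WinogradTileInfo {
  int64_t inputTileSize; // Side length of the input tile: m + r - 1.
  int64_t winogradMuls;  // Element-wise multiplications per output tile.
  int64_t directMuls;    // Multiplications for direct convolution.
};

// Computes the tile quantities for output size m and filter size r.
WinogradTileInfo winogradTileInfo(int64_t m, int64_t r) {
  assert(m > 0 && r > 0);
  const int64_t alpha = m + r - 1;
  return {alpha, alpha * alpha, m * m * r * r};
}
```

For F(2 x 2, 3 x 3) this gives a 4 x 4 input tile with 16 multiplications versus 36 for the direct method.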
Definition at line 1188 of file WinogradConv2D.cpp.