MLIR 22.0.0git
Helper to create the tma operations corresponding to linalg::CopyOp.
CopyBuilder inherits from HopperBuilder.

Public Member Functions
CopyBuilder (RewriterBase &rewriter, Location loc)
SmallVector< Operation * > rewrite (ArrayRef< Operation * > copyOps)

Public Member Functions inherited from HopperBuilder
HopperBuilder (RewriterBase &rewriter, Location loc)
TypedValue< MBarrierGroupType > buildAndInitBarrierInSharedMemory (OpFoldResult numThreads)
TypedValue< TensorMapDescriptorType > buildGlobalMemRefDescriptor (TypedValue< MemRefType > memref, gpu::LaunchOp launchOp)
    Create the TMA descriptor op to initiate a transfer from global to shared memory.
OpFoldResult buildTmaAsyncLoad (TypedValue< TensorMapDescriptorType > globalDesc, TypedValue< MemRefType > sharedMemref, TypedValue< MBarrierGroupType > barrier, SmallVectorImpl< Operation * > &loadOps)
    Build a TMA load from global memory to shared memory, using barrier to synchronize.
void buildBarrierArriveTx (TypedValue< MBarrierGroupType > barrier, ArrayRef< OpFoldResult > sizes)
SmallVector< Operation * > buildPredicateLoadsOnThread0 (ArrayRef< TypedValue< TensorMapDescriptorType >> globalDescriptors, ArrayRef< TypedValue< MemRefType >> sharedMemBuffers, TypedValue< MBarrierGroupType > barrier)
    If threadIdx.x == 0, issue the TMA request and wait; otherwise just wait.
void buildTryWaitParity (TypedValue< MBarrierGroupType > barrier)

Additional Inherited Members
Public Attributes inherited from HopperBuilder
RewriterBase & rewriter
Location loc
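The brief for buildPredicateLoadsOnThread0 describes a per-thread predicate: thread 0 issues the TMA requests and then waits, while every other thread only waits. The sketch below models that control flow in plain C++, without any MLIR dependency. The function name `actionsForThread` and the action strings are hypothetical and used only for illustration; the sketch does not reproduce the operations the builder actually emits.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model (not MLIR API): the ordered list of actions that a
// single thread performs under the "thread 0 requests, everyone waits" rule.
std::vector<std::string> actionsForThread(unsigned threadIdxX,
                                          std::size_t numLoads) {
  std::vector<std::string> actions;
  if (threadIdxX == 0) {
    // Only thread 0 issues the requests, one per (descriptor, buffer) pair.
    for (std::size_t i = 0; i < numLoads; ++i)
      actions.push_back("tma_request:" + std::to_string(i));
  }
  // Every thread, including thread 0, then waits.
  actions.push_back("wait");
  return actions;
}
```

Under this model, `actionsForThread(0, 2)` yields two request entries followed by a wait, while `actionsForThread(5, 2)` yields only the wait.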
Helper to create the tma operations corresponding to linalg::CopyOp.
Definition at line 1027 of file NVGPUTransformOps.cpp.
CopyBuilder::CopyBuilder (RewriterBase &rewriter, Location loc) [inline]
Definition at line 1028 of file NVGPUTransformOps.cpp.
SmallVector< Operation * > CopyBuilder::rewrite (ArrayRef< Operation * > copyOps)
Definition at line 1034 of file NVGPUTransformOps.cpp.
References mlir::bindSymbols(), mlir::computeProduct(), mlir::RewriterBase::eraseOp(), mlir::Builder::getContext(), mlir::affine::makeComposedFoldedAffineApply(), and mlir::OpBuilder::setInsertionPoint().
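The references to mlir::computeProduct() and mlir::affine::makeComposedFoldedAffineApply() suggest that rewrite builds a product expression and folds it to a constant when possible. This page does not say which values are multiplied, so the following is only a plain C++ analogue of "fold a product when every factor is a known constant". `MaybeConstant` and `foldProduct` are hypothetical names and are not part of MLIR.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a value that is either a compile-time constant
// (engaged optional) or only known at run time (std::nullopt).
using MaybeConstant = std::optional<std::int64_t>;

// Returns the product when every factor is a known constant. Returns
// std::nullopt as soon as one factor is dynamic, signalling that the
// multiplication would have to be materialized at run time instead.
MaybeConstant foldProduct(const std::vector<MaybeConstant> &factors) {
  std::int64_t product = 1;
  for (const MaybeConstant &factor : factors) {
    if (!factor)
      return std::nullopt;
    product *= *factor;
  }
  return product;
}
```

With all-constant factors the result is a single folded constant; a single dynamic factor makes the whole product dynamic, which is the distinction a folded affine apply draws between returning an attribute and emitting an operation.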