MLIR 22.0.0git
HopperBuilder Struct Reference

Helper to create the base Hopper-specific operations that are reused in various other places. More...

Inheritance diagram for HopperBuilder:

Public Member Functions

 HopperBuilder (RewriterBase &rewriter, Location loc)
TypedValue< MBarrierGroupType > buildAndInitBarrierInSharedMemory (OpFoldResult numThreads)
TypedValue< TensorMapDescriptorType > buildGlobalMemRefDescriptor (TypedValue< MemRefType > memref, gpu::LaunchOp launchOp)
 Create tma descriptor op to initiate transfer from global to shared memory.
OpFoldResult buildTmaAsyncLoad (TypedValue< TensorMapDescriptorType > globalDesc, TypedValue< MemRefType > sharedMemref, TypedValue< MBarrierGroupType > barrier, SmallVectorImpl< Operation * > &loadOps)
 Build a tma load from global memory to shared memory using barrier to synchronize.
void buildBarrierArriveTx (TypedValue< MBarrierGroupType > barrier, ArrayRef< OpFoldResult > sizes)
SmallVector< Operation * > buildPredicateLoadsOnThread0 (ArrayRef< TypedValue< TensorMapDescriptorType > > globalDescriptors, ArrayRef< TypedValue< MemRefType > > sharedMemBuffers, TypedValue< MBarrierGroupType > barrier)
 If threadIdx.x == 0 does TMA request + wait, else just wait.
void buildTryWaitParity (TypedValue< MBarrierGroupType > barrier)

Public Attributes

RewriterBaserewriter
Location loc

Detailed Description

Helper to create the base Hopper-specific operations that are reused in various other places.

Definition at line 836 of file NVGPUTransformOps.cpp.

Constructor & Destructor Documentation

◆ HopperBuilder()

HopperBuilder::HopperBuilder ( RewriterBase & rewriter,
Location loc )
inline

Definition at line 837 of file NVGPUTransformOps.cpp.

References loc, and rewriter.

Referenced by CopyBuilder::CopyBuilder().

Member Function Documentation

◆ buildAndInitBarrierInSharedMemory()

TypedValue< MBarrierGroupType > HopperBuilder::buildAndInitBarrierInSharedMemory ( OpFoldResult numThreads)

◆ buildBarrierArriveTx()

◆ buildGlobalMemRefDescriptor()

TypedValue< TensorMapDescriptorType > HopperBuilder::buildGlobalMemRefDescriptor ( TypedValue< MemRefType > memref,
gpu::LaunchOp launchOp )

Create tma descriptor op to initiate transfer from global to shared memory.

This must be done before the launch op, on the host.

Definition at line 932 of file NVGPUTransformOps.cpp.

References mlir::memref::getMixedSizes(), getSharedAddressSpaceAttribute(), mlir::getValueOrCreateConstantIndexOp(), loc, rewriter, and mlir::MemRefType::Builder::setMemorySpace().

Referenced by CopyBuilder::rewrite().

◆ buildPredicateLoadsOnThread0()

SmallVector< Operation * > HopperBuilder::buildPredicateLoadsOnThread0 ( ArrayRef< TypedValue< TensorMapDescriptorType > > globalDescriptors,
ArrayRef< TypedValue< MemRefType > > sharedMemBuffers,
TypedValue< MBarrierGroupType > barrier )

If threadIdx.x == 0 does TMA request + wait, else just wait.

Return the operation that performs the transfer on thread0.

Definition at line 872 of file NVGPUTransformOps.cpp.

References buildBarrierArriveTx(), buildTmaAsyncLoad(), mlir::arith::ConstantIndexOp::create(), mlir::getAsIndexOpFoldResult(), loc, and rewriter.

Referenced by CopyBuilder::rewrite().

◆ buildTmaAsyncLoad()

OpFoldResult HopperBuilder::buildTmaAsyncLoad ( TypedValue< TensorMapDescriptorType > globalDesc,
TypedValue< MemRefType > sharedMemref,
TypedValue< MBarrierGroupType > barrier,
SmallVectorImpl< Operation * > & loadOps )

Build a tma load from global memory to shared memory using barrier to synchronize.

Return the number of bytes that will be transferred.

Definition at line 961 of file NVGPUTransformOps.cpp.

References mlir::bindSymbolsList(), mlir::computeProduct(), mlir::arith::ConstantIndexOp::create(), mlir::memref::getMixedSizes(), loc, mlir::affine::makeComposedFoldedAffineApply(), and rewriter.

Referenced by buildPredicateLoadsOnThread0().

◆ buildTryWaitParity()

void HopperBuilder::buildTryWaitParity ( TypedValue< MBarrierGroupType > barrier)

Definition at line 997 of file NVGPUTransformOps.cpp.

References mlir::arith::ConstantIndexOp::create(), loc, and rewriter.

Referenced by CopyBuilder::rewrite().

Member Data Documentation

◆ loc

◆ rewriter


The documentation for this struct was generated from the following file: