MLIR
20.0.0git
#include "mlir/Dialect/NVGPU/Utils/MMAUtils.h"
#include "mlir/Dialect/Affine/IR/AffineOps.h"
#include "mlir/Dialect/Arith/IR/Arith.h"
#include "mlir/Dialect/LLVMIR/NVVMDialect.h"
#include "mlir/Dialect/NVGPU/IR/NVGPUDialect.h"
#include "mlir/Dialect/Vector/IR/VectorOps.h"
Functions
static bool | isAccumulatorOrResult (MatMulOperandRole operandType)
static int64_t | inferNumRegistersPerMatrixFragment (const WarpMatrixInfo &type)
    Returns the number of registers which compose a matrix fragment held by a single thread.
static std::array< int64_t, 2 > | getTileShape (ArrayRef< int64_t > operandShape, Type elementType, int64_t lineSizeBits)
    Returns the number of 8 x [128|256|512] bit tiles that compose the given operand shape.
static AffineMap | getRegisterIndexToTileOffsetMap (int64_t lineSize, Type elementType, ArrayRef< int64_t > operandShape, bool isAccumulator, int64_t elementsPerRegister, AffineExpr logicalValueId)
Variables
static constexpr int64_t | kThreadsPerRow = 4
    There are always 4 threads per [128|256|512] bit row.
static constexpr int64_t | kNumRowsPerTile = 8
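
static AffineMap getRegisterIndexToTileOffsetMap (int64_t lineSize, Type elementType, ArrayRef< int64_t > operandShape, bool isAccumulator, int64_t elementsPerRegister, AffineExpr logicalValueId)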
Definition at line 154 of file MMAUtils.cpp.
References mlir::AffineExpr::floorDiv(), mlir::AffineMap::get(), mlir::Type::getContext(), mlir::Type::getIntOrFloatBitWidth(), and getTileShape().
Referenced by mlir::nvgpu::getLaneIdAndValueIdToOperandCoord().
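
static std::array< int64_t, 2 > getTileShape (ArrayRef< int64_t > operandShape, Type elementType, int64_t lineSizeBits)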
Returns the number of 8 x [128|256|512] bit tiles that compose the given operand shape.
Definition at line 39 of file MMAUtils.cpp.
References mlir::Type::getIntOrFloatBitWidth(), and kNumRowsPerTile.
Referenced by getRegisterIndexToTileOffsetMap().
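The tile count described above can be illustrated with plain integer arithmetic. The following is a minimal standalone sketch, not the MLIR implementation: `tileShapeSketch` is a hypothetical helper, it takes the element bit width directly instead of an `mlir::Type`, and it assumes that the column extent is measured in bits and that both divisions truncate.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Mirrors the documented constant kNumRowsPerTile = 8.
constexpr int64_t kNumRowsPerTile = 8;

// Hypothetical standalone helper (not the MLIR function): counts tiles of
// 8 rows x lineSizeBits bits along each dimension of a 2-D operand.
// Assumption: the column extent is converted to bits (columns * element
// width) before dividing by the line size, and both divisions truncate.
inline std::array<int64_t, 2>
tileShapeSketch(const std::array<int64_t, 2> &operandShape,
                int64_t elementBitWidth, int64_t lineSizeBits) {
  // The documentation only mentions 128, 256, and 512 bit lines.
  assert(lineSizeBits == 128 || lineSizeBits == 256 || lineSizeBits == 512);
  return {operandShape[0] / kNumRowsPerTile,
          (operandShape[1] * elementBitWidth) / lineSizeBits};
}
```

Under these assumptions, a 16 x 16 operand of 16-bit elements with 128-bit lines gives 16 / 8 = 2 tiles along the rows and (16 * 16) / 128 = 2 tiles along the columns.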

static int64_t inferNumRegistersPerMatrixFragment (const WarpMatrixInfo &type)
Returns the number of registers which compose a matrix fragment held by a single thread.
Definition at line 29 of file MMAUtils.cpp.
References mlir::nvgpu::inferTileWidthInBits(), kNumRowsPerTile, and mlir::nvgpu::WarpMatrixInfo::vectorType.
Referenced by mlir::nvgpu::getMmaSyncRegisterType().
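One way to read the register count is as the number of 8-row tiles covering the fragment, with each thread contributing one register per tile. The sketch below works under that assumption and is not the MLIR implementation: `numRegistersSketch` is a hypothetical helper, and the line size is passed in explicitly rather than derived through `mlir::nvgpu::inferTileWidthInBits()`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical standalone helper (not the MLIR function).
// Assumption: each thread holds one register per tile of
// 8 rows x lineSizeBits bits, so the register count equals the tile count.
inline int64_t numRegistersSketch(int64_t rows, int64_t cols,
                                  int64_t elementBitWidth,
                                  int64_t lineSizeBits) {
  constexpr int64_t rowsPerTile = 8; // documented value of kNumRowsPerTile
  assert(lineSizeBits == 128 || lineSizeBits == 256 || lineSizeBits == 512);
  const int64_t tilesAlongRows = rows / rowsPerTile;
  const int64_t tilesAlongCols = (cols * elementBitWidth) / lineSizeBits;
  return tilesAlongRows * tilesAlongCols;
}
```

For example, a 16 x 16 fragment of 16-bit elements with 128-bit lines gives 2 * 2 = 4 registers per thread under this model, and a 16 x 8 fragment of 32-bit elements with 256-bit lines gives 2 * 1 = 2.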

static bool isAccumulatorOrResult (MatMulOperandRole operandType)
Definition at line 23 of file MMAUtils.cpp.
References mlir::nvgpu::C.
Referenced by mlir::nvgpu::getLaneIdAndValueIdToOperandCoord(), mlir::nvgpu::getMmaSyncRegisterType(), and mlir::nvgpu::inferTileWidthInBits().

static constexpr int64_t kNumRowsPerTile = 8
Definition at line 21 of file MMAUtils.cpp.
Referenced by mlir::nvgpu::getLdMatrixParams(), getTileShape(), and inferNumRegistersPerMatrixFragment().

static constexpr int64_t kThreadsPerRow = 4
There are always 4 threads per [128|256|512] bit row.
Definition at line 20 of file MMAUtils.cpp.
Referenced by mlir::nvgpu::getLaneIdAndValueIdToOperandCoord().
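The arithmetic behind the two constants can be sketched as follows. This is an illustration only, not code from MMAUtils.cpp: the helper names are hypothetical, a 32-lane warp is assumed, and consecutive lane IDs are assumed to be grouped four to a row. With those assumptions, 32 / 4 = 8 rows per tile, which agrees with `kNumRowsPerTile`.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the documented constant kThreadsPerRow = 4.
constexpr int64_t kThreadsPerRow = 4;

// Hypothetical result type: the row of the tile a lane works on and the
// lane's position among the four threads sharing that row.
struct LanePositionSketch {
  int64_t row;
  int64_t threadInRow;
};

// Assumption: a warp has 32 lanes and lanes 4*r .. 4*r+3 share row r.
inline LanePositionSketch lanePositionSketch(int64_t laneId) {
  assert(laneId >= 0 && laneId < 32);
  return {laneId / kThreadsPerRow, laneId % kThreadsPerRow};
}

// Bits of one row handled by each of its four threads, for a given line size.
inline int64_t bitsPerThreadPerRowSketch(int64_t lineSizeBits) {
  assert(lineSizeBits == 128 || lineSizeBits == 256 || lineSizeBits == 512);
  return lineSizeBits / kThreadsPerRow;
}
```

Under this model a 128-bit row splits into four 32-bit pieces, one per thread, while 256-bit and 512-bit rows split into 64-bit and 128-bit pieces respectively.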