MLIR 23.0.0git
LowerGpuOpsToROCDLOps.cpp File Reference

Classes

class  mlir::impl::ConvertGpuOpsToROCDLOpsBase< DerivedT >

Namespaces

namespace  mlir
 Include the generated interface declarations.
namespace  mlir::impl
 Attribute collections provide a dictionary-like interface.

Macros

#define GEN_PASS_DEF_CONVERTGPUOPSTOROCDLOPS

Functions

std::unique_ptr<::mlir::Pass> mlir::impl::createConvertGpuOpsToROCDLOps ()
std::unique_ptr<::mlir::Pass> mlir::impl::createConvertGpuOpsToROCDLOps (ConvertGpuOpsToROCDLOpsOptions options)
std::unique_ptr<::mlir::Pass> mlir::createConvertGpuOpsToROCDLOps ()
std::unique_ptr<::mlir::Pass> mlir::createConvertGpuOpsToROCDLOps (ConvertGpuOpsToROCDLOpsOptions options)
static Value truncOrExtToLLVMType (ConversionPatternRewriter &rewriter, Location loc, Value value, const LLVMTypeConverter &converter)
static bool canBeCalledWithBarePointers (gpu::GPUFuncOp func)
 Returns true if the given gpu.func can be safely called using the bare pointer calling convention.
static Value getLaneId (RewriterBase &rewriter, Location loc)
static Value getKnownOrOcklDim (RewriterBase &rewriter, gpu::index_lowering::IndexKind indexKind, gpu::Dimension dim, Operation *contextOp, std::optional< uint32_t > opUpperBound)
 Emits a call to the OCKL block/grid size function corresponding to indexKind with argument dim. If the context around contextOp gives an exact size for that dimension, returns that size as an i64 constant instead.

Variables

static constexpr int64_t kMaxThreadsPerBlockDim = 1024
 Maximum number of threads per block dimension on AMD GPUs.
static constexpr StringLiteral amdgcnDataLayout

Macro Definition Documentation

◆ GEN_PASS_DEF_CONVERTGPUOPSTOROCDLOPS

#define GEN_PASS_DEF_CONVERTGPUOPSTOROCDLOPS

Definition at line 46 of file LowerGpuOpsToROCDLOps.cpp.

Function Documentation

◆ canBeCalledWithBarePointers()

bool canBeCalledWithBarePointers ( gpu::GPUFuncOp func)
static

Returns true if the given gpu.func can be safely called using the bare pointer calling convention.

Definition at line 73 of file LowerGpuOpsToROCDLOps.cpp.

References mlir::LLVMTypeConverter::canConvertToBarePtr().

◆ getKnownOrOcklDim()

Value getKnownOrOcklDim ( RewriterBase & rewriter,
gpu::index_lowering::IndexKind indexKind,
gpu::Dimension dim,
Operation * contextOp,
std::optional< uint32_t > opUpperBound )
static

Emits a call to the OCKL block/grid size function corresponding to indexKind with argument dim. If the context around contextOp gives an exact size for that dimension, returns that size as an i64 constant instead.

Definition at line 112 of file LowerGpuOpsToROCDLOps.cpp.

References mlir::Builder::getArrayAttr(), mlir::Operation::getContext(), mlir::Builder::getDictionaryAttr(), mlir::Builder::getI64IntegerAttr(), mlir::gpu::getKnownDimensionSizeAround(), mlir::Operation::getLoc(), mlir::Builder::getNamedAttr(), mlir::getOrDefineFunction(), mlir::Operation::getParentWithTrait(), and kMaxThreadsPerBlockDim.
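
The "known size, else runtime query" behaviour described for getKnownOrOcklDim() can be illustrated outside MLIR. The sketch below is a simplified model under stated assumptions, not the code in LowerGpuOpsToROCDLOps.cpp: `DimValue` and `knownOrRuntimeDim` are hypothetical names, the `std::optional` argument stands in for whatever exact size the surrounding context reports, and the callback stands in for the emitted OCKL call. The real function builds IR values rather than computing integers.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>

// Hypothetical result type: records whether the size came from the context
// (a constant) or from the runtime fallback.
struct DimValue {
  bool isConstant;
  int64_t value;
};

// Hypothetical model of the selection logic. `knownSize` stands in for an
// exact size found around the context operation; `runtimeQuery` stands in
// for the OCKL size function call that would otherwise be emitted.
inline DimValue knownOrRuntimeDim(std::optional<uint32_t> knownSize,
                                  const std::function<int64_t()> &runtimeQuery) {
  assert(runtimeQuery && "the runtime fallback must be callable");
  // Prefer the exact size and skip the runtime query entirely.
  if (knownSize)
    return {true, static_cast<int64_t>(*knownSize)};
  // No exact size is known, so fall back to the runtime query.
  return {false, runtimeQuery()};
}
```

In this model the fallback is never invoked when a size is known, which mirrors the documented intent of replacing the call with a constant.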

◆ getLaneId()

Value getLaneId ( RewriterBase & rewriter,
Location loc )
static

◆ truncOrExtToLLVMType()

Value truncOrExtToLLVMType ( ConversionPatternRewriter & rewriter,
Location loc,
Value value,
const LLVMTypeConverter & converter )
static

Variable Documentation

◆ amdgcnDataLayout

StringLiteral amdgcnDataLayout
static constexpr
Initial value:
=
"e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32"
"-p7:160:256:256:32-p8:128:128:128:48-p9:192:256:256:32-i64:64-v16:16-v24:"
"32-v32:"
"32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:"
"64-S32-A5-G1-ni:7:8:9"

Definition at line 170 of file LowerGpuOpsToROCDLOps.cpp.
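
To make the layout string easier to read, the sketch below extracts the pointer width for each address space from the `p[n]:<size>:...` entries. It is a standalone illustration, not part of LowerGpuOpsToROCDLOps.cpp: `kLayout` is a copy of the initial value above, `pointerSizesInBits` is a hypothetical helper, and the parser assumes only the plain `p[n]` form that appears in this string.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Copy of the amdgcnDataLayout initial value; adjacent literals concatenate.
static const std::string kLayout =
    "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32"
    "-p7:160:256:256:32-p8:128:128:128:48-p9:192:256:256:32-i64:64-v16:16-v24:"
    "32-v32:"
    "32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:"
    "64-S32-A5-G1-ni:7:8:9";

// Hypothetical helper: maps address space number -> pointer size in bits.
inline std::map<unsigned, unsigned>
pointerSizesInBits(const std::string &layout) {
  std::map<unsigned, unsigned> sizes;
  std::istringstream specs(layout);
  std::string spec;
  // Specifications are separated by '-'.
  while (std::getline(specs, spec, '-')) {
    if (spec.empty() || spec[0] != 'p')
      continue; // Not a pointer specification.
    std::size_t colon = spec.find(':');
    if (colon == std::string::npos)
      continue;
    // A bare "p" refers to address space 0.
    std::string space = spec.substr(1, colon - 1);
    unsigned addrSpace =
        space.empty() ? 0u : static_cast<unsigned>(std::stoul(space));
    // The first field after the address space is the size in bits.
    std::size_t next = spec.find(':', colon + 1);
    std::string size = spec.substr(
        colon + 1, next == std::string::npos ? next : next - colon - 1);
    assert(!size.empty() && "a pointer spec must carry a size");
    sizes[addrSpace] = static_cast<unsigned>(std::stoul(size));
  }
  return sizes;
}
```

Applied to `kLayout`, this yields ten entries, for example 64 bits for address space 0, 32 bits for address space 3, and 160 bits for address space 7.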

◆ kMaxThreadsPerBlockDim

int64_t kMaxThreadsPerBlockDim = 1024
static constexpr

Maximum number of threads per block dimension on AMD GPUs.

Definition at line 106 of file LowerGpuOpsToROCDLOps.cpp.

Referenced by getKnownOrOcklDim().