MLIR 22.0.0git
SCFToGPU.cpp File Reference


Macros

#define DEBUG_TYPE   "loops-to-gpu"

Functions

static Value getDim3Value (const gpu::KernelDim3 &dim3, unsigned pos)
static Operation::operand_range getLowerBoundOperands (AffineForOp forOp)
static Operation::operand_range getUpperBoundOperands (AffineForOp forOp)
static Value getOrCreateStep (AffineForOp forOp, OpBuilder &builder)
static Value getOrEmitLowerBound (AffineForOp forOp, OpBuilder &builder)
static Value getOrEmitUpperBound (AffineForOp forOp, OpBuilder &builder)
static LogicalResult checkAffineLoopNestMappableImpl (AffineForOp forOp, unsigned numDims)
static LogicalResult checkAffineLoopNestMappable (AffineForOp forOp, unsigned numBlockDims, unsigned numThreadDims)
static LogicalResult convertAffineLoopNestToGPULaunch (AffineForOp forOp, unsigned numBlockDims, unsigned numThreadDims)
static Value deriveStaticUpperBound (Value upperBound, PatternRewriter &rewriter)
 Tries to derive a static upper bound from the defining operation of upperBound.
static bool isMappedToProcessor (gpu::Processor processor)
static unsigned getLaunchOpArgumentNum (gpu::Processor processor)
static LogicalResult processParallelLoop (ParallelOp parallelOp, gpu::LaunchOp launchOp, IRMapping &cloningMap, SmallVectorImpl< Operation * > &worklist, DenseMap< gpu::Processor, Value > &bounds, PatternRewriter &rewriter)
 Modifies the current transformation state to capture the effect of the given scf.parallel operation on index substitutions and the operations to be inserted.

Variables

static constexpr StringLiteral kVisitedAttrName = "SCFToGPU_visited"

Macro Definition Documentation

◆ DEBUG_TYPE

#define DEBUG_TYPE   "loops-to-gpu"

Definition at line 35 of file SCFToGPU.cpp.

Function Documentation

◆ checkAffineLoopNestMappable()

LogicalResult checkAffineLoopNestMappable ( AffineForOp forOp,
unsigned numBlockDims,
unsigned numThreadDims )
static

Definition at line 135 of file SCFToGPU.cpp.

References checkAffineLoopNestMappableImpl(), and success().

Referenced by convertAffineLoopNestToGPULaunch().

◆ checkAffineLoopNestMappableImpl()

LogicalResult checkAffineLoopNestMappableImpl ( AffineForOp forOp,
unsigned numDims )
static

◆ convertAffineLoopNestToGPULaunch()

LogicalResult convertAffineLoopNestToGPULaunch ( AffineForOp forOp,
unsigned numBlockDims,
unsigned numThreadDims )
static

Definition at line 280 of file SCFToGPU.cpp.

References checkAffineLoopNestMappable(), and success().

◆ deriveStaticUpperBound()

Value deriveStaticUpperBound ( Value upperBound,
PatternRewriter & rewriter )
static

Tries to derive a static upper bound from the defining operation of upperBound.

Definition at line 313 of file SCFToGPU.cpp.

References mlir::arith::ConstantIndexOp::create(), deriveStaticUpperBound(), mlir::Value::getDefiningOp(), lhs, result, and rhs.

Referenced by deriveStaticUpperBound(), and processParallelLoop().
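
The recursive shape of such a derivation (the function references itself, plus lhs and rhs) can be illustrated outside MLIR. The following is a minimal standalone sketch, not the code in SCFToGPU.cpp: the `Expr` type, the helper constructors, and `deriveStaticUpperBoundSketch` are hypothetical names introduced for illustration, and the set of handled cases (constant, min, multiply) is an assumption. It also assumes all values are non-negative, as loop trip counts are.

```cpp
#include <algorithm>
#include <cstdint>
#include <memory>
#include <optional>

// Hypothetical toy node standing in for an SSA value and its defining op.
struct Expr {
  enum class Kind { Constant, Unknown, Min, Mul };
  Kind kind = Kind::Unknown;
  int64_t value = 0;              // meaningful only for Kind::Constant
  std::shared_ptr<Expr> lhs, rhs; // meaningful only for Min and Mul
};
using ExprPtr = std::shared_ptr<Expr>;

// Builds a constant node.
ExprPtr constant(int64_t v) {
  auto e = std::make_shared<Expr>();
  e->kind = Expr::Kind::Constant;
  e->value = v;
  return e;
}

// Builds a node whose value is only known at run time.
ExprPtr unknown() { return std::make_shared<Expr>(); }

// Builds a Min or Mul node over two operands.
ExprPtr binary(Expr::Kind kind, ExprPtr lhs, ExprPtr rhs) {
  auto e = std::make_shared<Expr>();
  e->kind = kind;
  e->lhs = std::move(lhs);
  e->rhs = std::move(rhs);
  return e;
}

// Returns a compile-time upper bound for `e`, or std::nullopt if this sketch
// cannot derive one. Assumes every value is non-negative.
std::optional<int64_t> deriveStaticUpperBoundSketch(const ExprPtr &e) {
  switch (e->kind) {
  case Expr::Kind::Constant:
    return e->value;
  case Expr::Kind::Min: {
    // min(a, b) never exceeds either operand, so one bounded operand suffices.
    auto l = deriveStaticUpperBoundSketch(e->lhs);
    auto r = deriveStaticUpperBoundSketch(e->rhs);
    if (l && r)
      return std::min(*l, *r);
    return l ? l : r;
  }
  case Expr::Kind::Mul: {
    // For non-negative operands, the product of the bounds bounds the product.
    auto l = deriveStaticUpperBoundSketch(e->lhs);
    auto r = deriveStaticUpperBoundSketch(e->rhs);
    if (l && r && *l >= 0 && *r >= 0)
      return *l * *r;
    return std::nullopt;
  }
  case Expr::Kind::Unknown:
    return std::nullopt;
  }
  return std::nullopt;
}
```

For example, a bound of the form min(n, 4) * 8 with n unknown yields 32 in this sketch, while n * 8 yields no static bound.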

◆ getDim3Value()

Value getDim3Value ( const gpu::KernelDim3 & dim3,
unsigned pos )
static

◆ getLaunchOpArgumentNum()

unsigned getLaunchOpArgumentNum ( gpu::Processor processor)
static

Definition at line 359 of file SCFToGPU.cpp.

Referenced by processParallelLoop().

◆ getLowerBoundOperands()

Operation::operand_range getLowerBoundOperands ( AffineForOp forOp)
static

Definition at line 75 of file SCFToGPU.cpp.

Referenced by checkAffineLoopNestMappableImpl().

◆ getOrCreateStep()

Value getOrCreateStep ( AffineForOp forOp,
OpBuilder & builder )
static

Definition at line 86 of file SCFToGPU.cpp.

References mlir::arith::ConstantIndexOp::create().

◆ getOrEmitLowerBound()

Value getOrEmitLowerBound ( AffineForOp forOp,
OpBuilder & builder )
static

Definition at line 93 of file SCFToGPU.cpp.

References mlir::lowerAffineLowerBound().

◆ getOrEmitUpperBound()

Value getOrEmitUpperBound ( AffineForOp forOp,
OpBuilder & builder )
static

Definition at line 99 of file SCFToGPU.cpp.

References mlir::lowerAffineUpperBound().

◆ getUpperBoundOperands()

Operation::operand_range getUpperBoundOperands ( AffineForOp forOp)
static

Definition at line 80 of file SCFToGPU.cpp.

Referenced by checkAffineLoopNestMappableImpl().

◆ isMappedToProcessor()

bool isMappedToProcessor ( gpu::Processor processor)
static

Definition at line 355 of file SCFToGPU.cpp.

Referenced by processParallelLoop().

◆ processParallelLoop()

LogicalResult processParallelLoop ( ParallelOp parallelOp,
gpu::LaunchOp launchOp,
IRMapping & cloningMap,
SmallVectorImpl< Operation * > & worklist,
DenseMap< gpu::Processor, Value > & bounds,
PatternRewriter & rewriter )
static

Modifies the current transformation state to capture the effect of the given scf.parallel operation on index substitutions and the operations to be inserted.

Specifically, if a dimension of a parallel loop is mapped to a hardware id, this function will

  • compute the loop index based on the hardware id and affine map from the mapping and update cloningMap to substitute all uses.
  • derive a new upper bound for the hardware id and augment the provided gpu.launch operation accordingly.
  • if the upper bound is imprecise, insert a conditional in the gpu.launch and update the rewriter to insert into the conditional's body.

If the dimension is mapped to sequential,

  • insert a for loop into the body and update the rewriter to insert into the for loop's body.
  • update the cloningMap to replace uses of the index with the index of the new for loop.

In either case,

  • append the instructions from the loop's body to worklist, in reverse order.

To note the end of the current scope in case a loop or conditional was inserted, a sentinel (the gpu.launch operation) is inserted into the worklist. This signals the processor of the worklist to pop the rewriter one scope-level up.

Definition at line 401 of file SCFToGPU.cpp.

References ArrayAttr(), mlir::config, deriveStaticUpperBound(), mlir::AffineMap::get(), mlir::Builder::getAffineDimExpr(), mlir::Builder::getAffineSymbolExpr(), mlir::Value::getDefiningOp(), getLaunchOpArgumentNum(), mlir::gpu::getMappingAttrName(), mlir::Operation::getOperands(), mlir::Block::getOperations(), mlir::Block::getTerminator(), isMappedToProcessor(), mlir::IRMapping::lookupOrDefault(), mlir::IRMapping::map(), mlir::RewriterBase::notifyMatchFailure(), mlir::OpBuilder::setInsertionPoint(), mlir::OpBuilder::setInsertionPointToStart(), success(), and mlir::Block::without_terminator().
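
The worklist-with-sentinel scheme described above can be sketched independently of MLIR. The following is a minimal illustration under stated assumptions, not the pass's implementation: the `Op` type and `processWithSentinel` are hypothetical, a plain integer depth stands in for the rewriter's insertion scope, and a dedicated static object stands in for the gpu.launch operation that the pass uses as its sentinel.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy operation: a name plus an optional nested body
// (a non-empty body models a loop or conditional that opens a scope).
struct Op {
  std::string name;
  std::vector<Op> body;
};

// Visits ops in program order using a stack-like worklist and returns a trace
// of "depth:name" entries. Bodies are pushed in reverse order so that popping
// from the back yields the original order.
std::vector<std::string> processWithSentinel(const std::vector<Op> &topLevel) {
  static const Op sentinel{"<sentinel>", {}}; // marks the end of a scope
  std::vector<const Op *> worklist;
  std::vector<std::string> trace;
  int depth = 0; // stands in for the current insertion scope

  for (auto it = topLevel.rbegin(); it != topLevel.rend(); ++it)
    worklist.push_back(&*it);

  while (!worklist.empty()) {
    const Op *op = worklist.back();
    worklist.pop_back();

    if (op == &sentinel) {
      // End of a nested scope: move one scope level up.
      assert(depth > 0 && "sentinel without an open scope");
      --depth;
      continue;
    }

    trace.push_back(std::to_string(depth) + ":" + op->name);

    if (!op->body.empty()) {
      // Open a new scope: the sentinel goes in first so it is popped last,
      // after every operation of the body has been handled.
      ++depth;
      worklist.push_back(&sentinel);
      for (auto it = op->body.rbegin(); it != op->body.rend(); ++it)
        worklist.push_back(&*it);
    }
  }
  return trace;
}
```

Pushing the sentinel before the reversed body is what guarantees the scope closes only once the whole body has been processed, without the worklist needing any separate record of nesting.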

Variable Documentation

◆ kVisitedAttrName

StringLiteral kVisitedAttrName = "SCFToGPU_visited"
static constexpr