MLIR 22.0.0git
GPUHeuristics.cpp File Reference
#include "mlir/Dialect/Linalg/TransformOps/GPUHeuristics.h"
#include "mlir/Dialect/GPU/IR/GPUDialect.h"
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/Support/Debug.h"
#include "llvm/Support/DebugLog.h"
#include "llvm/Support/InterleavedRange.h"
#include "llvm/Support/MathExtras.h"
#include "llvm/Support/raw_ostream.h"
#include <cmath>
#include <numeric>


Macros

#define DEBUG_TYPE   "linalg-transforms"

Functions

static Attribute linearId0 (MLIRContext *ctx)
static Attribute linearId1 (MLIRContext *ctx)
static Attribute linearId2 (MLIRContext *ctx)
static SmallVector< int64_t > getFactors (int64_t val)
 Get the list of all factors that divide val, not just the prime factors.
static int64_t product (ArrayRef< int64_t > vals)
static SmallVector< int64_t > maximizeNumThreads (ArrayRef< int64_t > sizes, int64_t currentIndex, int64_t maxNumThreads)
 Extract result from sizes subject to the constraints listed in the function documentation for maximizeNumThreads().

Macro Definition Documentation

◆ DEBUG_TYPE

#define DEBUG_TYPE   "linalg-transforms"

Definition at line 24 of file GPUHeuristics.cpp.

Function Documentation

◆ getFactors()

SmallVector< int64_t > getFactors ( int64_t val)
static

Get the list of all factors that divide val, not just the prime factors.

Definition at line 100 of file GPUHeuristics.cpp.

Referenced by maximizeNumThreads().
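
As an illustration of what "all factors, not just the prime factors" means, here is a minimal standalone sketch. It is not the code from GPUHeuristics.cpp: the name allFactorsSketch is hypothetical, and std::vector stands in for SmallVector so the sketch compiles without LLVM. For val = 12 it yields every divisor (1, 2, 3, 4, 6, 12) rather than only the primes 2 and 3.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the documented getFactors(); an assumption
// for illustration, not the MLIR implementation.
// Returns every positive divisor of `val` in ascending order.
std::vector<int64_t> allFactorsSketch(int64_t val) {
  std::vector<int64_t> factors;
  // Non-positive inputs have no meaningful thread-count factors here.
  if (val <= 0)
    return factors;
  for (int64_t candidate = 1; candidate <= val; ++candidate) {
    // Keep composite divisors (4, 6, ...) as well as prime ones.
    if (val % candidate == 0)
      factors.push_back(candidate);
  }
  return factors;
}
```

The linear scan is chosen for clarity; copy sizes handled by such heuristics are typically small enough that a faster divisor enumeration is unlikely to matter.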

◆ linearId0()

Attribute linearId0 ( MLIRContext * ctx)
static

◆ linearId1()

Attribute linearId1 ( MLIRContext * ctx)
static

◆ linearId2()

Attribute linearId2 ( MLIRContext * ctx)
static

◆ maximizeNumThreads()

SmallVector< int64_t > maximizeNumThreads ( ArrayRef< int64_t > sizes,
int64_t currentIndex,
int64_t maxNumThreads )
static

Extract result from sizes with the following constraints:

  1. sizes[i] % result[i] == 0 for all i
  2. product_of_threadsPerDim <= maxNumThreads
  3. if currentIndex is sizes.size() - 1, then threadsPerDim[currentIndex] must be sizes[currentIndex].

This is used to greedily extract the maximum number of threads usable for mapping a copy of size sizes, while being bounded by maxNumThreads and ensuring coalesced access along the most minor dimension. Returns the number of threads used per dimension in the range threadsPerDim[currentIndex .. sizes.end()].

Definition at line 133 of file GPUHeuristics.cpp.

References getFactors(), maximizeNumThreads(), and product().

Referenced by maximizeNumThreads().
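
The constraints documented for maximizeNumThreads() can be sketched as a small recursive search. This is a hedged illustration under stated assumptions, not the code from GPUHeuristics.cpp: the names divisorsSketch, productSketch, and maximizeNumThreadsSketch are hypothetical, std::vector replaces SmallVector and ArrayRef, and returning an empty vector when no assignment fits the budget is a choice made for this sketch only.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: every positive divisor of `val`, ascending.
std::vector<int64_t> divisorsSketch(int64_t val) {
  std::vector<int64_t> out;
  for (int64_t f = 1; f <= val; ++f)
    if (val % f == 0)
      out.push_back(f);
  return out;
}

// Hypothetical helper: product of all entries (1 for an empty list).
int64_t productSketch(const std::vector<int64_t> &vals) {
  int64_t p = 1;
  for (int64_t v : vals)
    p *= v;
  return p;
}

// Sketch of the documented constraints (not the MLIR implementation):
//   1. every chosen count divides the matching entry of `sizes`,
//   2. the product of the chosen counts is <= maxNumThreads,
//   3. the most minor dimension always takes its full size.
// Returns the counts for dimensions [currentIndex, sizes.size()), or an
// empty vector if nothing fits (an assumption of this sketch).
std::vector<int64_t> maximizeNumThreadsSketch(const std::vector<int64_t> &sizes,
                                              std::size_t currentIndex,
                                              int64_t maxNumThreads) {
  // Constraint 3: the last dimension is mandated, whatever the budget.
  if (currentIndex + 1 == sizes.size())
    return {sizes[currentIndex]};

  int64_t best = 0;
  std::vector<int64_t> bestPerDim;
  for (int64_t factor : divisorsSketch(sizes[currentIndex])) {
    // Give the remaining dimensions whatever budget is left over.
    std::vector<int64_t> nested = maximizeNumThreadsSketch(
        sizes, currentIndex + 1, maxNumThreads / factor);
    if (nested.empty())
      continue; // No valid assignment for the inner dimensions.
    int64_t total = factor * productSketch(nested);
    // Keep the candidate using the most threads within the budget.
    if (total > best && total <= maxNumThreads) {
      best = total;
      bestPerDim.assign(1, factor);
      bestPerDim.insert(bestPerDim.end(), nested.begin(), nested.end());
    }
  }
  return bestPerDim;
}
```

For example, with sizes {4, 8} and a budget of 16 threads, the minor dimension is pinned to 8, so the outer dimension can take at most 2 and the sketch returns {2, 8}. With sizes {4, 64} and a budget of 32, the mandated minor dimension alone exceeds the budget and the sketch returns an empty vector.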

◆ product()