MLIR 22.0.0git
GPUTransformOps.h
//===- GPUTransformOps.h - GPU transform ops --------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef MLIR_DIALECT_GPU_TRANSFORMOPS_GPUTRANSFORMOPS_H
#define MLIR_DIALECT_GPU_TRANSFORMOPS_GPUTRANSFORMOPS_H

namespace mlir {
namespace gpu {
class GpuOp;
} // namespace gpu
} // namespace mlir

//===----------------------------------------------------------------------===//
// GPU Transform Operations
//===----------------------------------------------------------------------===//

#define GET_OP_CLASSES
#include "mlir/Dialect/GPU/TransformOps/GPUTransformOps.h.inc"

namespace mlir {
class DialectRegistry;
namespace transform {
namespace gpu {
struct GpuIdBuilder;

/// Map the top level `scf.forall` op to GPU blocks.
/// Mapping is one-to-one and the induction variables of `scf.forall` are
/// rewritten to gpu.block_id according to the thread_dim_mapping attribute.
///
/// Dynamic `scf.forall` trip counts are currently not supported.
/// Dynamic `gridDims` are currently not supported.
DiagnosedSilenceableFailure
mapForallToBlocksImpl(RewriterBase &rewriter, TransformOpInterface transformOp,
                      scf::ForallOp forallOp,
                      SmallVectorImpl<int64_t> &gridDims,
                      const GpuIdBuilder &gpuIdBuilder);

/// Map the given `scf.forall` op to an explicit GPU implementation along
/// `blockSizes`.
/// The mapping is one-to-one and the induction variables of `scf.forall` are
/// rewritten to gpuIdBuilder.idBuilder according to the
/// gpuIdBuilder.mappingAttributes attribute.
///
/// Dynamic `scf.forall` trip counts are currently not supported.
/// Dynamic `blockSizes` are currently not supported.
/// `blockSizes` is expected to be of size 3.
DiagnosedSilenceableFailure
mapOneForallToThreadsImpl(RewriterBase &rewriter,
                          std::optional<TransformOpInterface> transformOp,
                          scf::ForallOp forallOp, ArrayRef<int64_t> blockSizes,
                          int64_t warpSize, bool syncAfterDistribute);

/// Search `scf.forall` ops nested under `target` and map each such op to an
/// explicit GPU implementation along `blockDims`.
/// The mapping is one-to-one and the induction variables of `scf.forall` are
/// rewritten to appropriate ids according to the mapping attribute.
///
/// Dynamic `scf.forall` trip counts are currently not supported.
/// Dynamic `blockDims` or `newBasis` entries are currently not
/// supported. `blockDims` is expected to be of size 3.
///
/// The insertion point of the `rewriter` is expected to be set at the
/// beginning of the `target` body block and dominate all other blocks.
DiagnosedSilenceableFailure
mapNestedForallToThreadsImpl(RewriterBase &rewriter,
                             std::optional<TransformOpInterface> transformOp,
                             Operation *target, ArrayRef<int64_t> blockDims,
                             int64_t warpSize, bool syncAfterDistribute);

} // namespace gpu
} // namespace transform

namespace gpu {
void registerTransformDialectExtension(DialectRegistry &registry);
} // namespace gpu
} // namespace mlir

#endif // MLIR_DIALECT_GPU_TRANSFORMOPS_GPUTRANSFORMOPS_H
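
The doc comments in this header state two things that can be illustrated without MLIR: the preconditions (static `scf.forall` trip counts, static block dimensions, exactly three block dimensions) and the one-to-one rewrite of induction variables to hardware ids. The block below is a minimal standalone sketch of those two ideas, not the MLIR implementation. Every name in it (`kDynamicExtent`, `MappingResult`, `checkStaticMapping`, `GpuDim`, `inductionVarsForIds`) is hypothetical and chosen for illustration only; in particular, representing a dynamic extent as `-1` is an assumption of this sketch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical marker for an extent that is only known at run time.
// This is an assumption of the sketch, not MLIR's own encoding.
constexpr int64_t kDynamicExtent = -1;

// Hypothetical stand-in for a diagnosed result: an empty message is success.
struct MappingResult {
  std::string error;
  bool succeeded() const { return error.empty(); }
};

// Check the preconditions named in the header comments: `blockDims` has
// exactly three entries, and neither block dimensions nor trip counts are
// dynamic.
MappingResult checkStaticMapping(const std::vector<int64_t> &tripCounts,
                                 const std::vector<int64_t> &blockDims) {
  if (blockDims.size() != 3)
    return {"blockDims is expected to be of size 3"};
  for (int64_t dim : blockDims)
    if (dim == kDynamicExtent)
      return {"dynamic blockDims are not supported"};
  for (int64_t count : tripCounts)
    if (count == kDynamicExtent)
      return {"dynamic scf.forall trip counts are not supported"};
  return {};
}

// Hypothetical name for one hardware dimension (x, y, or z).
enum class GpuDim { X = 0, Y = 1, Z = 2 };

// One-to-one mapping: the induction variable of loop dimension i takes the
// value of the hardware id along mapping[i].
std::vector<int64_t> inductionVarsForIds(const std::vector<GpuDim> &mapping,
                                         const std::array<int64_t, 3> &ids) {
  assert(mapping.size() <= 3 && "at most three hardware dimensions");
  std::vector<int64_t> inductionVars;
  inductionVars.reserve(mapping.size());
  for (GpuDim dim : mapping)
    inductionVars.push_back(ids[static_cast<std::size_t>(dim)]);
  return inductionVars;
}
```

For example, with a two-dimensional loop whose first dimension is mapped to `GpuDim::Y` and second to `GpuDim::X`, the ids `(x, y, z) = (5, 7, 0)` give the induction variables `(7, 5)`. The real functions operate on IR and return `DiagnosedSilenceableFailure`; the sketch only mirrors the documented constraints.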