MLIR

Multi-Level IR Compiler Framework

'nvvm' Dialect

Operations 

source

nvvm.bar.warp.sync (NVVM::SyncWarpOp) 

Syntax:

operation ::= `nvvm.bar.warp.sync` $mask attr-dict `:` type($mask)

Operands: 

OperandDescription
maskLLVM dialect-compatible type

nvvm.barrier (NVVM::BarrierOp) 

Syntax:

operation ::= `nvvm.barrier` (`id` `=` $barrierId^)? (`number_of_threads` `=` $numberOfThreads^)? attr-dict

Traits: AttrSizedOperandSegments

Operands: 

OperandDescription
barrierId32-bit signless integer
numberOfThreads32-bit signless integer

nvvm.barrier.arrive (NVVM::BarrierArriveOp) 

Syntax:

operation ::= `nvvm.barrier.arrive` (`id` `=` $barrierId^)? `number_of_threads` `=` $numberOfThreads attr-dict

Thread that executes this op announces their arrival at the barrier with given id and continue their execution.

The default barrier id is 0 that is similar to nvvm.barrier Op. When barrierId is not present, the default barrier id is used.

[For more information, see PTX ISA] ( https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-bar)

Interfaces: BasicPtxBuilderInterface

Operands: 

OperandDescription
barrierId32-bit signless integer
numberOfThreads32-bit signless integer

nvvm.barrier0 (NVVM::Barrier0Op) 

Syntax:

operation ::= `nvvm.barrier0` attr-dict

nvvm.breakpoint (NVVM::Breakpoint) 

Breakpoint Op

Syntax:

operation ::= `nvvm.breakpoint` attr-dict

Breakpoint suspends execution of the program for debugging. For more information, see PTX ISA

nvvm.cluster.arrive (NVVM::ClusterArriveOp) 

Cluster Barrier Arrive Op

Syntax:

operation ::= `nvvm.cluster.arrive` attr-dict

The cluster.arrive can be used by the threads within the cluster for synchronization and communication. The cluster.arrive instruction marks the warps’ arrival at the barrier without causing the executing thread to wait for other participating threads.

The aligned attribute, when provided, generates the .aligned version of the PTX instruction.

[For more information, see PTX ISA] ( https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-barrier-cluster)

Attributes: 

AttributeMLIR TypeDescription
aligned::mlir::UnitAttrunit attribute

nvvm.cluster.arrive.relaxed (NVVM::ClusterArriveRelaxedOp) 

Cluster Barrier Relaxed Arrive Op

Syntax:

operation ::= `nvvm.cluster.arrive.relaxed` attr-dict

The cluster.arrive can be used by the threads within the cluster for synchronization and communication. The cluster.arrive instruction marks the warps’ arrival at the barrier without causing the executing thread to wait for other participating threads.

The aligned attribute, when provided, generates the .aligned version of the PTX instruction. The .relaxed qualifier on cluster.arrive specifies that there are no memory ordering and visibility guarantees provided for the memory accesses performed prior to cluster.arrive.

[For more information, see PTX ISA] ( https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-barrier-cluster)

Attributes: 

AttributeMLIR TypeDescription
aligned::mlir::UnitAttrunit attribute

nvvm.cluster.wait (NVVM::ClusterWaitOp) 

Cluster Barrier Wait Op

Syntax:

operation ::= `nvvm.cluster.wait` attr-dict

The cluster.wait causes the executing thread to wait for all non-exited threads of the cluster to perform cluster.arrive. The aligned attribute, when provided, generates the .aligned version of the PTX instruction.

[For more information, see PTX ISA] ( https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-barrier-cluster)

Attributes: 

AttributeMLIR TypeDescription
aligned::mlir::UnitAttrunit attribute

nvvm.cp.async.bulk.commit.group (NVVM::CpAsyncBulkCommitGroupOp) 

Syntax:

operation ::= `nvvm.cp.async.bulk.commit.group` attr-dict

This Op commits all prior initiated but uncommitted cp.async.bulk instructions into a cp.async.bulk-group.

[For more information, see PTX ISA] ( https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-cp-async-bulk-commit-group)

nvvm.cp.async.bulk.tensor.global.shared.cta (NVVM::CpAsyncBulkTensorSharedCTAToGlobalOp) 

Syntax:

operation ::= `nvvm.cp.async.bulk.tensor.global.shared.cta` $tmaDescriptor `,`
              $srcMem `,`
              `box` `[`$coordinates `]`
              (`,` `predicate` `=` $predicate^)?
              attr-dict  `:` type(operands)

Traits: AttrSizedOperandSegments

Interfaces: BasicPtxBuilderInterface

Operands: 

OperandDescription
tmaDescriptorLLVM pointer type
srcMemLLVM pointer in address space 3
coordinatesvariadic of 32-bit signless integer
predicate1-bit signless integer

nvvm.cp.async.bulk.tensor.prefetch (NVVM::CpAsyncBulkTensorPrefetchOp) 

Syntax:

operation ::= `nvvm.cp.async.bulk.tensor.prefetch` $tmaDescriptor `,`
              `box` `[`$coordinates `]`
              (`im2col` `[` $im2colOffsets^ `]` )?
              (`l2_cache_hint` `=` $l2CacheHint^ )?
              attr-dict  `:` type($tmaDescriptor)

Initiates an asynchronous prefetch operation on the tensor data from global memory to L2 cache.

The Op has two modes:

  1. Tiled Mode: It’s the default mode. The source multi-dimensional tensor layout is preserved at the destination.

  2. Im2col Mode: This mode is used when im2colOffsets operands are present. the elements in the Bounding Box of the source tensor are rearranged into columns at the destination. In this mode, the tensor has to be at least 3-dimensional.

The l2CacheHint operand is optional, and it is used to specify cache eviction policy that may be used during the memory access.

[For more information, see PTX ISA] ( https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-cp-async-bulk-prefetch-tensor)

Traits: AttrSizedOperandSegments

Operands: 

OperandDescription
tmaDescriptorLLVM pointer type
coordinatesvariadic of 32-bit signless integer
im2colOffsetsvariadic of 16-bit signless integer
l2CacheHint64-bit signless integer

nvvm.cp.async.bulk.tensor.reduce (NVVM::CpAsyncBulkTensorReduceOp) 

Syntax:

operation ::= `nvvm.cp.async.bulk.tensor.reduce` $tmaDescriptor `,`
              $srcMem `,`
              `box` `[`$coordinates `]`
              (`l2_cache_hint` `=` $l2CacheHint^ )?
              attr-dict  `:` type($tmaDescriptor) `,` type($srcMem)

Initiates an asynchronous reduction operation of tensor data in global memory with tensor data in shared memory.

The mode attribute indicates whether the copy mode is tile or im2col. The redOp attribute specifies the reduction operations applied. The supported reduction operations are: {add, min, max, inc, dec, and, or, xor}

The l2CacheHint operand is optional, and it is used to specify cache eviction policy that may be used during the memory access.

[For more information, see PTX ISA] ( https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-cp-reduce-async-bulk-tensor)

Traits: AttrSizedOperandSegments

Attributes: 

AttributeMLIR TypeDescription
redKind::mlir::NVVM::TMAReduxKindAttr
NVVM TMA redux kind

Enum cases:

  • add (ADD)
  • max (MAX)
  • min (MIN)
  • inc (INC)
  • dec (DEC)
  • and (AND)
  • or (OR)
  • xor (XOR)
mode::mlir::NVVM::TMAStoreModeAttr
NVVM TMA Store Mode

Enum cases:

  • tile (TILE)
  • im2col (IM2COL)

Operands: 

OperandDescription
tmaDescriptorLLVM pointer type
srcMemLLVM pointer in address space 3
coordinatesvariadic of 32-bit signless integer
l2CacheHint64-bit signless integer

nvvm.cp.async.bulk.tensor.shared.cluster.global (NVVM::CpAsyncBulkTensorGlobalToSharedClusterOp) 

Syntax:

operation ::= `nvvm.cp.async.bulk.tensor.shared.cluster.global` $dstMem `,`
              $tmaDescriptor `,`
              $mbar `,`
              `box` `[`$coordinates `]`
              (`im2col` `[` $im2colOffsets^ `]` )?
              (`multicast_mask` `=` $multicastMask^ )?
              (`l2_cache_hint` `=` $l2CacheHint^ )?
              (`predicate` `=` $predicate^)?
              attr-dict  `:` type($dstMem) `,` type($tmaDescriptor)

Initiates an asynchronous copy operation on the tensor data from global memory to shared memory.

The Op operates has two load modes:

  1. Tiled Mode: It’s the default mode. The source multi-dimensional tensor layout is preserved at the destination.

  2. Im2col Mode: This mode is used when im2colOffsets operands are present. the elements in the Bounding Box of the source tensor are rearranged into columns at the destination. In this mode, the tensor has to be at least 3-dimensional.

The multicastMask operand is optional. When it is present, the Op copies data from global memory to shared memory of multiple CTAs in the cluster. Operand multicastMask specifies the destination CTAs in the cluster such that each bit position in the 16-bit multicastMask operand corresponds to the nvvm.read.ptx.sreg.ctaid of the destination CTA.

The l2CacheHint operand is optional, and it is used to specify cache eviction policy that may be used during the memory access.

[For more information, see PTX ISA] ( https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-cp-async-bulk-tensor)

Traits: AttrSizedOperandSegments

Interfaces: BasicPtxBuilderInterface

Operands: 

OperandDescription
dstMemLLVM pointer in address space 3
tmaDescriptorLLVM pointer type
coordinatesvariadic of 32-bit signless integer
mbarLLVM pointer in address space 3
im2colOffsetsvariadic of 16-bit signless integer
multicastMask16-bit signless integer
l2CacheHint64-bit signless integer
predicate1-bit signless integer

nvvm.cp.async.bulk.wait_group (NVVM::CpAsyncBulkWaitGroupOp) 

Syntax:

operation ::= `nvvm.cp.async.bulk.wait_group` $group attr-dict

Op waits for completion of the most recent bulk async-groups.

The $group operand tells waiting has to be done until for $group or fewer of the most recent bulk async-groups. If $group is 0, the op wait until all the most recent bulk async-groups have completed.

The $read indicates that the waiting has to be done until all the bulk async operations in the specified bulk async-group have completed reading from their source locations.

[For more information, see PTX ISA] ( https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-cp-async-bulk-wait-group)

Attributes: 

AttributeMLIR TypeDescription
group::mlir::IntegerAttr32-bit signless integer attribute whose minimum value is 0
read::mlir::UnitAttrunit attribute

nvvm.cp.async.commit.group (NVVM::CpAsyncCommitGroupOp) 

Syntax:

operation ::= `nvvm.cp.async.commit.group` attr-dict

nvvm.cp.async.mbarrier.arrive (NVVM::CpAsyncMBarrierArriveOp) 

NVVM Dialect Op for cp.async.mbarrier.arrive

Syntax:

operation ::= `nvvm.cp.async.mbarrier.arrive` $addr attr-dict `:` type(operands)

The cp.async.mbarrier.arrive Op makes the mbarrier object track all prior cp.async operations initiated by the executing thread. The addr operand specifies the address of the mbarrier object in generic address space. The noinc attr impacts how the mbarrier’s state is updated. [For more information, refer PTX ISA] ( https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-cp-async-mbarrier-arrive)

Attributes: 

AttributeMLIR TypeDescription
noinc::mlir::IntegerAttr1-bit signless integer attribute

Operands: 

OperandDescription
addrLLVM pointer type

nvvm.cp.async.mbarrier.arrive.shared (NVVM::CpAsyncMBarrierArriveSharedOp) 

NVVM Dialect Op for cp.async.mbarrier.arrive.shared

Syntax:

operation ::= `nvvm.cp.async.mbarrier.arrive.shared` $addr attr-dict `:` type(operands)

The cp.async.mbarrier.arrive.shared Op makes the mbarrier object track all prior cp.async operations initiated by the executing thread. The addr operand specifies the address of the mbarrier object in shared memory. The noinc attr impacts how the mbarrier’s state is updated. [For more information, refer PTX ISA] ( https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-cp-async-mbarrier-arrive)

Attributes: 

AttributeMLIR TypeDescription
noinc::mlir::IntegerAttr1-bit signless integer attribute

Operands: 

OperandDescription
addrLLVM pointer in address space 3

nvvm.cp.async.shared.global (NVVM::CpAsyncOp) 

Syntax:

operation ::= `nvvm.cp.async.shared.global` $dst `,` $src `,` $size `,` `cache` `=` $modifier (`,` $cpSize^)? attr-dict `:` type(operands)

Interfaces: BasicPtxBuilderInterface

Attributes: 

AttributeMLIR TypeDescription
size::mlir::IntegerAttr32-bit signless integer attribute
modifier::mlir::NVVM::LoadCacheModifierKindAttr
NVVM load cache modifier kind
Enum attribute of the different kinds of cache operators for load instructions.

For more information, see PTX ISA

Operands: 

OperandDescription
dstLLVM pointer in address space 3
srcLLVM pointer in address space 1
cpSizeLLVM dialect-compatible type

nvvm.cp.async.wait.group (NVVM::CpAsyncWaitGroupOp) 

Syntax:

operation ::= `nvvm.cp.async.wait.group` $n attr-dict

Attributes: 

AttributeMLIR TypeDescription
n::mlir::IntegerAttr32-bit signless integer attribute

nvvm.elect.sync (NVVM::ElectSyncOp) 

Elect one leader thread

Syntax:

operation ::= `nvvm.elect.sync` attr-dict `->` type(results)

The elect.sync instruction elects one predicated active leader thread from among a set of threads specified in membermask. The membermask is set to 0xFFFFFFFF for the current version of this Op. The predicate result is set to True for the leader thread, and False for all other threads.

[For more information, see PTX ISA] ( https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-elect-sync)

Results: 

ResultDescription
pred1-bit signless integer

nvvm.exit (NVVM::Exit) 

Exit Op

Syntax:

operation ::= `nvvm.exit` attr-dict

Ends execution of a thread. For more information, see PTX ISA

nvvm.fence.mbarrier.init (NVVM::FenceMbarrierInitOp) 

Syntax:

operation ::= `nvvm.fence.mbarrier.init` attr-dict

Fence operation that applies on the prior nvvm.mbarrier.init [For more information, see PTX ISA] ( https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-membar)

Interfaces: BasicPtxBuilderInterface

nvvm.fence.proxy (NVVM::FenceProxyOp) 

Syntax:

operation ::= `nvvm.fence.proxy` attr-dict

Fence operation with proxy to establish an ordering between memory accesses that may happen through different proxies. [For more information, see PTX ISA] ( https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-membar)

Interfaces: BasicPtxBuilderInterface

Attributes: 

AttributeMLIR TypeDescription
kind::mlir::NVVM::ProxyKindAttr
Proxy kind

Enum cases:

  • alias (alias)
  • async (async)
  • async.global (async_global)
  • async.shared (async_shared)
  • tensormap (TENSORMAP)
  • generic (GENERIC)
space::mlir::NVVM::SharedSpaceAttr
Shared memory space

Enum cases:

  • cta (shared_cta)
  • cluster (shared_cluster)

nvvm.fence.proxy.acquire (NVVM::FenceProxyAcquireOp) 

Uni-directional proxy fence operation with acquire semantics

Syntax:

operation ::= `nvvm.fence.proxy.acquire` $scope $addr `,` $size (`from_proxy` `=` $fromProxy^)? (`to_proxy` `=` $toProxy^)? attr-dict

fence.proxy.acquire is a uni-directional fence used to establish ordering between a prior memory access performed via the generic proxy and a subsequent memory access performed via the tensormap proxy

The address operand addr and the operand size together specify the memory range [addr, addr+size) on which the ordering guarantees on the memory accesses across the proxies is to be provided. The only supported value for the size operand is 128 and must be an immediate. Generic Addressing is used unconditionally, and the address specified by the operand addr must fall within the .global state space. Otherwise, the behavior is undefined [For more information, see PTX ISA] ( https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-membar)

Attributes: 

AttributeMLIR TypeDescription
scope::mlir::NVVM::MemScopeKindAttr
NVVM Memory Scope kind

Enum cases:

  • cta (CTA)
  • cluster (CLUSTER)
  • gpu (GPU)
  • sys (SYS)
fromProxy::mlir::NVVM::ProxyKindAttr
Proxy kind

Enum cases:

  • alias (alias)
  • async (async)
  • async.global (async_global)
  • async.shared (async_shared)
  • tensormap (TENSORMAP)
  • generic (GENERIC)
toProxy::mlir::NVVM::ProxyKindAttr
Proxy kind

Enum cases:

  • alias (alias)
  • async (async)
  • async.global (async_global)
  • async.shared (async_shared)
  • tensormap (TENSORMAP)
  • generic (GENERIC)

Operands: 

OperandDescription
addrLLVM pointer in address space 0
size32-bit signless integer

nvvm.fence.proxy.release (NVVM::FenceProxyReleaseOp) 

Uni-directional proxy fence operation with release semantics

Syntax:

operation ::= `nvvm.fence.proxy.release` $scope (`from_proxy` `=` $fromProxy^)? (`to_proxy` `=` $toProxy^)? attr-dict

fence.proxy.release is a uni-directional fence used to establish ordering between a prior memory access performed via the generic proxy and a subsequent memory access performed via the tensormap proxy. fence.proxy.release operation can form a release sequence that synchronizes with an acquire sequence that contains the fence.proxy.acquire proxy fence operation [For more information, see PTX ISA] ( https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-membar)

Attributes: 

AttributeMLIR TypeDescription
scope::mlir::NVVM::MemScopeKindAttr
NVVM Memory Scope kind

Enum cases:

  • cta (CTA)
  • cluster (CLUSTER)
  • gpu (GPU)
  • sys (SYS)
fromProxy::mlir::NVVM::ProxyKindAttr
Proxy kind

Enum cases:

  • alias (alias)
  • async (async)
  • async.global (async_global)
  • async.shared (async_shared)
  • tensormap (TENSORMAP)
  • generic (GENERIC)
toProxy::mlir::NVVM::ProxyKindAttr
Proxy kind

Enum cases:

  • alias (alias)
  • async (async)
  • async.global (async_global)
  • async.shared (async_shared)
  • tensormap (TENSORMAP)
  • generic (GENERIC)

nvvm.fence.sc.cluster (NVVM::FenceScClusterOp) 

Syntax:

operation ::= `nvvm.fence.sc.cluster` attr-dict

nvvm.ldmatrix (NVVM::LdMatrixOp) 

Cooperative matrix load

Syntax:

operation ::= `nvvm.ldmatrix` $ptr attr-dict `:` functional-type($ptr, $res)

Attributes: 

AttributeMLIR TypeDescription
num::mlir::IntegerAttr32-bit signless integer attribute
layout::mlir::NVVM::MMALayoutAttr
NVVM MMA layout

Enum cases:

  • row (row)
  • col (col)

Operands: 

OperandDescription
ptrLLVM pointer type

Results: 

ResultDescription
resany type

nvvm.mbarrier.arrive (NVVM::MBarrierArriveOp) 

Syntax:

operation ::= `nvvm.mbarrier.arrive` $addr attr-dict `:` type($addr) `->` type($res)

Operands: 

OperandDescription
addrLLVM pointer type

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.mbarrier.arrive.expect_tx (NVVM::MBarrierArriveExpectTxOp) 

Syntax:

operation ::= `nvvm.mbarrier.arrive.expect_tx` $addr `,` $txcount (`,` `predicate` `=` $predicate^)? attr-dict `:` type(operands)

Interfaces: BasicPtxBuilderInterface

Operands: 

OperandDescription
addrLLVM pointer type
txcount32-bit signless integer
predicate1-bit signless integer

nvvm.mbarrier.arrive.expect_tx.shared (NVVM::MBarrierArriveExpectTxSharedOp) 

Syntax:

operation ::= `nvvm.mbarrier.arrive.expect_tx.shared` $addr `,` $txcount (`,` `predicate` `=` $predicate^)? attr-dict `:` type(operands)

Interfaces: BasicPtxBuilderInterface

Operands: 

OperandDescription
addrLLVM pointer in address space 3
txcount32-bit signless integer
predicate1-bit signless integer

nvvm.mbarrier.arrive.nocomplete (NVVM::MBarrierArriveNocompleteOp) 

Syntax:

operation ::= `nvvm.mbarrier.arrive.nocomplete` $addr `,` $count attr-dict `:` type(operands) `->` type($res)

Operands: 

OperandDescription
addrLLVM pointer type
count32-bit signless integer

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.mbarrier.arrive.nocomplete.shared (NVVM::MBarrierArriveNocompleteSharedOp) 

Syntax:

operation ::= `nvvm.mbarrier.arrive.nocomplete.shared` $addr `,` $count attr-dict `:` type(operands) `->` type($res)

Operands: 

OperandDescription
addrLLVM pointer in address space 3
count32-bit signless integer

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.mbarrier.arrive.shared (NVVM::MBarrierArriveSharedOp) 

Syntax:

operation ::= `nvvm.mbarrier.arrive.shared` $addr attr-dict `:` qualified(type($addr)) `->` type($res)

Operands: 

OperandDescription
addrLLVM pointer in address space 3

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.mbarrier.init (NVVM::MBarrierInitOp) 

Syntax:

operation ::= `nvvm.mbarrier.init` $addr `,` $count (`,` `predicate` `=` $predicate^)? attr-dict `:` type(operands)

Interfaces: BasicPtxBuilderInterface

Operands: 

OperandDescription
addrLLVM pointer type
count32-bit signless integer
predicate1-bit signless integer

nvvm.mbarrier.init.shared (NVVM::MBarrierInitSharedOp) 

Syntax:

operation ::= `nvvm.mbarrier.init.shared` $addr `,` $count (`,` `predicate` `=` $predicate^)? attr-dict `:` type(operands)

Interfaces: BasicPtxBuilderInterface

Operands: 

OperandDescription
addrLLVM pointer in address space 3
count32-bit signless integer
predicate1-bit signless integer

nvvm.mbarrier.inval (NVVM::MBarrierInvalOp) 

Syntax:

operation ::= `nvvm.mbarrier.inval` $addr attr-dict `:` type(operands)

Operands: 

OperandDescription
addrLLVM pointer type

nvvm.mbarrier.inval.shared (NVVM::MBarrierInvalSharedOp) 

Syntax:

operation ::= `nvvm.mbarrier.inval.shared` $addr attr-dict `:` type(operands)

Operands: 

OperandDescription
addrLLVM pointer in address space 3

nvvm.mbarrier.test.wait (NVVM::MBarrierTestWaitOp) 

Syntax:

operation ::= `nvvm.mbarrier.test.wait` $addr `,` $state attr-dict `:` type(operands) `->` type($res)

Operands: 

OperandDescription
addrLLVM pointer type
stateLLVM dialect-compatible type

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.mbarrier.test.wait.shared (NVVM::MBarrierTestWaitSharedOp) 

Syntax:

operation ::= `nvvm.mbarrier.test.wait.shared` $addr `,` $state attr-dict `:` type(operands) `->` type($res)

Operands: 

OperandDescription
addrLLVM pointer in address space 3
stateLLVM dialect-compatible type

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.mbarrier.try_wait.parity (NVVM::MBarrierTryWaitParityOp) 

Syntax:

operation ::= `nvvm.mbarrier.try_wait.parity` $addr `,` $phase `,` $ticks attr-dict `:` type(operands)

Interfaces: BasicPtxBuilderInterface

Operands: 

OperandDescription
addrLLVM pointer type
phase32-bit signless integer
ticks32-bit signless integer

nvvm.mbarrier.try_wait.parity.shared (NVVM::MBarrierTryWaitParitySharedOp) 

Syntax:

operation ::= `nvvm.mbarrier.try_wait.parity.shared` $addr `,` $phase `,` $ticks attr-dict `:` type(operands)

Interfaces: BasicPtxBuilderInterface

Operands: 

OperandDescription
addrLLVM pointer in address space 3
phase32-bit signless integer
ticks32-bit signless integer

nvvm.mma.sync (NVVM::MmaOp) 

Cooperative matrix-multiply and accumulate

The nvvm.mma.sync operation collectively performs the operation D = matmul(A, B) + C using all threads in a warp.

All the threads in the warp must execute the same mma.sync operation.

For each possible multiplicand PTX data type, there are one or more possible instruction shapes given as “mMnNkK”. The below table describes the posssibilities as well as the types required for the operands. Note that the data type for C (the accumulator) and D (the result) can vary independently when there are multiple possibilities in the “C/D Type” column.

When an optional attribute cannot be immediately inferred from the types of the operands and the result during parsing or validation, an error will be raised.

b1Op is only relevant when the binary (b1) type is given to multiplicandDataType. It specifies how the multiply-and-acumulate is performed and is either xor_popc or and_poc. The default is xor_popc.

intOverflowBehavior is only relevant when the multiplicandType attribute is one of u8, s8, u4, s4, this attribute describes how overflow is handled in the accumulator. When the attribute is satfinite, the accumulator values are clamped in the int32 range on overflow. This is the default behavior. Alternatively, accumulator behavior wrapped can also be specified, in which case overflow wraps from one end of the range to the other.

layoutA and layoutB are required and should generally be set to #nvvm.mma_layout<row> and #nvvm.mma_layout<col> respectively, but other combinations are possible for certain layouts according to the table below.

| A/B Type | Shape     | ALayout | BLayout | A Type   | B Type   | C/D Type          |
|----------|-----------|---------|---------|----------|----------|-------------------|
| f64      | .m8n8k4   | row     | col     | 1x f64   | 1x f64   | 2x f64            |
| f16      | .m8n8k4   | row/col | row/col | 2x f16x2 | 2x f16x2 | 4x f16x2 or 8xf32 |
|          | .m16n8k8  | row     | col     | 2x f16x2 | 1x f16x2 | 2x f16x2 or 4 f32 |
|          | .m16n8k16 | row     | col     | 4x f16x2 | 2x f16x2 | 2x f16x2 or 4 f32 |
| bf16     | .m16n8k8  | row     | col     | 2x f16x2 | 1x f16x2 | 2x f16x2 or 4 f32 |
|          | .m16n8k16 | row     | col     | 4x f16x2 | 2x f16x2 | 2x f16x2 or 4 f32 |
| tf32     | .m16n8k4  | row     | col     | 2x i32   | 1x i32   | 4x f32            |
|          | .m16n8k8  | row     | col     | 4x i32   | 2x i32   | 2x f16x2 or 4 f32 |
| u8/s8    | .m8n8k16  | row     | col     | 1x i32   | 1x i32   | 2x i32            |
|          | .m16n8k16 | row     | col     | 2x i32   | 1x i32   | 4x i32            |
|          | .m16n8k32 | row     | col     | 4x i32   | 2x i32   | 4x i32            |
| u4/s4    | .m8n8k32  | row     | col     | 1x i32   | 1x i32   | 2x i32            |
|          | m16n8k32  | row     | col     | 2x i32   | 1x i32   | 4x i32            |
|          | m16n8k64  | row     | col     | 4x i32   | 2x i32   | 4x i32            |
| b1       | m8n8k128  | row     | col     | 1x i32   | 1x i32   | 2x i32            |
|          | m16n8k128 | row     | col     | 2x i32   | 1x i32   | 4x i32            |

Example:


%128 = nvvm.mma.sync A[%120, %121, %122, %123]
                     B[%124, %125]
                     C[%126, %127]
                     {layoutA = #nvvm.mma_layout<row>,
                      layoutB = #nvvm.mma_layout<col>,
                      shape = {k = 16 : i32, m = 16 : i32, n = 8 : i32}}
    : (vector<2xf16>, vector<2xf16>, vector<2xf16>)
       -> !llvm.struct<(vector<2xf16>, vector<2xf16>)>

Traits: AttrSizedOperandSegments

Attributes: 

AttributeMLIR TypeDescription
shape::mlir::NVVM::MMAShapeAttrAttribute for MMA operation shape.
b1Op::mlir::NVVM::MMAB1OpAttr
MMA binary operations

Enum cases:

  • none (none)
  • xor_popc (xor_popc)
  • and_popc (and_popc)
intOverflowBehavior::mlir::NVVM::MMAIntOverflowAttr
MMA overflow options

Enum cases:

  • satfinite (satfinite)
  • wrapped (wrapped)
layoutA::mlir::NVVM::MMALayoutAttr
NVVM MMA layout

Enum cases:

  • row (row)
  • col (col)
layoutB::mlir::NVVM::MMALayoutAttr
NVVM MMA layout

Enum cases:

  • row (row)
  • col (col)
multiplicandAPtxType::mlir::NVVM::MMATypesAttr
NVVM MMA types

Enum cases:

  • f16 (f16)
  • f32 (f32)
  • tf32 (tf32)
  • bf16 (bf16)
  • s8 (s8)
  • u8 (u8)
  • s32 (s32)
  • s4 (s4)
  • u4 (u4)
  • b1 (b1)
  • f64 (f64)
multiplicandBPtxType::mlir::NVVM::MMATypesAttr
NVVM MMA types

Enum cases:

  • f16 (f16)
  • f32 (f32)
  • tf32 (tf32)
  • bf16 (bf16)
  • s8 (s8)
  • u8 (u8)
  • s32 (s32)
  • s4 (s4)
  • u4 (u4)
  • b1 (b1)
  • f64 (f64)

Operands: 

OperandDescription
operandAvariadic of LLVM dialect-compatible type
operandBvariadic of LLVM dialect-compatible type
operandCvariadic of LLVM dialect-compatible type

Results: 

ResultDescription
resLLVM structure type

nvvm.prefetch.tensormap (NVVM::PrefetchTensorMapOp) 

Syntax:

operation ::= `nvvm.prefetch.tensormap` $tmaDescriptor (`,` `predicate` `=` $predicate^)? attr-dict `:` type(operands)

Interfaces: BasicPtxBuilderInterface

Operands: 

OperandDescription
tmaDescriptorLLVM pointer type
predicate1-bit signless integer

nvvm.rcp.approx.ftz.f (NVVM::RcpApproxFtzF32Op) 

Syntax:

operation ::= `nvvm.rcp.approx.ftz.f` $arg attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: 

OperandDescription
arg32-bit float

Results: 

ResultDescription
res32-bit float

nvvm.read.ptx.sreg.clock (NVVM::ClockOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.clock` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.clock64 (NVVM::Clock64Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.clock64` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.cluster.ctaid.x (NVVM::BlockInClusterIdXOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.ctaid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.cluster.ctaid.y (NVVM::BlockInClusterIdYOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.ctaid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.cluster.ctaid.z (NVVM::BlockInClusterIdZOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.ctaid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.cluster.ctarank (NVVM::ClusterId) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.ctarank` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.cluster.nctaid.x (NVVM::ClusterDimBlocksXOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.nctaid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.cluster.nctaid.y (NVVM::ClusterDimBlocksYOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.nctaid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.cluster.nctaid.z (NVVM::ClusterDimBlocksZOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.nctaid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.cluster.nctarank (NVVM::ClusterDim) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.nctarank` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.clusterid.x (NVVM::ClusterIdXOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.clusterid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.clusterid.y (NVVM::ClusterIdYOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.clusterid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.clusterid.z (NVVM::ClusterIdZOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.clusterid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.ctaid.x (NVVM::BlockIdXOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.ctaid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.ctaid.y (NVVM::BlockIdYOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.ctaid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.ctaid.z (NVVM::BlockIdZOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.ctaid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg0 (NVVM::EnvReg0Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg0` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg1 (NVVM::EnvReg1Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg1` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg10 (NVVM::EnvReg10Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg10` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg11 (NVVM::EnvReg11Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg11` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg12 (NVVM::EnvReg12Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg12` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg13 (NVVM::EnvReg13Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg13` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg14 (NVVM::EnvReg14Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg14` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg15 (NVVM::EnvReg15Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg15` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg16 (NVVM::EnvReg16Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg16` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg17 (NVVM::EnvReg17Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg17` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg18 (NVVM::EnvReg18Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg18` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg19 (NVVM::EnvReg19Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg19` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg2 (NVVM::EnvReg2Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg2` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg20 (NVVM::EnvReg20Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg20` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg21 (NVVM::EnvReg21Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg21` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg22 (NVVM::EnvReg22Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg22` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg23 (NVVM::EnvReg23Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg23` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg24 (NVVM::EnvReg24Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg24` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg25 (NVVM::EnvReg25Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg25` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg26 (NVVM::EnvReg26Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg26` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg27 (NVVM::EnvReg27Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg27` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg28 (NVVM::EnvReg28Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg28` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg29 (NVVM::EnvReg29Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg29` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg3 (NVVM::EnvReg3Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg3` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg30 (NVVM::EnvReg30Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg30` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg31 (NVVM::EnvReg31Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg31` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg4 (NVVM::EnvReg4Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg4` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg5 (NVVM::EnvReg5Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg5` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg6 (NVVM::EnvReg6Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg6` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg7 (NVVM::EnvReg7Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg7` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg8 (NVVM::EnvReg8Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg8` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg9 (NVVM::EnvReg9Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg9` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.globaltimer (NVVM::GlobalTimerOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.globaltimer` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.gridid (NVVM::GridIdOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.gridid` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.laneid (NVVM::LaneIdOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.laneid` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.lanemask.eq (NVVM::LaneMaskEqOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.lanemask.eq` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.lanemask.ge (NVVM::LaneMaskGeOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.lanemask.ge` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.lanemask.gt (NVVM::LaneMaskGtOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.lanemask.gt` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.lanemask.le (NVVM::LaneMaskLeOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.lanemask.le` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.lanemask.lt (NVVM::LaneMaskLtOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.lanemask.lt` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.nclusterid.x (NVVM::ClusterDimXOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.nclusterid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.nclusterid.y (NVVM::ClusterDimYOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.nclusterid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.nclusterid.z (NVVM::ClusterDimZOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.nclusterid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.nctaid.x (NVVM::GridDimXOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.nctaid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.nctaid.y (NVVM::GridDimYOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.nctaid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.nctaid.z (NVVM::GridDimZOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.nctaid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.nsmid (NVVM::SmDimOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.nsmid` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.ntid.x (NVVM::BlockDimXOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.ntid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.ntid.y (NVVM::BlockDimYOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.ntid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.ntid.z (NVVM::BlockDimZOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.ntid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.nwarpid (NVVM::WarpDimOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.nwarpid` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.smid (NVVM::SmIdOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.smid` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.tid.x (NVVM::ThreadIdXOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.tid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.tid.y (NVVM::ThreadIdYOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.tid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.tid.z (NVVM::ThreadIdZOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.tid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.warpid (NVVM::WarpIdOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.warpid` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.warpsize (NVVM::WarpSizeOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.warpsize` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.redux.sync (NVVM::ReduxOp) 

Syntax:

operation ::= `nvvm.redux.sync` $kind $val `,` $mask_and_clamp  attr-dict `:` type($val) `->` type($res)

Attributes: 

AttributeMLIR TypeDescription
kind::mlir::NVVM::ReduxKindAttr
NVVM redux kind

Enum cases:

  • add (ADD)
  • and (AND)
  • max (MAX)
  • min (MIN)
  • or (OR)
  • umax (UMAX)
  • umin (UMIN)
  • xor (XOR)

Operands: 

OperandDescription
valLLVM dialect-compatible type
mask_and_clamp32-bit signless integer

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.setmaxregister (NVVM::SetMaxRegisterOp) 

Syntax:

operation ::= `nvvm.setmaxregister` $action $regCount attr-dict

Attributes: 

AttributeMLIR TypeDescription
regCount::mlir::IntegerAttr32-bit signless integer attribute
action::mlir::NVVM::SetMaxRegisterActionAttr
NVVM set max register action

Enum cases:

  • decrease (decrease)
  • increase (increase)

nvvm.shfl.sync (NVVM::ShflOp) 

NVVM Dialect Op for shfl.sync

Syntax:

operation ::= `nvvm.shfl.sync` $kind $thread_mask `,` $val `,` $offset `,` $mask_and_clamp  attr-dict
              `:` type($val) `->` type($res)

The shfl.sync Op implements data shuffle within threads of a warp. The thread_mask denotes the threads participating in the Op where the bit position corresponds to a particular thread’s laneid. The offset specifies a source lane or source lane offset (depending on kind). The val is the input value to be copied from the source. The mask_and_clamp contains two packed values specifying a mask for logically splitting warps into sub-segments and an upper bound for clamping the source lane index. [For more information, refer PTX ISA] ( https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-shfl-sync)

Attributes: 

AttributeMLIR TypeDescription
kind::mlir::NVVM::ShflKindAttr
NVVM shuffle kind

Enum cases:

  • bfly (bfly)
  • up (up)
  • down (down)
  • idx (idx)
return_value_and_is_valid::mlir::UnitAttrunit attribute

Operands: 

OperandDescription
thread_mask32-bit signless integer
valLLVM dialect-compatible type
offset32-bit signless integer
mask_and_clamp32-bit signless integer

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.stmatrix (NVVM::StMatrixOp) 

Cooperative matrix store

Syntax:

operation ::= `nvvm.stmatrix` $ptr `,` $sources attr-dict `:` type(operands)

Collectively store one or more matrices across all threads in a warp to the location indicated by the address operand $ptr in shared memory. [For more information, see PTX ISA] ( https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-store-instruction-stmatrix)

Interfaces: BasicPtxBuilderInterface

Attributes: 

AttributeMLIR TypeDescription
layout::mlir::NVVM::MMALayoutAttr
NVVM MMA layout

Enum cases:

  • row (row)
  • col (col)

Operands: 

OperandDescription
ptrLLVM pointer in address space 3
sourcesvariadic of 32-bit signless integer

nvvm.vote.ballot.sync (NVVM::VoteBallotOp) 

Operands: 

OperandDescription
maskLLVM dialect-compatible type
predLLVM dialect-compatible type

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.wgmma.commit.group.sync.aligned (NVVM::WgmmaGroupSyncAlignedOp) 

Syntax:

operation ::= `nvvm.wgmma.commit.group.sync.aligned` attr-dict

Commits all prior uncommitted warpgroup level matrix multiplication operations.

For more information, see PTX ISA

Interfaces: BasicPtxBuilderInterface

nvvm.wgmma.fence.aligned (NVVM::WgmmaFenceAlignedOp) 

Syntax:

operation ::= `nvvm.wgmma.fence.aligned` attr-dict

Enforce an ordering of register accesses between warpgroup level matrix multiplication and other operations.

For more information, see PTX ISA

Interfaces: BasicPtxBuilderInterface

nvvm.wgmma.mma_async (NVVM::WgmmaMmaAsyncOp) 

Syntax:

operation ::= `nvvm.wgmma.mma_async` $descriptorA `,` $descriptorB `,` $inouts `,` $shape `,`
              `D` `[` $typeD `,` $scaleD (`,` $satfinite^)? `]` `,`
              `A` `[` $typeA `,` $scaleA `,` $layoutA `]` `,`
              `B` `[` $typeB `,` $scaleB `,` $layoutB `]`
              attr-dict `:`
              type($inouts) `->` type($results)

The warpgroup (128 threads) level matrix multiply and accumulate operation has either of the following forms, where matrix D is called accumulator: D = A * B + D D = A * B, where the input from accumulator D is disabled.

Supported shapes:

|--------------|--------------|------------|--------------|---------------|
|              |              |            |              |f16+=e4m3*e4m3 |
|              |              |            |              |f16+=e5m2*e5m2 |
|f32+=tf32*tf32|f16+=f16 *f16 | s32+=s8*s8 |s32 += b1 * b1|f16+=e5m2*e4m3 |
|              |f32+=f16 *f16 | s32+=u8*u8 |              |f16+=e4m3*e5m2 |
|              |f32+=bf16*bf16| s32+=u8*u8 |              |f16+=e4m3*e5m2 |
|              |f32+=bf16*bf16| s32+=s8*u8 |              |f32+=e4m3*e4m3 |
|              |              | s32+=u8*s8 |              |f32+=e5m2*e5m2 |
|              |              |            |              |f32+=e4m3*e5m2 |
|              |              |            |              |f32+=e4m3*e5m2 |
|--------------|--------------|------------|--------------|---------------|
|   .m64n8k8   |  .m64n8k16   | .m64n8k32  | .m64n8k256   | .m64n8k32     |
|   .m64n16k8  |  .m64n16k16  | .m64n16k32 | .m64n16k256  | .m64n16k32    |
|   .m64n24k8  |  .m64n24k16  | .m64n24k32 | .m64n24k256  | .m64n24k32    |
|   .m64n32k8  |  .m64n32k16  | .m64n32k32 | .m64n32k256  | .m64n32k32    |
|   .m64n40k8  |  .m64n40k16  | .m64n48k32 | .m64n48k256  | .m64n40k32    |
|   .m64n48k8  |  .m64n48k16  | .m64n64k32 | .m64n64k256  | .m64n48k32    |
|   .m64n56k8  |  .m64n56k16  | .m64n80k32 | .m64n80k256  | .m64n56k32    |
|   .m64n64k8  |  .m64n64k16  | .m64n96k32 | .m64n96k256  | .m64n64k32    |
|   .m64n72k8  |  .m64n72k16  | .m64n112k32| .m64n112k256 | .m64n72k32    |
|   .m64n80k8  |  .m64n80k16  | .m64n128k32| .m64n128k256 | .m64n80k32    |
|   .m64n88k8  |  .m64n88k16  | .m64n144k32| .m64n144k256 | .m64n88k32    |
|   .m64n96k8  |  .m64n96k16  | .m64n160k32| .m64n160k256 | .m64n96k32    |
|   .m64n104k8 |  .m64n104k16 | .m64n176k32| .m64n176k256 | .m64n104k32   |
|   .m64n112k8 |  .m64n112k16 | .m64n192k32| .m64n192k256 | .m64n112k32   |
|   .m64n120k8 |  .m64n120k16 | .m64n208k32| .m64n208k256 | .m64n120k32   |
|   .m64n128k8 |  .m64n128k16 | .m64n224k32| .m64n224k256 | .m64n128k32   |
|   .m64n136k8 |  .m64n136k16 | .m64n240k32| .m64n240k256 | .m64n136k32   |
|   .m64n144k8 |  .m64n144k16 | .m64n256k32| .m64n256k256 | .m64n144k32   |
|   .m64n152k8 |  .m64n152k16 |            |              | .m64n152k32   |
|   .m64n160k8 |  .m64n160k16 |            |              | .m64n160k32   |
|   .m64n168k8 |  .m64n168k16 |            |              | .m64n168k32   |
|   .m64n176k8 |  .m64n176k16 |            |              | .m64n176k32   |
|   .m64n184k8 |  .m64n184k16 |            |              | .m64n184k32   |
|   .m64n192k8 |  .m64n192k16 |            |              | .m64n192k32   |
|   .m64n200k8 |  .m64n200k16 |            |              | .m64n200k32   |
|   .m64n208k8 |  .m64n208k16 |            |              | .m64n208k32   |
|   .m64n216k8 |  .m64n216k16 |            |              | .m64n216k32   |
|   .m64n224k8 |  .m64n224k16 |            |              | .m64n224k32   |
|   .m64n232k8 |  .m64n232k16 |            |              | .m64n232k32   |
|   .m64n240k8 |  .m64n240k16 |            |              | .m64n240k32   |
|   .m64n248k8 |  .m64n248k16 |            |              | .m64n248k32   |
|   .m64n256k8 |  .m64n256k16 |            |              | .m64n256k32   |
|--------------|--------------|------------|--------------|---------------|

For more information, see PTX ISA

Interfaces: BasicPtxBuilderInterface

Attributes: 

AttributeMLIR TypeDescription
shape::mlir::NVVM::MMAShapeAttrAttribute for MMA operation shape.
typeA::mlir::NVVM::WGMMATypesAttr
NVVM WGMMA types

Enum cases:

  • f16 (f16)
  • tf32 (tf32)
  • u8 (u8)
  • s8 (s8)
  • b1 (b1)
  • bf16 (bf16)
  • e4m3 (e4m3)
  • e5m2 (e5m2)
  • f32 (f32)
  • s32 (s32)
typeB::mlir::NVVM::WGMMATypesAttr
NVVM WGMMA types

Enum cases:

  • f16 (f16)
  • tf32 (tf32)
  • u8 (u8)
  • s8 (s8)
  • b1 (b1)
  • bf16 (bf16)
  • e4m3 (e4m3)
  • e5m2 (e5m2)
  • f32 (f32)
  • s32 (s32)
typeD::mlir::NVVM::WGMMATypesAttr
NVVM WGMMA types

Enum cases:

  • f16 (f16)
  • tf32 (tf32)
  • u8 (u8)
  • s8 (s8)
  • b1 (b1)
  • bf16 (bf16)
  • e4m3 (e4m3)
  • e5m2 (e5m2)
  • f32 (f32)
  • s32 (s32)
scaleD::mlir::NVVM::WGMMAScaleOutAttr
WGMMA input predicate

Enum cases:

  • zero (zero)
  • one (one)
scaleA::mlir::NVVM::WGMMAScaleInAttr
WGMMA overflow options

Enum cases:

  • one (one)
  • neg (neg)
scaleB::mlir::NVVM::WGMMAScaleInAttr
WGMMA overflow options

Enum cases:

  • one (one)
  • neg (neg)
layoutA::mlir::NVVM::MMALayoutAttr
NVVM MMA layout

Enum cases:

  • row (row)
  • col (col)
layoutB::mlir::NVVM::MMALayoutAttr
NVVM MMA layout

Enum cases:

  • row (row)
  • col (col)
satfinite::mlir::NVVM::MMAIntOverflowAttr
MMA overflow options

Enum cases:

  • satfinite (satfinite)
  • wrapped (wrapped)

Operands: 

OperandDescription
inoutsLLVM structure type
descriptorA64-bit signless integer
descriptorB64-bit signless integer

Results: 

ResultDescription
resultsLLVM structure type

nvvm.wgmma.wait.group.sync.aligned (NVVM::WgmmaWaitGroupSyncOp) 

Syntax:

operation ::= `nvvm.wgmma.wait.group.sync.aligned` attr-dict $group

Signal the completion of a preceding warpgroup operation.

For more information, see PTX ISA

Interfaces: BasicPtxBuilderInterface

Attributes: 

AttributeMLIR TypeDescription
group::mlir::IntegerAttr32-bit signless integer attribute

nvvm.wmma.load (NVVM::WMMALoadOp) 

Warp synchronous matrix load

Syntax:

operation ::= `nvvm.wmma.load` $ptr `,` $stride attr-dict `:` functional-type($ptr, $res)

Attributes: 

AttributeMLIR TypeDescription
m::mlir::IntegerAttr32-bit signless integer attribute
n::mlir::IntegerAttr32-bit signless integer attribute
k::mlir::IntegerAttr32-bit signless integer attribute
layout::mlir::NVVM::MMALayoutAttr
NVVM MMA layout

Enum cases:

  • row (row)
  • col (col)
eltype::mlir::NVVM::MMATypesAttr
NVVM MMA types

Enum cases:

  • f16 (f16)
  • f32 (f32)
  • tf32 (tf32)
  • bf16 (bf16)
  • s8 (s8)
  • u8 (u8)
  • s32 (s32)
  • s4 (s4)
  • u4 (u4)
  • b1 (b1)
  • f64 (f64)
frag::mlir::NVVM::MMAFragAttr
NVVM MMA frag type

Enum cases:

  • a (a)
  • b (b)
  • c (c)

Operands: 

OperandDescription
ptrLLVM pointer type
stride32-bit signless integer

Results: 

ResultDescription
resLLVM structure type

nvvm.wmma.mma (NVVM::WMMAMmaOp) 

Warp synchronous matrix-multiply accumulate using tensor cores.

Syntax:

operation ::= `nvvm.wmma.mma` $args attr-dict `:` functional-type($args, $res)

Attributes: 

AttributeMLIR TypeDescription
m::mlir::IntegerAttr32-bit signless integer attribute
n::mlir::IntegerAttr32-bit signless integer attribute
k::mlir::IntegerAttr32-bit signless integer attribute
layoutA::mlir::NVVM::MMALayoutAttr
NVVM MMA layout

Enum cases:

  • row (row)
  • col (col)
layoutB::mlir::NVVM::MMALayoutAttr
NVVM MMA layout

Enum cases:

  • row (row)
  • col (col)
eltypeA::mlir::NVVM::MMATypesAttr
NVVM MMA types

Enum cases:

  • f16 (f16)
  • f32 (f32)
  • tf32 (tf32)
  • bf16 (bf16)
  • s8 (s8)
  • u8 (u8)
  • s32 (s32)
  • s4 (s4)
  • u4 (u4)
  • b1 (b1)
  • f64 (f64)
eltypeB::mlir::NVVM::MMATypesAttr
NVVM MMA types

Enum cases:

  • f16 (f16)
  • f32 (f32)
  • tf32 (tf32)
  • bf16 (bf16)
  • s8 (s8)
  • u8 (u8)
  • s32 (s32)
  • s4 (s4)
  • u4 (u4)
  • b1 (b1)
  • f64 (f64)

Operands: 

OperandDescription
argsvariadic of LLVM dialect-compatible type

Results: 

ResultDescription
resLLVM structure type

nvvm.wmma.store (NVVM::WMMAStoreOp) 

Warp synchronous matrix store

Syntax:

operation ::= `nvvm.wmma.store` $ptr `,` $stride `,` $args attr-dict `:` qualified(type($ptr)) `,`
              type($args)

Attributes: 

AttributeMLIR TypeDescription
m::mlir::IntegerAttr32-bit signless integer attribute
n::mlir::IntegerAttr32-bit signless integer attribute
k::mlir::IntegerAttr32-bit signless integer attribute
layout::mlir::NVVM::MMALayoutAttr
NVVM MMA layout

Enum cases:

  • row (row)
  • col (col)
eltype::mlir::NVVM::MMATypesAttr
NVVM MMA types

Enum cases:

  • f16 (f16)
  • f32 (f32)
  • tf32 (tf32)
  • bf16 (bf16)
  • s8 (s8)
  • u8 (u8)
  • s32 (s32)
  • s4 (s4)
  • u4 (u4)
  • b1 (b1)
  • f64 (f64)

Operands: 

OperandDescription
ptrLLVM pointer type
argsvariadic of LLVM dialect-compatible type
stride32-bit signless integer

Attributes 

LoadCacheModifierKindAttr 

NVVM load cache modifier kind

Syntax:

#nvvm.load_cache_modifier<
  ::mlir::NVVM::LoadCacheModifierKind   # value
>

Enum attribute of the different kinds of cache operators for load instructions.

For more information, see PTX ISA

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::LoadCacheModifierKindan enum of type LoadCacheModifierKind

MMAB1OpAttr 

MMA binary operations

Syntax:

#nvvm.mma_b1op<
  ::mlir::NVVM::MMAB1Op   # value
>

Enum cases:

  • none (none)
  • xor_popc (xor_popc)
  • and_popc (and_popc)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::MMAB1Opan enum of type MMAB1Op

MMAFragAttr 

NVVM MMA frag type

Syntax:

#nvvm.mma_frag<
  ::mlir::NVVM::MMAFrag   # value
>

Enum cases:

  • a (a)
  • b (b)
  • c (c)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::MMAFragan enum of type MMAFrag

MMAIntOverflowAttr 

MMA overflow options

Syntax:

#nvvm.mma_int_overflow<
  ::mlir::NVVM::MMAIntOverflow   # value
>

Enum cases:

  • satfinite (satfinite)
  • wrapped (wrapped)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::MMAIntOverflowan enum of type MMAIntOverflow

MMALayoutAttr 

NVVM MMA layout

Syntax:

#nvvm.mma_layout<
  ::mlir::NVVM::MMALayout   # value
>

Enum cases:

  • row (row)
  • col (col)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::MMALayoutan enum of type MMALayout

MMATypesAttr 

NVVM MMA types

Syntax:

#nvvm.mma_type<
  ::mlir::NVVM::MMATypes   # value
>

Enum cases:

  • f16 (f16)
  • f32 (f32)
  • tf32 (tf32)
  • bf16 (bf16)
  • s8 (s8)
  • u8 (u8)
  • s32 (s32)
  • s4 (s4)
  • u4 (u4)
  • b1 (b1)
  • f64 (f64)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::MMATypesan enum of type MMATypes

MemScopeKindAttr 

NVVM Memory Scope kind

Syntax:

#nvvm.mem_scope<
  ::mlir::NVVM::MemScopeKind   # value
>

Enum cases:

  • cta (CTA)
  • cluster (CLUSTER)
  • gpu (GPU)
  • sys (SYS)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::MemScopeKindan enum of type MemScopeKind

MMAShapeAttr 

Attribute for MMA operation shape.

Syntax:

#nvvm.shape<
  int,   # m
  int,   # n
  int   # k
>

Parameters: 

ParameterC++ typeDescription
mint
nint
kint

NVVMTargetAttr 

Syntax:

#nvvm.target<
  int,   # O
  ::llvm::StringRef,   # triple
  ::llvm::StringRef,   # chip
  ::llvm::StringRef,   # features
  DictionaryAttr,   # flags
  ArrayAttr   # link
>

GPU target attribute for controlling compilation of NVIDIA targets. All parameters decay into default values if not present.

Examples:

  1. Target with default values.
  gpu.module @mymodule [#nvvm.target] attributes {...} {
    ...
  }
  1. Target with sm_90 chip and fast math.
  gpu.module @mymodule [#nvvm.target<chip = "sm_90", flags = {fast}>] {
    ...
  }

Parameters: 

ParameterC++ typeDescription
OintOptimization level to apply.
triple::llvm::StringRefTarget triple.
chip::llvm::StringRefTarget chip.
features::llvm::StringRefTarget chip features.
flagsDictionaryAttrTarget specific flags.
linkArrayAttrFiles to link to the LLVM module.

ProxyKindAttr 

Proxy kind

Syntax:

#nvvm.proxy_kind<
  ::mlir::NVVM::ProxyKind   # value
>

Enum cases:

  • alias (alias)
  • async (async)
  • async.global (async_global)
  • async.shared (async_shared)
  • tensormap (TENSORMAP)
  • generic (GENERIC)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::ProxyKindan enum of type ProxyKind

ReduxKindAttr 

NVVM redux kind

Syntax:

#nvvm.redux_kind<
  ::mlir::NVVM::ReduxKind   # value
>

Enum cases:

  • add (ADD)
  • and (AND)
  • max (MAX)
  • min (MIN)
  • or (OR)
  • umax (UMAX)
  • umin (UMIN)
  • xor (XOR)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::ReduxKindan enum of type ReduxKind

SetMaxRegisterActionAttr 

NVVM set max register action

Syntax:

#nvvm.action<
  ::mlir::NVVM::SetMaxRegisterAction   # value
>

Enum cases:

  • decrease (decrease)
  • increase (increase)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::SetMaxRegisterActionan enum of type SetMaxRegisterAction

SharedSpaceAttr 

Shared memory space

Syntax:

#nvvm.shared_space<
  ::mlir::NVVM::SharedSpace   # value
>

Enum cases:

  • cta (shared_cta)
  • cluster (shared_cluster)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::SharedSpacean enum of type SharedSpace

ShflKindAttr 

NVVM shuffle kind

Syntax:

#nvvm.shfl_kind<
  ::mlir::NVVM::ShflKind   # value
>

Enum cases:

  • bfly (bfly)
  • up (up)
  • down (down)
  • idx (idx)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::ShflKindan enum of type ShflKind

TMAReduxKindAttr 

NVVM TMA redux kind

Syntax:

#nvvm.tma_redux_kind<
  ::mlir::NVVM::TMAReduxKind   # value
>

Enum cases:

  • add (ADD)
  • max (MAX)
  • min (MIN)
  • inc (INC)
  • dec (DEC)
  • and (AND)
  • or (OR)
  • xor (XOR)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::TMAReduxKindan enum of type TMAReduxKind

TMAStoreModeAttr 

NVVM TMA Store Mode

Syntax:

#nvvm.tma_store_mode<
  ::mlir::NVVM::TMAStoreMode   # value
>

Enum cases:

  • tile (TILE)
  • im2col (IM2COL)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::TMAStoreModean enum of type TMAStoreMode

WGMMAScaleInAttr 

WGMMA overflow options

Syntax:

#nvvm.wgmma_scale_in<
  ::mlir::NVVM::WGMMAScaleIn   # value
>

Enum cases:

  • one (one)
  • neg (neg)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::WGMMAScaleInan enum of type WGMMAScaleIn

WGMMAScaleOutAttr 

WGMMA input predicate

Syntax:

#nvvm.wgmma_scale_out<
  ::mlir::NVVM::WGMMAScaleOut   # value
>

Enum cases:

  • zero (zero)
  • one (one)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::WGMMAScaleOutan enum of type WGMMAScaleOut

WGMMATypesAttr 

NVVM WGMMA types

Syntax:

#nvvm.wgmma_type<
  ::mlir::NVVM::WGMMATypes   # value
>

Enum cases:

  • f16 (f16)
  • tf32 (tf32)
  • u8 (u8)
  • s8 (s8)
  • b1 (b1)
  • bf16 (bf16)
  • e4m3 (e4m3)
  • e5m2 (e5m2)
  • f32 (f32)
  • s32 (s32)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::WGMMATypesan enum of type WGMMATypes

Enums 

LoadCacheModifierKind 

NVVM load cache modifier kind

Cases: 

SymbolValueString
CA0ca
CG1cg
CS2cs
LU3lu
CV4cv

MMAB1Op 

MMA binary operations

Cases: 

SymbolValueString
none0none
xor_popc1xor_popc
and_popc2and_popc

MMAFrag 

NVVM MMA frag type

Cases: 

SymbolValueString
a0a
b1b
c2c

MMAIntOverflow 

MMA overflow options

Cases: 

SymbolValueString
satfinite1satfinite
wrapped0wrapped

MMALayout 

NVVM MMA layout

Cases: 

SymbolValueString
row0row
col1col

MMATypes 

NVVM MMA types

Cases: 

SymbolValueString
f160f16
f321f32
tf322tf32
bf169bf16
s84s8
u83u8
s325s32
s48s4
u47u4
b16b1
f6410f64

MemScopeKind 

NVVM Memory Scope kind

Cases: 

SymbolValueString
CTA0cta
CLUSTER1cluster
GPU2gpu
SYS3sys

ProxyKind 

Proxy kind

Cases: 

SymbolValueString
alias0alias
async1async
async_global2async.global
async_shared3async.shared
TENSORMAP4tensormap
GENERIC5generic

ReduxKind 

NVVM redux kind

Cases: 

SymbolValueString
ADD1add
AND2and
MAX3max
MIN4min
OR5or
UMAX6umax
UMIN7umin
XOR8xor

SetMaxRegisterAction 

NVVM set max register action

Cases: 

SymbolValueString
decrease1decrease
increase0increase

SharedSpace 

Shared memory space

Cases: 

SymbolValueString
shared_cta0cta
shared_cluster1cluster

ShflKind 

NVVM shuffle kind

Cases: 

SymbolValueString
bfly0bfly
up1up
down2down
idx3idx

TMAReduxKind 

NVVM TMA redux kind

Cases: 

SymbolValueString
ADD0add
MAX2max
MIN1min
INC3inc
DEC4dec
AND5and
OR6or
XOR7xor

TMAStoreMode 

NVVM TMA Store Mode

Cases: 

SymbolValueString
TILE0tile
IM2COL1im2col

WGMMAScaleIn 

WGMMA overflow options

Cases: 

SymbolValueString
one1one
neg-1neg

WGMMAScaleOut 

WGMMA input predicate

Cases: 

SymbolValueString
zero0zero
one1one

WGMMATypes 

NVVM WGMMA types

Cases: 

SymbolValueString
f160f16
tf321tf32
u82u8
s83s8
b14b1
bf165bf16
e4m36e4m3
e5m27e5m2
f328f32
s329s32