MLIR

Multi-Level IR Compiler Framework

'nvvm' Dialect

Operations 

source

nvvm.bar.warp.sync (NVVM::SyncWarpOp) 

Syntax:

operation ::= `nvvm.bar.warp.sync` $mask attr-dict `:` type($mask)

Operands: 

OperandDescription
maskLLVM dialect-compatible type

nvvm.barrier (NVVM::BarrierOp) 

Syntax:

operation ::= `nvvm.barrier` (`id` `=` $barrierId^)? (`number_of_threads` `=` $numberOfThreads^)? attr-dict

Traits: AttrSizedOperandSegments

Operands: 

OperandDescription
barrierId32-bit signless integer
numberOfThreads32-bit signless integer

nvvm.barrier.arrive (NVVM::BarrierArriveOp) 

Syntax:

operation ::= `nvvm.barrier.arrive` (`id` `=` $barrierId^)? `number_of_threads` `=` $numberOfThreads attr-dict

Thread that executes this op announces their arrival at the barrier with given id and continue their execution.

The default barrier id is 0 that is similar to nvvm.barrier Op. When barrierId is not present, the default barrier id is used.

For more information, see PTX ISA

Interfaces: BasicPtxBuilderInterface

Operands: 

OperandDescription
barrierId32-bit signless integer
numberOfThreads32-bit signless integer

nvvm.barrier0 (NVVM::Barrier0Op) 

Syntax:

operation ::= `nvvm.barrier0` attr-dict

nvvm.breakpoint (NVVM::Breakpoint) 

Breakpoint Op

Syntax:

operation ::= `nvvm.breakpoint` attr-dict

Breakpoint suspends execution of the program for debugging. For more information, see PTX ISA

nvvm.cluster.arrive (NVVM::ClusterArriveOp) 

Cluster Barrier Arrive Op

Syntax:

operation ::= `nvvm.cluster.arrive` attr-dict

The cluster.arrive can be used by the threads within the cluster for synchronization and communication. The cluster.arrive instruction marks the warps’ arrival at the barrier without causing the executing thread to wait for other participating threads.

The aligned attribute, when provided, generates the .aligned version of the PTX instruction.

For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
aligned::mlir::UnitAttrunit attribute

nvvm.cluster.arrive.relaxed (NVVM::ClusterArriveRelaxedOp) 

Cluster Barrier Relaxed Arrive Op

Syntax:

operation ::= `nvvm.cluster.arrive.relaxed` attr-dict

The cluster.arrive can be used by the threads within the cluster for synchronization and communication. The cluster.arrive instruction marks the warps’ arrival at the barrier without causing the executing thread to wait for other participating threads.

The aligned attribute, when provided, generates the .aligned version of the PTX instruction. The .relaxed qualifier on cluster.arrive specifies that there are no memory ordering and visibility guarantees provided for the memory accesses performed prior to cluster.arrive.

For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
aligned::mlir::UnitAttrunit attribute

nvvm.cluster.wait (NVVM::ClusterWaitOp) 

Cluster Barrier Wait Op

Syntax:

operation ::= `nvvm.cluster.wait` attr-dict

The cluster.wait causes the executing thread to wait for all non-exited threads of the cluster to perform cluster.arrive. The aligned attribute, when provided, generates the .aligned version of the PTX instruction.

For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
aligned::mlir::UnitAttrunit attribute

nvvm.cp.async.bulk.commit.group (NVVM::CpAsyncBulkCommitGroupOp) 

Syntax:

operation ::= `nvvm.cp.async.bulk.commit.group` attr-dict

This Op commits all prior initiated but uncommitted cp.async.bulk instructions into a cp.async.bulk-group.

For more information, see PTX ISA

nvvm.cp.async.bulk.global.shared.cta (NVVM::CpAsyncBulkSharedCTAToGlobalOp) 

Async bulk copy from Shared CTA memory to Global memory

Syntax:

operation ::= `nvvm.cp.async.bulk.global.shared.cta` $dstMem `,` $srcMem `,` $size
              (`l2_cache_hint` `=` $l2CacheHint^ )?
              attr-dict  `:` type($dstMem) `,` type($srcMem)

Initiates an asynchronous copy operation from Shared CTA memory to global memory.

The l2CacheHint operand is optional, and it is used to specify cache eviction policy that may be used during the memory access.

For more information, see PTX ISA

Operands: 

OperandDescription
dstMemLLVM pointer in address space 1
srcMemLLVM pointer in address space 3
size32-bit signless integer
l2CacheHint64-bit signless integer

nvvm.cp.async.bulk.shared.cluster.global (NVVM::CpAsyncBulkGlobalToSharedClusterOp) 

Async bulk copy from global memory to Shared cluster memory

Syntax:

operation ::= `nvvm.cp.async.bulk.shared.cluster.global` $dstMem `,` $srcMem `,` $mbar `,` $size
              (`multicast_mask` `=` $multicastMask^ )?
              (`l2_cache_hint` `=` $l2CacheHint^ )?
              attr-dict  `:` type($dstMem) `,` type($srcMem)

Initiates an asynchronous copy operation from global memory to cluster’s shared memory.

The multicastMask operand is optional. When it is present, the Op copies data from global memory to shared memory of multiple CTAs in the cluster. Operand multicastMask specifies the destination CTAs in the cluster such that each bit position in the 16-bit multicastMask operand corresponds to the nvvm.read.ptx.sreg.ctaid of the destination CTA.

The l2CacheHint operand is optional, and it is used to specify cache eviction policy that may be used during the memory access.

For more information, see PTX ISA

Traits: AttrSizedOperandSegments

Operands: 

OperandDescription
dstMemLLVM pointer in address space 3
srcMemLLVM pointer in address space 1
mbarLLVM pointer in address space 3
size32-bit signless integer
multicastMask16-bit signless integer
l2CacheHint64-bit signless integer

nvvm.cp.async.bulk.shared.cluster.shared.cta (NVVM::CpAsyncBulkSharedCTAToSharedClusterOp) 

Async bulk copy from Shared CTA memory to Shared cluster memory

Syntax:

operation ::= `nvvm.cp.async.bulk.shared.cluster.shared.cta` $dstMem `,` $srcMem `,` $mbar `,` $size
              attr-dict  `:` type($dstMem) `,` type($srcMem)

Initiates an asynchronous copy operation from Shared CTA memory to Shared cluster memory.

For more information, see PTX ISA

Operands: 

OperandDescription
dstMemLLVM pointer in address space 3
srcMemLLVM pointer in address space 3
mbarLLVM pointer in address space 3
size32-bit signless integer

nvvm.cp.async.bulk.tensor.global.shared.cta (NVVM::CpAsyncBulkTensorSharedCTAToGlobalOp) 

Syntax:

operation ::= `nvvm.cp.async.bulk.tensor.global.shared.cta` $tmaDescriptor `,`
              $srcMem `,`
              `box` `[`$coordinates `]`
              (`,` `predicate` `=` $predicate^)?
              attr-dict  `:` type(operands)

Traits: AttrSizedOperandSegments

Interfaces: BasicPtxBuilderInterface

Operands: 

OperandDescription
tmaDescriptorLLVM pointer type
srcMemLLVM pointer in address space 3
coordinatesvariadic of 32-bit signless integer
predicate1-bit signless integer

nvvm.cp.async.bulk.tensor.prefetch (NVVM::CpAsyncBulkTensorPrefetchOp) 

Syntax:

operation ::= `nvvm.cp.async.bulk.tensor.prefetch` $tmaDescriptor `,`
              `box` `[`$coordinates `]`
              (`im2col` `[` $im2colOffsets^ `]` )?
              (`l2_cache_hint` `=` $l2CacheHint^ )?
              attr-dict  `:` type($tmaDescriptor)

Initiates an asynchronous prefetch operation on the tensor data from global memory to L2 cache.

The Op has two modes:

  1. Tiled Mode: It’s the default mode. The source multi-dimensional tensor layout is preserved at the destination.

  2. Im2col Mode: This mode is used when im2colOffsets operands are present. the elements in the Bounding Box of the source tensor are rearranged into columns at the destination. In this mode, the tensor has to be at least 3-dimensional.

The l2CacheHint operand is optional, and it is used to specify cache eviction policy that may be used during the memory access.

For more information, see PTX ISA

Traits: AttrSizedOperandSegments

Operands: 

OperandDescription
tmaDescriptorLLVM pointer type
coordinatesvariadic of 32-bit signless integer
im2colOffsetsvariadic of 16-bit signless integer
l2CacheHint64-bit signless integer

nvvm.cp.async.bulk.tensor.reduce (NVVM::CpAsyncBulkTensorReduceOp) 

Syntax:

operation ::= `nvvm.cp.async.bulk.tensor.reduce` $tmaDescriptor `,`
              $srcMem `,`
              `box` `[`$coordinates `]`
              (`l2_cache_hint` `=` $l2CacheHint^ )?
              attr-dict  `:` type($tmaDescriptor) `,` type($srcMem)

Initiates an asynchronous reduction operation of tensor data in global memory with tensor data in shared memory.

The mode attribute indicates whether the copy mode is tile or im2col. The redOp attribute specifies the reduction operations applied. The supported reduction operations are: {add, min, max, inc, dec, and, or, xor}

The l2CacheHint operand is optional, and it is used to specify cache eviction policy that may be used during the memory access.

For more information, see PTX ISA

Traits: AttrSizedOperandSegments

Attributes: 

AttributeMLIR TypeDescription
redKind::mlir::NVVM::TMAReduxKindAttr
NVVM TMA redux kind

Enum cases:

  • add (ADD)
  • max (MAX)
  • min (MIN)
  • inc (INC)
  • dec (DEC)
  • and (AND)
  • or (OR)
  • xor (XOR)
mode::mlir::NVVM::TMAStoreModeAttr
NVVM TMA Store Mode

Enum cases:

  • tile (TILE)
  • im2col (IM2COL)

Operands: 

OperandDescription
tmaDescriptorLLVM pointer type
srcMemLLVM pointer in address space 3
coordinatesvariadic of 32-bit signless integer
l2CacheHint64-bit signless integer

nvvm.cp.async.bulk.tensor.shared.cluster.global (NVVM::CpAsyncBulkTensorGlobalToSharedClusterOp) 

Syntax:

operation ::= `nvvm.cp.async.bulk.tensor.shared.cluster.global` $dstMem `,`
              $tmaDescriptor `,`
              $mbar `,`
              `box` `[`$coordinates `]`
              (`im2col` `[` $im2colOffsets^ `]` )?
              (`multicast_mask` `=` $multicastMask^ )?
              (`l2_cache_hint` `=` $l2CacheHint^ )?
              (`predicate` `=` $predicate^)?
              attr-dict  `:` type($dstMem) `,` type($tmaDescriptor)

Initiates an asynchronous copy operation on the tensor data from global memory to shared memory.

The Op operates has two load modes:

  1. Tiled Mode: It’s the default mode. The source multi-dimensional tensor layout is preserved at the destination.

  2. Im2col Mode: This mode is used when im2colOffsets operands are present. the elements in the Bounding Box of the source tensor are rearranged into columns at the destination. In this mode, the tensor has to be at least 3-dimensional.

The multicastMask operand is optional. When it is present, the Op copies data from global memory to shared memory of multiple CTAs in the cluster. Operand multicastMask specifies the destination CTAs in the cluster such that each bit position in the 16-bit multicastMask operand corresponds to the nvvm.read.ptx.sreg.ctaid of the destination CTA.

The l2CacheHint operand is optional, and it is used to specify cache eviction policy that may be used during the memory access.

For more information, see PTX ISA

Traits: AttrSizedOperandSegments

Interfaces: BasicPtxBuilderInterface

Operands: 

OperandDescription
dstMemLLVM pointer in address space 3
tmaDescriptorLLVM pointer type
coordinatesvariadic of 32-bit signless integer
mbarLLVM pointer in address space 3
im2colOffsetsvariadic of 16-bit signless integer
multicastMask16-bit signless integer
l2CacheHint64-bit signless integer
predicate1-bit signless integer

nvvm.cp.async.bulk.wait_group (NVVM::CpAsyncBulkWaitGroupOp) 

Syntax:

operation ::= `nvvm.cp.async.bulk.wait_group` $group attr-dict

Op waits for completion of the most recent bulk async-groups.

The $group operand tells waiting has to be done until for $group or fewer of the most recent bulk async-groups. If $group is 0, the op wait until all the most recent bulk async-groups have completed.

The $read indicates that the waiting has to be done until all the bulk async operations in the specified bulk async-group have completed reading from their source locations.

For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
group::mlir::IntegerAttr32-bit signless integer attribute whose minimum value is 0
read::mlir::UnitAttrunit attribute

nvvm.cp.async.commit.group (NVVM::CpAsyncCommitGroupOp) 

Syntax:

operation ::= `nvvm.cp.async.commit.group` attr-dict

nvvm.cp.async.mbarrier.arrive (NVVM::CpAsyncMBarrierArriveOp) 

NVVM Dialect Op for cp.async.mbarrier.arrive

Syntax:

operation ::= `nvvm.cp.async.mbarrier.arrive` $addr attr-dict `:` type(operands)

The cp.async.mbarrier.arrive Op makes the mbarrier object track all prior cp.async operations initiated by the executing thread. The addr operand specifies the address of the mbarrier object in generic address space. The noinc attr impacts how the mbarrier’s state is updated.

For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
noinc::mlir::IntegerAttr1-bit signless integer attribute

Operands: 

OperandDescription
addrLLVM pointer type

nvvm.cp.async.mbarrier.arrive.shared (NVVM::CpAsyncMBarrierArriveSharedOp) 

NVVM Dialect Op for cp.async.mbarrier.arrive.shared

Syntax:

operation ::= `nvvm.cp.async.mbarrier.arrive.shared` $addr attr-dict `:` type(operands)

The cp.async.mbarrier.arrive.shared Op makes the mbarrier object track all prior cp.async operations initiated by the executing thread. The addr operand specifies the address of the mbarrier object in shared memory. The noinc attr impacts how the mbarrier’s state is updated.

For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
noinc::mlir::IntegerAttr1-bit signless integer attribute

Operands: 

OperandDescription
addrLLVM pointer in address space 3

nvvm.cp.async.shared.global (NVVM::CpAsyncOp) 

Syntax:

operation ::= `nvvm.cp.async.shared.global` $dst `,` $src `,` $size `,` `cache` `=` $modifier (`,` $cpSize^)? attr-dict `:` type(operands)

Attributes: 

AttributeMLIR TypeDescription
size::mlir::IntegerAttr32-bit signless integer attribute
modifier::mlir::NVVM::LoadCacheModifierKindAttr
NVVM load cache modifier kind
Enum attribute of the different kinds of cache operators for load instructions.

For more information, see PTX ISA

Operands: 

OperandDescription
dstLLVM pointer in address space 3
srcLLVM pointer in address space 1
cpSizeLLVM dialect-compatible type

nvvm.cp.async.wait.group (NVVM::CpAsyncWaitGroupOp) 

Syntax:

operation ::= `nvvm.cp.async.wait.group` $n attr-dict

Attributes: 

AttributeMLIR TypeDescription
n::mlir::IntegerAttr32-bit signless integer attribute

nvvm.cvt.float.to.tf32 (NVVM::CvtFloatToTF32Op) 

Convert the given float input to TF32

Syntax:

operation ::= `nvvm.cvt.float.to.tf32` $src attr-dict

This Op converts the given f32 input to tf32. The result res is represented as an i32 type. The relu attribute, when set, lowers to the ‘.relu’ variant of the cvt instruction. The rnd and sat attributes specify the the rounding and saturation modes respectively.

For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
rnd::mlir::NVVM::FPRoundingModeAttr
NVVM FPRoundingMode kind

Enum cases:

  • none (NONE)
  • rn (RN)
  • rm (RM)
  • rp (RP)
  • rz (RZ)
  • rna (RNA)
sat::mlir::NVVM::SaturationModeAttr
NVVM SaturationMode kind

Enum cases:

  • none (NONE)
  • satfinite (SATFINITE)
relu::mlir::BoolAttrbool attribute

Operands: 

OperandDescription
src32-bit float

Results: 

ResultDescription
res32-bit signless integer

nvvm.elect.sync (NVVM::ElectSyncOp) 

Elect one leader thread

Syntax:

operation ::= `nvvm.elect.sync` attr-dict `->` type(results)

The elect.sync instruction elects one predicated active leader thread from among a set of threads specified in membermask. The membermask is set to 0xFFFFFFFF for the current version of this Op. The predicate result is set to True for the leader thread, and False for all other threads.

For more information, see PTX ISA

Results: 

ResultDescription
pred1-bit signless integer

nvvm.exit (NVVM::Exit) 

Exit Op

Syntax:

operation ::= `nvvm.exit` attr-dict

Ends execution of a thread. For more information, see PTX ISA

nvvm.fence.mbarrier.init (NVVM::FenceMbarrierInitOp) 

Syntax:

operation ::= `nvvm.fence.mbarrier.init` attr-dict

Fence operation that applies on the prior nvvm.mbarrier.init

For more information, see PTX ISA

Interfaces: BasicPtxBuilderInterface

nvvm.fence.proxy (NVVM::FenceProxyOp) 

Syntax:

operation ::= `nvvm.fence.proxy` attr-dict

Fence operation with proxy to establish an ordering between memory accesses that may happen through different proxies.

For more information, see PTX ISA

Interfaces: BasicPtxBuilderInterface

Attributes: 

AttributeMLIR TypeDescription
kind::mlir::NVVM::ProxyKindAttr
Proxy kind

Enum cases:

  • alias (alias)
  • async (async)
  • async.global (async_global)
  • async.shared (async_shared)
  • tensormap (TENSORMAP)
  • generic (GENERIC)
space::mlir::NVVM::SharedSpaceAttr
Shared memory space

Enum cases:

  • cta (shared_cta)
  • cluster (shared_cluster)

nvvm.fence.proxy.acquire (NVVM::FenceProxyAcquireOp) 

Uni-directional proxy fence operation with acquire semantics

Syntax:

operation ::= `nvvm.fence.proxy.acquire` $scope $addr `,` $size (`from_proxy` `=` $fromProxy^)? (`to_proxy` `=` $toProxy^)? attr-dict

fence.proxy.acquire is a uni-directional fence used to establish ordering between a prior memory access performed via the generic proxy and a subsequent memory access performed via the tensormap proxy

The address operand addr and the operand size together specify the memory range [addr, addr+size) on which the ordering guarantees on the memory accesses across the proxies is to be provided. The only supported value for the size operand is 128 and must be an immediate. Generic Addressing is used unconditionally, and the address specified by the operand addr must fall within the .global state space. Otherwise, the behavior is undefined

For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
scope::mlir::NVVM::MemScopeKindAttr
NVVM Memory Scope kind

Enum cases:

  • cta (CTA)
  • cluster (CLUSTER)
  • gpu (GPU)
  • sys (SYS)
fromProxy::mlir::NVVM::ProxyKindAttr
Proxy kind

Enum cases:

  • alias (alias)
  • async (async)
  • async.global (async_global)
  • async.shared (async_shared)
  • tensormap (TENSORMAP)
  • generic (GENERIC)
toProxy::mlir::NVVM::ProxyKindAttr
Proxy kind

Enum cases:

  • alias (alias)
  • async (async)
  • async.global (async_global)
  • async.shared (async_shared)
  • tensormap (TENSORMAP)
  • generic (GENERIC)

Operands: 

OperandDescription
addrLLVM pointer in address space 0
size32-bit signless integer

nvvm.fence.proxy.release (NVVM::FenceProxyReleaseOp) 

Uni-directional proxy fence operation with release semantics

Syntax:

operation ::= `nvvm.fence.proxy.release` $scope (`from_proxy` `=` $fromProxy^)? (`to_proxy` `=` $toProxy^)? attr-dict

fence.proxy.release is a uni-directional fence used to establish ordering between a prior memory access performed via the generic proxy and a subsequent memory access performed via the tensormap proxy. fence.proxy.release operation can form a release sequence that synchronizes with an acquire sequence that contains the fence.proxy.acquire proxy fence operation

For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
scope::mlir::NVVM::MemScopeKindAttr
NVVM Memory Scope kind

Enum cases:

  • cta (CTA)
  • cluster (CLUSTER)
  • gpu (GPU)
  • sys (SYS)
fromProxy::mlir::NVVM::ProxyKindAttr
Proxy kind

Enum cases:

  • alias (alias)
  • async (async)
  • async.global (async_global)
  • async.shared (async_shared)
  • tensormap (TENSORMAP)
  • generic (GENERIC)
toProxy::mlir::NVVM::ProxyKindAttr
Proxy kind

Enum cases:

  • alias (alias)
  • async (async)
  • async.global (async_global)
  • async.shared (async_shared)
  • tensormap (TENSORMAP)
  • generic (GENERIC)

nvvm.fence.sc.cluster (NVVM::FenceScClusterOp) 

Syntax:

operation ::= `nvvm.fence.sc.cluster` attr-dict

nvvm.griddepcontrol.launch.dependents (NVVM::GriddepcontrolLaunchDependentsOp) 

Syntax:

operation ::= `nvvm.griddepcontrol.launch.dependents` attr-dict

Signals that specific dependents the runtime system designated to react to this instruction can be scheduled as soon as all other CTAs in the grid issue the same instruction or have completed.

For more information, see PTX ISA

nvvm.griddepcontrol.wait (NVVM::GriddepcontrolWaitOp) 

Syntax:

operation ::= `nvvm.griddepcontrol.wait` attr-dict

Causes the executing thread to wait until all prerequisite grids in flight have completed and all the memory operations from the prerequisite grids are performed and made visible to the current grid.

For more information, see PTX ISA

nvvm.ldmatrix (NVVM::LdMatrixOp) 

Cooperative matrix load

Syntax:

operation ::= `nvvm.ldmatrix` $ptr attr-dict `:` functional-type($ptr, $res)

Attributes: 

AttributeMLIR TypeDescription
num::mlir::IntegerAttr32-bit signless integer attribute
layout::mlir::NVVM::MMALayoutAttr
NVVM MMA layout

Enum cases:

  • row (row)
  • col (col)

Operands: 

OperandDescription
ptrLLVM pointer type

Results: 

ResultDescription
resany type

nvvm.mapa (NVVM::MapaOp) 

Syntax:

operation ::= `nvvm.mapa` $a`,` $b attr-dict `:` type($a) `->` type($res)

Operands: 

OperandDescription
aLLVM pointer in address space 0 or LLVM pointer in address space 3
b32-bit signless integer

Results: 

ResultDescription
resLLVM pointer in address space 0 or LLVM pointer in address space 3

nvvm.match.sync (NVVM::MatchSyncOp) 

Broadcast and compare a value across threads in warp

Syntax:

operation ::= `nvvm.match.sync` $kind $thread_mask `,` $val attr-dict `:` type($val) `->` type($res)

The match.sync op performs broadcast and compare of operand val across all non-exited threads in thread_mask and returns a mask depending on the kind and an optional predicate.

The matching operation kinds are:

  • any: Returns a mask corresponding to the non-exited threads in the thread_mask that have the same value of operand val.
  • all: Returns a mask and a predicate. If all non-exited threads in the thread_mask have the same value of operand val, the predicate is set to true and the mask corresponds to the non-exited threads in the thread_mask. Otherwise, the predicate is set to false and the mask is 0.

For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
kind::mlir::NVVM::MatchSyncKindAttr
NVVM match sync kind

Enum cases:

  • any (any)
  • all (all)

Operands: 

OperandDescription
thread_mask32-bit signless integer
val32-bit signless integer or 64-bit signless integer

Results: 

ResultDescription
res32-bit signless integer or LLVM struct type

nvvm.mbarrier.arrive (NVVM::MBarrierArriveOp) 

Syntax:

operation ::= `nvvm.mbarrier.arrive` $addr attr-dict `:` type($addr) `->` type($res)

Operands: 

OperandDescription
addrLLVM pointer type

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.mbarrier.arrive.expect_tx (NVVM::MBarrierArriveExpectTxOp) 

Syntax:

operation ::= `nvvm.mbarrier.arrive.expect_tx` $addr `,` $txcount (`,` `predicate` `=` $predicate^)? attr-dict `:` type(operands)

Interfaces: BasicPtxBuilderInterface

Operands: 

OperandDescription
addrLLVM pointer type
txcount32-bit signless integer
predicate1-bit signless integer

nvvm.mbarrier.arrive.expect_tx.shared (NVVM::MBarrierArriveExpectTxSharedOp) 

Syntax:

operation ::= `nvvm.mbarrier.arrive.expect_tx.shared` $addr `,` $txcount (`,` `predicate` `=` $predicate^)? attr-dict `:` type(operands)

Interfaces: BasicPtxBuilderInterface

Operands: 

OperandDescription
addrLLVM pointer in address space 3
txcount32-bit signless integer
predicate1-bit signless integer

nvvm.mbarrier.arrive.nocomplete (NVVM::MBarrierArriveNocompleteOp) 

Syntax:

operation ::= `nvvm.mbarrier.arrive.nocomplete` $addr `,` $count attr-dict `:` type(operands) `->` type($res)

Operands: 

OperandDescription
addrLLVM pointer type
count32-bit signless integer

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.mbarrier.arrive.nocomplete.shared (NVVM::MBarrierArriveNocompleteSharedOp) 

Syntax:

operation ::= `nvvm.mbarrier.arrive.nocomplete.shared` $addr `,` $count attr-dict `:` type(operands) `->` type($res)

Operands: 

OperandDescription
addrLLVM pointer in address space 3
count32-bit signless integer

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.mbarrier.arrive.shared (NVVM::MBarrierArriveSharedOp) 

Syntax:

operation ::= `nvvm.mbarrier.arrive.shared` $addr attr-dict `:` qualified(type($addr)) `->` type($res)

Operands: 

OperandDescription
addrLLVM pointer in address space 3

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.mbarrier.init (NVVM::MBarrierInitOp) 

Syntax:

operation ::= `nvvm.mbarrier.init` $addr `,` $count (`,` `predicate` `=` $predicate^)? attr-dict `:` type(operands)

Interfaces: BasicPtxBuilderInterface

Operands: 

OperandDescription
addrLLVM pointer type
count32-bit signless integer
predicate1-bit signless integer

nvvm.mbarrier.init.shared (NVVM::MBarrierInitSharedOp) 

Syntax:

operation ::= `nvvm.mbarrier.init.shared` $addr `,` $count (`,` `predicate` `=` $predicate^)? attr-dict `:` type(operands)

Interfaces: BasicPtxBuilderInterface

Operands: 

OperandDescription
addrLLVM pointer in address space 3
count32-bit signless integer
predicate1-bit signless integer

nvvm.mbarrier.inval (NVVM::MBarrierInvalOp) 

Syntax:

operation ::= `nvvm.mbarrier.inval` $addr attr-dict `:` type(operands)

Operands: 

OperandDescription
addrLLVM pointer type

nvvm.mbarrier.inval.shared (NVVM::MBarrierInvalSharedOp) 

Syntax:

operation ::= `nvvm.mbarrier.inval.shared` $addr attr-dict `:` type(operands)

Operands: 

OperandDescription
addrLLVM pointer in address space 3

nvvm.mbarrier.test.wait (NVVM::MBarrierTestWaitOp) 

Syntax:

operation ::= `nvvm.mbarrier.test.wait` $addr `,` $state attr-dict `:` type(operands) `->` type($res)

Operands: 

OperandDescription
addrLLVM pointer type
stateLLVM dialect-compatible type

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.mbarrier.test.wait.shared (NVVM::MBarrierTestWaitSharedOp) 

Syntax:

operation ::= `nvvm.mbarrier.test.wait.shared` $addr `,` $state attr-dict `:` type(operands) `->` type($res)

Operands: 

OperandDescription
addrLLVM pointer in address space 3
stateLLVM dialect-compatible type

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.mbarrier.try_wait.parity (NVVM::MBarrierTryWaitParityOp) 

Syntax:

operation ::= `nvvm.mbarrier.try_wait.parity` $addr `,` $phase `,` $ticks attr-dict `:` type(operands)

Interfaces: BasicPtxBuilderInterface

Operands: 

OperandDescription
addrLLVM pointer type
phase32-bit signless integer
ticks32-bit signless integer

nvvm.mbarrier.try_wait.parity.shared (NVVM::MBarrierTryWaitParitySharedOp) 

Syntax:

operation ::= `nvvm.mbarrier.try_wait.parity.shared` $addr `,` $phase `,` $ticks attr-dict `:` type(operands)

Interfaces: BasicPtxBuilderInterface

Operands: 

OperandDescription
addrLLVM pointer in address space 3
phase32-bit signless integer
ticks32-bit signless integer

nvvm.mma.sync (NVVM::MmaOp) 

Cooperative matrix-multiply and accumulate

The nvvm.mma.sync operation collectively performs the operation D = matmul(A, B) + C using all threads in a warp.

All the threads in the warp must execute the same mma.sync operation.

For each possible multiplicand PTX data type, there are one or more possible instruction shapes given as “mMnNkK”. The below table describes the posssibilities as well as the types required for the operands. Note that the data type for C (the accumulator) and D (the result) can vary independently when there are multiple possibilities in the “C/D Type” column.

When an optional attribute cannot be immediately inferred from the types of the operands and the result during parsing or validation, an error will be raised.

b1Op is only relevant when the binary (b1) type is given to multiplicandDataType. It specifies how the multiply-and-acumulate is performed and is either xor_popc or and_poc. The default is xor_popc.

intOverflowBehavior is only relevant when the multiplicandType attribute is one of u8, s8, u4, s4, this attribute describes how overflow is handled in the accumulator. When the attribute is satfinite, the accumulator values are clamped in the int32 range on overflow. This is the default behavior. Alternatively, accumulator behavior wrapped can also be specified, in which case overflow wraps from one end of the range to the other.

layoutA and layoutB are required and should generally be set to #nvvm.mma_layout<row> and #nvvm.mma_layout<col> respectively, but other combinations are possible for certain layouts according to the table below.

| A/B Type | Shape     | ALayout | BLayout | A Type   | B Type   | C/D Type          |
|----------|-----------|---------|---------|----------|----------|-------------------|
| f64      | .m8n8k4   | row     | col     | 1x f64   | 1x f64   | 2x f64            |
| f16      | .m8n8k4   | row/col | row/col | 2x f16x2 | 2x f16x2 | 4x f16x2 or 8xf32 |
|          | .m16n8k8  | row     | col     | 2x f16x2 | 1x f16x2 | 2x f16x2 or 4 f32 |
|          | .m16n8k16 | row     | col     | 4x f16x2 | 2x f16x2 | 2x f16x2 or 4 f32 |
| bf16     | .m16n8k8  | row     | col     | 2x i32   | 1x i32   | 4x f32            |
|          | .m16n8k16 | row     | col     | 4x i32   | 2x i32   | 4x f32            |
| tf32     | .m16n8k4  | row     | col     | 2x i32   | 1x i32   | 4x f32            |
|          | .m16n8k8  | row     | col     | 4x i32   | 2x i32   | 2x f16x2 or 4 f32 |
| u8/s8    | .m8n8k16  | row     | col     | 1x i32   | 1x i32   | 2x i32            |
|          | .m16n8k16 | row     | col     | 2x i32   | 1x i32   | 4x i32            |
|          | .m16n8k32 | row     | col     | 4x i32   | 2x i32   | 4x i32            |
| u4/s4    | .m8n8k32  | row     | col     | 1x i32   | 1x i32   | 2x i32            |
|          | m16n8k32  | row     | col     | 2x i32   | 1x i32   | 4x i32            |
|          | m16n8k64  | row     | col     | 4x i32   | 2x i32   | 4x i32            |
| b1       | m8n8k128  | row     | col     | 1x i32   | 1x i32   | 2x i32            |
|          | m16n8k128 | row     | col     | 2x i32   | 1x i32   | 4x i32            |

Example:


%128 = nvvm.mma.sync A[%120, %121, %122, %123]
                     B[%124, %125]
                     C[%126, %127]
                     {layoutA = #nvvm.mma_layout<row>,
                      layoutB = #nvvm.mma_layout<col>,
                      shape = {k = 16 : i32, m = 16 : i32, n = 8 : i32}}
    : (vector<2xf16>, vector<2xf16>, vector<2xf16>)
       -> !llvm.struct<(vector<2xf16>, vector<2xf16>)>

Traits: AttrSizedOperandSegments

Attributes: 

AttributeMLIR TypeDescription
shape::mlir::NVVM::MMAShapeAttrAttribute for MMA operation shape.
b1Op::mlir::NVVM::MMAB1OpAttr
MMA binary operations

Enum cases:

  • none (none)
  • xor_popc (xor_popc)
  • and_popc (and_popc)
intOverflowBehavior::mlir::NVVM::MMAIntOverflowAttr
MMA overflow options

Enum cases:

  • satfinite (satfinite)
  • wrapped (wrapped)
layoutA::mlir::NVVM::MMALayoutAttr
NVVM MMA layout

Enum cases:

  • row (row)
  • col (col)
layoutB::mlir::NVVM::MMALayoutAttr
NVVM MMA layout

Enum cases:

  • row (row)
  • col (col)
multiplicandAPtxType::mlir::NVVM::MMATypesAttr
NVVM MMA types

Enum cases:

  • f16 (f16)
  • f32 (f32)
  • tf32 (tf32)
  • bf16 (bf16)
  • s8 (s8)
  • u8 (u8)
  • s32 (s32)
  • s4 (s4)
  • u4 (u4)
  • b1 (b1)
  • f64 (f64)
multiplicandBPtxType::mlir::NVVM::MMATypesAttr
NVVM MMA types

Enum cases:

  • f16 (f16)
  • f32 (f32)
  • tf32 (tf32)
  • bf16 (bf16)
  • s8 (s8)
  • u8 (u8)
  • s32 (s32)
  • s4 (s4)
  • u4 (u4)
  • b1 (b1)
  • f64 (f64)

Operands: 

OperandDescription
operandAvariadic of LLVM dialect-compatible type
operandBvariadic of LLVM dialect-compatible type
operandCvariadic of LLVM dialect-compatible type

Results: 

ResultDescription
resLLVM structure type

nvvm.prefetch.tensormap (NVVM::PrefetchTensorMapOp) 

Syntax:

operation ::= `nvvm.prefetch.tensormap` $tmaDescriptor (`,` `predicate` `=` $predicate^)? attr-dict `:` type(operands)

Interfaces: BasicPtxBuilderInterface

Operands: 

OperandDescription
tmaDescriptorLLVM pointer type
predicate1-bit signless integer

nvvm.rcp.approx.ftz.f (NVVM::RcpApproxFtzF32Op) 

Syntax:

operation ::= `nvvm.rcp.approx.ftz.f` $arg attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: 

OperandDescription
arg32-bit float

Results: 

ResultDescription
res32-bit float

nvvm.read.ptx.sreg.clock (NVVM::ClockOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.clock` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.clock64 (NVVM::Clock64Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.clock64` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.cluster.ctaid.x (NVVM::BlockInClusterIdXOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.ctaid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.cluster.ctaid.y (NVVM::BlockInClusterIdYOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.ctaid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.cluster.ctaid.z (NVVM::BlockInClusterIdZOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.ctaid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.cluster.ctarank (NVVM::ClusterId) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.ctarank` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.cluster.nctaid.x (NVVM::ClusterDimBlocksXOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.nctaid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.cluster.nctaid.y (NVVM::ClusterDimBlocksYOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.nctaid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.cluster.nctaid.z (NVVM::ClusterDimBlocksZOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.nctaid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.cluster.nctarank (NVVM::ClusterDim) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.nctarank` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.clusterid.x (NVVM::ClusterIdXOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.clusterid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.clusterid.y (NVVM::ClusterIdYOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.clusterid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.clusterid.z (NVVM::ClusterIdZOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.clusterid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.ctaid.x (NVVM::BlockIdXOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.ctaid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.ctaid.y (NVVM::BlockIdYOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.ctaid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.ctaid.z (NVVM::BlockIdZOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.ctaid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg0 (NVVM::EnvReg0Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg0` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg1 (NVVM::EnvReg1Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg1` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg10 (NVVM::EnvReg10Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg10` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg11 (NVVM::EnvReg11Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg11` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg12 (NVVM::EnvReg12Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg12` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg13 (NVVM::EnvReg13Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg13` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg14 (NVVM::EnvReg14Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg14` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg15 (NVVM::EnvReg15Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg15` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg16 (NVVM::EnvReg16Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg16` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg17 (NVVM::EnvReg17Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg17` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg18 (NVVM::EnvReg18Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg18` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg19 (NVVM::EnvReg19Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg19` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg2 (NVVM::EnvReg2Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg2` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg20 (NVVM::EnvReg20Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg20` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg21 (NVVM::EnvReg21Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg21` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg22 (NVVM::EnvReg22Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg22` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg23 (NVVM::EnvReg23Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg23` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg24 (NVVM::EnvReg24Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg24` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg25 (NVVM::EnvReg25Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg25` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg26 (NVVM::EnvReg26Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg26` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg27 (NVVM::EnvReg27Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg27` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg28 (NVVM::EnvReg28Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg28` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg29 (NVVM::EnvReg29Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg29` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg3 (NVVM::EnvReg3Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg3` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg30 (NVVM::EnvReg30Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg30` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg31 (NVVM::EnvReg31Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg31` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg4 (NVVM::EnvReg4Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg4` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg5 (NVVM::EnvReg5Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg5` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg6 (NVVM::EnvReg6Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg6` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg7 (NVVM::EnvReg7Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg7` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg8 (NVVM::EnvReg8Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg8` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.envreg9 (NVVM::EnvReg9Op) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg9` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.globaltimer (NVVM::GlobalTimerOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.globaltimer` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.gridid (NVVM::GridIdOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.gridid` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.laneid (NVVM::LaneIdOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.laneid` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.lanemask.eq (NVVM::LaneMaskEqOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.lanemask.eq` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.lanemask.ge (NVVM::LaneMaskGeOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.lanemask.ge` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.lanemask.gt (NVVM::LaneMaskGtOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.lanemask.gt` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.lanemask.le (NVVM::LaneMaskLeOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.lanemask.le` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.lanemask.lt (NVVM::LaneMaskLtOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.lanemask.lt` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.nclusterid.x (NVVM::ClusterDimXOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.nclusterid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.nclusterid.y (NVVM::ClusterDimYOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.nclusterid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.nclusterid.z (NVVM::ClusterDimZOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.nclusterid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.nctaid.x (NVVM::GridDimXOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.nctaid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.nctaid.y (NVVM::GridDimYOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.nctaid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.nctaid.z (NVVM::GridDimZOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.nctaid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.nsmid (NVVM::SmDimOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.nsmid` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.ntid.x (NVVM::BlockDimXOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.ntid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.ntid.y (NVVM::BlockDimYOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.ntid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.ntid.z (NVVM::BlockDimZOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.ntid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.nwarpid (NVVM::WarpDimOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.nwarpid` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.smid (NVVM::SmIdOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.smid` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.tid.x (NVVM::ThreadIdXOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.tid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.tid.y (NVVM::ThreadIdYOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.tid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.tid.z (NVVM::ThreadIdZOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.tid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.warpid (NVVM::WarpIdOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.warpid` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.read.ptx.sreg.warpsize (NVVM::WarpSizeOp) 

Syntax:

operation ::= `nvvm.read.ptx.sreg.warpsize` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: 

AttributeMLIR TypeDescription
range::mlir::LLVM::ConstantRangeAttr
A range of two integers, corresponding to LLVM's ConstantRange
A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.

The range represented is [Lower, Upper), and is either signed or unsigned depending on context.

lower and upper must have the same width.

Syntax:

`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.redux.sync (NVVM::ReduxOp) 

Redux Sync Op

Syntax:

operation ::= `nvvm.redux.sync` $kind $val `,` $mask_and_clamp  attr-dict `:` type($val) `->` type($res)

redux.sync performs a reduction operation kind of the 32 bit source register across all non-exited threads in the membermask.

The abs and nan attributes can be used in the case of f32 input type, where the abs attribute causes the absolute value of the input to be used in the reduction operation, and the nan attribute causes the reduction operation to return NaN if any of the inputs to participating threads are NaN.

For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
kind::mlir::NVVM::ReduxKindAttr
NVVM redux kind

Enum cases:

  • add (ADD)
  • and (AND)
  • max (MAX)
  • min (MIN)
  • or (OR)
  • umax (UMAX)
  • umin (UMIN)
  • xor (XOR)
  • fmin (FMIN)
  • fmax (FMAX)
abs::mlir::BoolAttrbool attribute
nan::mlir::BoolAttrbool attribute

Operands: 

OperandDescription
valLLVM dialect-compatible type
mask_and_clamp32-bit signless integer

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.setmaxregister (NVVM::SetMaxRegisterOp) 

Syntax:

operation ::= `nvvm.setmaxregister` $action $regCount attr-dict

Attributes: 

AttributeMLIR TypeDescription
regCount::mlir::IntegerAttr32-bit signless integer attribute
action::mlir::NVVM::SetMaxRegisterActionAttr
NVVM set max register action

Enum cases:

  • decrease (decrease)
  • increase (increase)

nvvm.shfl.sync (NVVM::ShflOp) 

NVVM Dialect Op for shfl.sync

Syntax:

operation ::= `nvvm.shfl.sync` $kind $thread_mask `,` $val `,` $offset `,` $mask_and_clamp  attr-dict
              `:` type($val) `->` type($res)

The shfl.sync Op implements data shuffle within threads of a warp. The thread_mask denotes the threads participating in the Op where the bit position corresponds to a particular thread’s laneid. The offset specifies a source lane or source lane offset (depending on kind). The val is the input value to be copied from the source. The mask_and_clamp contains two packed values specifying a mask for logically splitting warps into sub-segments and an upper bound for clamping the source lane index.

For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
kind::mlir::NVVM::ShflKindAttr
NVVM shuffle kind

Enum cases:

  • bfly (bfly)
  • up (up)
  • down (down)
  • idx (idx)
return_value_and_is_valid::mlir::UnitAttrunit attribute

Operands: 

OperandDescription
thread_mask32-bit signless integer
valLLVM dialect-compatible type
offset32-bit signless integer
mask_and_clamp32-bit signless integer

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.stmatrix (NVVM::StMatrixOp) 

Cooperative matrix store

Syntax:

operation ::= `nvvm.stmatrix` $ptr `,` $sources attr-dict `:` type(operands)

Collectively store one or more matrices across all threads in a warp to the location indicated by the address operand $ptr in shared memory.

For more information, see PTX ISA

Interfaces: BasicPtxBuilderInterface

Attributes: 

AttributeMLIR TypeDescription
layout::mlir::NVVM::MMALayoutAttr
NVVM MMA layout

Enum cases:

  • row (row)
  • col (col)

Operands: 

OperandDescription
ptrLLVM pointer in address space 3
sourcesvariadic of 32-bit signless integer

nvvm.tcgen05.alloc (NVVM::Tcgen05AllocOp) 

Tcgen05 alloc operation

Syntax:

operation ::= `nvvm.tcgen05.alloc` $addr `,` $nCols attr-dict `:` type(operands)

The tcgen05.alloc Op allocates tensor core memory for the amount specified by nCols and writes the destination address to the addr argument. The nCols operand specifies the number of columns to be allocated and it must be a power-of-two. For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
group::mlir::NVVM::Tcgen05GroupKindAttr
NVVM Tcgen05 group kind

Enum cases:

  • cta_1 (CTA_1)
  • cta_2 (CTA_2)

Operands: 

OperandDescription
addrLLVM pointer type or LLVM pointer in address space 3
nCols32-bit signless integer

nvvm.tcgen05.commit (NVVM::Tcgen05CommitOp) 

Tcgen05 commit operations

Syntax:

operation ::= `nvvm.tcgen05.commit` $addr (`,` `multicast_mask` `=` $multicastMask^)?
              attr-dict `:` type(operands)

The tcgen05.commit makes the mbarrier object, specified by the operand addr, track the completion of all the prior async-tcgen05 operations initiated by the executing thread. The multicast variants allow signaling on the mbarrier objects of multiple CTAs within the cluster. Operand multicastMask, when present, specifies the destination CTAs in the cluster such that each bit position in the 16-bit multicastMask operand corresponds to the nvvm.read.ptx.sreg.ctaid of the destination CTA. For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
group::mlir::NVVM::Tcgen05GroupKindAttr
NVVM Tcgen05 group kind

Enum cases:

  • cta_1 (CTA_1)
  • cta_2 (CTA_2)

Operands: 

OperandDescription
addrLLVM pointer type or LLVM pointer in address space 3
multicastMask16-bit signless integer

nvvm.tcgen05.cp (NVVM::Tcgen05CpOp) 

Tcgen05 copy operation

Syntax:

operation ::= `nvvm.tcgen05.cp` $taddr`,` $smem_desc attr-dict

Instruction tcgen05.cp initiates an asynchronous copy operation from shared memory to the location specified by the address operand taddr in the Tensor Memory. The 64-bit register operand smem_desc specifies the matrix descriptor representing the source matrix in the shared memory that needs to be copied.

Example:

  nvvm.tcgen05.cp %taddr, %smem_desc {
    group = #nvvm.tcgen05_group<cta_2>,
    shape = #nvvm.tcgen05_cp_shape<shape_64x128b>,
    multicast = #nvvm.tcgen05_cp_multicast<warpx2_01_23>,
    srcFormat = #nvvm.tcgen05_cp_src_fmt<b6x16_p32>
  }

For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
shape::mlir::NVVM::Tcgen05CpShapeAttr
tcgen05 cp shapes

Enum cases:

  • shape_128x256b (SHAPE_128x256b)
  • shape_4x256b (SHAPE_4x256b)
  • shape_128x128b (SHAPE_128x128b)
  • shape_64x128b (SHAPE_64x128b)
  • shape_32x128b (SHAPE_32x128b)
group::mlir::NVVM::Tcgen05GroupKindAttr
NVVM Tcgen05 group kind

Enum cases:

  • cta_1 (CTA_1)
  • cta_2 (CTA_2)
multicast::mlir::NVVM::Tcgen05CpMulticastAttr
tcgen05 cp multicast

Enum cases:

  • none (NONE)
  • warpx2_02_13 (WARPX2_02_13)
  • warpx2_01_23 (WARPX2_01_23)
  • warpx4 (WARPX4)
srcFormat::mlir::NVVM::Tcgen05CpSrcFormatAttr
tcgen05 cp source format

Enum cases:

  • b6x16_p32 (B6x16_P32)
  • b4x16_p64 (B4x16_P64)

Operands: 

OperandDescription
taddrLLVM pointer in address space 6
smem_desc64-bit signless integer

nvvm.tcgen05.dealloc (NVVM::Tcgen05DeallocOp) 

Tcgen05 dealloc operation

Syntax:

operation ::= `nvvm.tcgen05.dealloc` $taddr `,` $nCols attr-dict `:` type(operands)

The tcgen05.dealloc Op de-allocates the tensor core memory specified by tmemAddr, which must be from a previous tensor memory allocation. The nCols operand specifies the number of columns to be de-allocated, and it must be a power-of-two. For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
group::mlir::NVVM::Tcgen05GroupKindAttr
NVVM Tcgen05 group kind

Enum cases:

  • cta_1 (CTA_1)
  • cta_2 (CTA_2)

Operands: 

OperandDescription
taddrLLVM pointer in address space 6
nCols32-bit signless integer

nvvm.tcgen05.fence (NVVM::Tcgen05FenceOp) 

Tcgen05 fence operations

Syntax:

operation ::= `nvvm.tcgen05.fence` $kind attr-dict

The tcgen05.fence<before> orders all prior async tcgen05 operations with respect to the subsequent tcgen05 and execution ordering operations. The tcgen05.fence<after> orders all subsequent async tcgen05 operations with respect to the prior tcgen05 and execution ordering operations.

For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
kind::mlir::NVVM::Tcgen05FenceKindAttr
NVVM Tcgen05 fence kind

Enum cases:

  • before (BEFORE_THREAD_SYNC)
  • after (AFTER_THREAD_SYNC)

nvvm.tcgen05.ld (NVVM::Tcgen05LdOp) 

Tensor memory load instructions

Syntax:

operation ::= `nvvm.tcgen05.ld` $tmemAddr (`,` $offset^)? (`pack` $pack^)? attr-dict `:` type($res)

Instruction tcgen05.ld asynchronously loads data from the Tensor Memory at the location specified by the 32-bit address operand tmemAddr into the destination register res, collectively across all threads of the warps.

The shape and the num attribute together determines the total dimension of the data which is loaded from the Tensor Memory. The shape attribute indicates the base dimension of data to be accessed as described in the Data Movement Shape. The num attribute indicates the repeat factor on the base dimension resulting in the total dimension of the data that is accessed.

The shape 16x32bx2 performs two accesses into Tensor Memory of the shape 16x32b. The base address of the first access is specified by tmemAddr and the base address of the second access is specified by tmemAddr + offset, where offset is an immediate argument.

The unit attribute pack can be used to pack two 16-bit elements from adjacent columns into a single 32-bit element during the load.

The following table describes the size of the vector for various combinations of num and shape attributes |=====================================================================| | num/shape | 16x32bx2/16x64b/32x32b | 16x128b | 16x256b | |=====================================================================| | x1 | 1 | 2 | 4 | | x2 | 2 | 4 | 8 | | x4 | 4 | 8 | 16 | | x8 | 8 | 16 | 32 | | x16 | 16 | 32 | 64 | | x32 | 32 | 64 | 128 | | x64 | 64 | 128 | NA | | x128 | 128 | NA | NA | |=====================================================================|

Example:

  nvvm.tcgen05.ld %tmemAddr, %offset pack {
    shape = #nvvm.tcgen05_ldst_shape<shape_16x32bx2>,
  } : <2xi32>

For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
pack::mlir::UnitAttrunit attribute
shape::mlir::NVVM::Tcgen05LdStShapeAttr
allowed 32-bit signless integer cases: 0, 1, 2, 3, 4

Enum cases:

  • shape_16x64b (SHAPE_16X64B)
  • shape_16x128b (SHAPE_16X128B)
  • shape_16x256b (SHAPE_16X256B)
  • shape_32x32b (SHAPE_32X32B)
  • shape_16x32bx2 (SHAPE_16X32BX2)

Operands: 

OperandDescription
tmemAddrLLVM pointer in address space 6
offset64-bit signless integer

Results: 

ResultDescription
res32-bit signless integer or vector of 32-bit signless integer values of length 2/4/8/16/32/64/128

nvvm.tcgen05.relinquish_alloc_permit (NVVM::Tcgen05RelinquishAllocPermitOp) 

Tcgen05 Op to relinquish the right to allocate

Syntax:

operation ::= `nvvm.tcgen05.relinquish_alloc_permit` attr-dict

The tcgen05.relinquish_alloc_permit Op specifies that the CTA of the executing thread is relinquishing the right to allocate Tensor Memory. So, it is illegal for a CTA to perform tcgen05.alloc after any of its constituent threads execute tcgen05.relinquish_alloc_permit. For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
group::mlir::NVVM::Tcgen05GroupKindAttr
NVVM Tcgen05 group kind

Enum cases:

  • cta_1 (CTA_1)
  • cta_2 (CTA_2)

nvvm.tcgen05.shift (NVVM::Tcgen05ShiftOp) 

Tcgen05 shift operation

Syntax:

operation ::= `nvvm.tcgen05.shift` $taddr attr-dict `:` type(operands)

The tcgen05.shift is an asynchronous instruction which initiates the shifting of 32-byte elements downwards across all the rows, except the last, by one row. The operand taddr specifies the base address of the matrix in Tensor Memory whose rows must be down shifted.

For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
group::mlir::NVVM::Tcgen05GroupKindAttr
NVVM Tcgen05 group kind

Enum cases:

  • cta_1 (CTA_1)
  • cta_2 (CTA_2)

Operands: 

OperandDescription
taddrLLVM pointer in address space 6

nvvm.tcgen05.st (NVVM::Tcgen05StOp) 

Tensor memory store instructions

Syntax:

operation ::= `nvvm.tcgen05.st` $tmemAddr `,` $val (`,` $offset^)? (`unpack` $unpack^)? attr-dict `:` type($val)

Instruction tcgen05.st asynchronously stores data from the source register r into the Tensor Memory at the location specified by the 32-bit address operand tmemAddr, collectively across all threads of the warps.

The shape and the num attribute together determines the total dimension of the data which is stored to the Tensor Memory. The shape indicates the base dimension of data to be accessed. The num attribute indicates the repeat factor on the base dimension resulting in the total dimension of the data that is accessed.

The shape 16x32bx2 performs two accesses into Tensor Memory of the shape 16x32b. The base address of the first access is specified by tmemAddr and the base address of the second access is specified by tmemAddr + offset, where offset is an immediate argument.

The unit attribute unpack can be used to unpack a 32-bit element in the register into two 16-bit elements and store them in adjacent columns.

The following table describes the size of the vector for various combinations of num and shape attributes |=====================================================================| | num/shape | 16x32bx2/16x64b/32x32b | 16x128b | 16x256b | |=====================================================================| | x1 | 1 | 2 | 4 | | x2 | 2 | 4 | 8 | | x4 | 4 | 8 | 16 | | x8 | 8 | 16 | 32 | | x16 | 16 | 32 | 64 | | x32 | 32 | 64 | 128 | | x64 | 64 | 128 | NA | | x128 | 128 | NA | NA | |=====================================================================|

Example:

  nvvm.tcgen05.st %tmemAddr, %val, %offset unpack {
    shape = #nvvm.tcgen05_ldst_shape<shape_16x32bx2>,
  } : <2xi32>

For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
unpack::mlir::UnitAttrunit attribute
shape::mlir::NVVM::Tcgen05LdStShapeAttr
allowed 32-bit signless integer cases: 0, 1, 2, 3, 4

Enum cases:

  • shape_16x64b (SHAPE_16X64B)
  • shape_16x128b (SHAPE_16X128B)
  • shape_16x256b (SHAPE_16X256B)
  • shape_32x32b (SHAPE_32X32B)
  • shape_16x32bx2 (SHAPE_16X32BX2)

Operands: 

OperandDescription
tmemAddrLLVM pointer in address space 6
val32-bit signless integer or vector of 32-bit signless integer values of length 2/4/8/16/32/64/128
offset64-bit signless integer

nvvm.tcgen05.wait (NVVM::Tcgen05WaitOp) 

Tcgen05 wait operations

Syntax:

operation ::= `nvvm.tcgen05.wait` $kind attr-dict

The tcgen05.wait<load> causes the executing thread to block until all prior tcgen05.ld operations issued by the executing thread have completed. Similarly, the tcgen05.wait<store> causes the executing thread to block until all prior tcgen05.st operations issued by the executing thread have completed. For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
kind::mlir::NVVM::Tcgen05WaitKindAttr
NVVM Tcgen05 wait kind

Enum cases:

  • load (LOAD)
  • store (STORE)

nvvm.vote.ballot.sync (NVVM::VoteBallotOp) 

Operands: 

OperandDescription
maskLLVM dialect-compatible type
predLLVM dialect-compatible type

Results: 

ResultDescription
resLLVM dialect-compatible type

nvvm.wgmma.commit.group.sync.aligned (NVVM::WgmmaGroupSyncAlignedOp) 

Syntax:

operation ::= `nvvm.wgmma.commit.group.sync.aligned` attr-dict

Commits all prior uncommitted warpgroup level matrix multiplication operations.

For more information, see PTX ISA

nvvm.wgmma.fence.aligned (NVVM::WgmmaFenceAlignedOp) 

Syntax:

operation ::= `nvvm.wgmma.fence.aligned` attr-dict

Enforce an ordering of register accesses between warpgroup level matrix multiplication and other operations.

For more information, see PTX ISA

nvvm.wgmma.mma_async (NVVM::WgmmaMmaAsyncOp) 

Syntax:

operation ::= `nvvm.wgmma.mma_async` $descriptorA `,` $descriptorB `,` $inouts `,` $shape `,`
              `D` `[` $typeD `,` $scaleD (`,` $satfinite^)? `]` `,`
              `A` `[` $typeA `,` $scaleA `,` $layoutA `]` `,`
              `B` `[` $typeB `,` $scaleB `,` $layoutB `]`
              attr-dict `:`
              type($inouts) `->` type($results)

The warpgroup (128 threads) level matrix multiply and accumulate operation has either of the following forms, where matrix D is called accumulator: D = A * B + D D = A * B, where the input from accumulator D is disabled.

Supported shapes:

|--------------|--------------|------------|--------------|---------------|
|              |              |            |              |f16+=e4m3*e4m3 |
|              |              |            |              |f16+=e5m2*e5m2 |
|f32+=tf32*tf32|f16+=f16 *f16 | s32+=s8*s8 |s32 += b1 * b1|f16+=e5m2*e4m3 |
|              |f32+=f16 *f16 | s32+=u8*u8 |              |f16+=e4m3*e5m2 |
|              |f32+=bf16*bf16| s32+=u8*u8 |              |f16+=e4m3*e5m2 |
|              |f32+=bf16*bf16| s32+=s8*u8 |              |f32+=e4m3*e4m3 |
|              |              | s32+=u8*s8 |              |f32+=e5m2*e5m2 |
|              |              |            |              |f32+=e4m3*e5m2 |
|              |              |            |              |f32+=e4m3*e5m2 |
|--------------|--------------|------------|--------------|---------------|
|   .m64n8k8   |  .m64n8k16   | .m64n8k32  | .m64n8k256   | .m64n8k32     |
|   .m64n16k8  |  .m64n16k16  | .m64n16k32 | .m64n16k256  | .m64n16k32    |
|   .m64n24k8  |  .m64n24k16  | .m64n24k32 | .m64n24k256  | .m64n24k32    |
|   .m64n32k8  |  .m64n32k16  | .m64n32k32 | .m64n32k256  | .m64n32k32    |
|   .m64n40k8  |  .m64n40k16  | .m64n48k32 | .m64n48k256  | .m64n40k32    |
|   .m64n48k8  |  .m64n48k16  | .m64n64k32 | .m64n64k256  | .m64n48k32    |
|   .m64n56k8  |  .m64n56k16  | .m64n80k32 | .m64n80k256  | .m64n56k32    |
|   .m64n64k8  |  .m64n64k16  | .m64n96k32 | .m64n96k256  | .m64n64k32    |
|   .m64n72k8  |  .m64n72k16  | .m64n112k32| .m64n112k256 | .m64n72k32    |
|   .m64n80k8  |  .m64n80k16  | .m64n128k32| .m64n128k256 | .m64n80k32    |
|   .m64n88k8  |  .m64n88k16  | .m64n144k32| .m64n144k256 | .m64n88k32    |
|   .m64n96k8  |  .m64n96k16  | .m64n160k32| .m64n160k256 | .m64n96k32    |
|   .m64n104k8 |  .m64n104k16 | .m64n176k32| .m64n176k256 | .m64n104k32   |
|   .m64n112k8 |  .m64n112k16 | .m64n192k32| .m64n192k256 | .m64n112k32   |
|   .m64n120k8 |  .m64n120k16 | .m64n208k32| .m64n208k256 | .m64n120k32   |
|   .m64n128k8 |  .m64n128k16 | .m64n224k32| .m64n224k256 | .m64n128k32   |
|   .m64n136k8 |  .m64n136k16 | .m64n240k32| .m64n240k256 | .m64n136k32   |
|   .m64n144k8 |  .m64n144k16 | .m64n256k32| .m64n256k256 | .m64n144k32   |
|   .m64n152k8 |  .m64n152k16 |            |              | .m64n152k32   |
|   .m64n160k8 |  .m64n160k16 |            |              | .m64n160k32   |
|   .m64n168k8 |  .m64n168k16 |            |              | .m64n168k32   |
|   .m64n176k8 |  .m64n176k16 |            |              | .m64n176k32   |
|   .m64n184k8 |  .m64n184k16 |            |              | .m64n184k32   |
|   .m64n192k8 |  .m64n192k16 |            |              | .m64n192k32   |
|   .m64n200k8 |  .m64n200k16 |            |              | .m64n200k32   |
|   .m64n208k8 |  .m64n208k16 |            |              | .m64n208k32   |
|   .m64n216k8 |  .m64n216k16 |            |              | .m64n216k32   |
|   .m64n224k8 |  .m64n224k16 |            |              | .m64n224k32   |
|   .m64n232k8 |  .m64n232k16 |            |              | .m64n232k32   |
|   .m64n240k8 |  .m64n240k16 |            |              | .m64n240k32   |
|   .m64n248k8 |  .m64n248k16 |            |              | .m64n248k32   |
|   .m64n256k8 |  .m64n256k16 |            |              | .m64n256k32   |
|--------------|--------------|------------|--------------|---------------|

For more information, see PTX ISA

Interfaces: BasicPtxBuilderInterface

Attributes: 

AttributeMLIR TypeDescription
shape::mlir::NVVM::MMAShapeAttrAttribute for MMA operation shape.
typeA::mlir::NVVM::WGMMATypesAttr
NVVM WGMMA types

Enum cases:

  • f16 (f16)
  • tf32 (tf32)
  • u8 (u8)
  • s8 (s8)
  • b1 (b1)
  • bf16 (bf16)
  • e4m3 (e4m3)
  • e5m2 (e5m2)
  • f32 (f32)
  • s32 (s32)
typeB::mlir::NVVM::WGMMATypesAttr
NVVM WGMMA types

Enum cases:

  • f16 (f16)
  • tf32 (tf32)
  • u8 (u8)
  • s8 (s8)
  • b1 (b1)
  • bf16 (bf16)
  • e4m3 (e4m3)
  • e5m2 (e5m2)
  • f32 (f32)
  • s32 (s32)
typeD::mlir::NVVM::WGMMATypesAttr
NVVM WGMMA types

Enum cases:

  • f16 (f16)
  • tf32 (tf32)
  • u8 (u8)
  • s8 (s8)
  • b1 (b1)
  • bf16 (bf16)
  • e4m3 (e4m3)
  • e5m2 (e5m2)
  • f32 (f32)
  • s32 (s32)
scaleD::mlir::NVVM::WGMMAScaleOutAttr
WGMMA input predicate

Enum cases:

  • zero (zero)
  • one (one)
scaleA::mlir::NVVM::WGMMAScaleInAttr
WGMMA overflow options

Enum cases:

  • one (one)
  • neg (neg)
scaleB::mlir::NVVM::WGMMAScaleInAttr
WGMMA overflow options

Enum cases:

  • one (one)
  • neg (neg)
layoutA::mlir::NVVM::MMALayoutAttr
NVVM MMA layout

Enum cases:

  • row (row)
  • col (col)
layoutB::mlir::NVVM::MMALayoutAttr
NVVM MMA layout

Enum cases:

  • row (row)
  • col (col)
satfinite::mlir::NVVM::MMAIntOverflowAttr
MMA overflow options

Enum cases:

  • satfinite (satfinite)
  • wrapped (wrapped)

Operands: 

OperandDescription
inoutsLLVM structure type
descriptorA64-bit signless integer
descriptorB64-bit signless integer

Results: 

ResultDescription
resultsLLVM structure type

nvvm.wgmma.wait.group.sync.aligned (NVVM::WgmmaWaitGroupSyncOp) 

Syntax:

operation ::= `nvvm.wgmma.wait.group.sync.aligned` attr-dict $group

Signal the completion of a preceding warpgroup operation.

For more information, see PTX ISA

Attributes: 

AttributeMLIR TypeDescription
group::mlir::IntegerAttr64-bit signless integer attribute

nvvm.wmma.load (NVVM::WMMALoadOp) 

Warp synchronous matrix load

Syntax:

operation ::= `nvvm.wmma.load` $ptr `,` $stride attr-dict `:` functional-type($ptr, $res)

Attributes: 

AttributeMLIR TypeDescription
m::mlir::IntegerAttr32-bit signless integer attribute
n::mlir::IntegerAttr32-bit signless integer attribute
k::mlir::IntegerAttr32-bit signless integer attribute
layout::mlir::NVVM::MMALayoutAttr
NVVM MMA layout

Enum cases:

  • row (row)
  • col (col)
eltype::mlir::NVVM::MMATypesAttr
NVVM MMA types

Enum cases:

  • f16 (f16)
  • f32 (f32)
  • tf32 (tf32)
  • bf16 (bf16)
  • s8 (s8)
  • u8 (u8)
  • s32 (s32)
  • s4 (s4)
  • u4 (u4)
  • b1 (b1)
  • f64 (f64)
frag::mlir::NVVM::MMAFragAttr
NVVM MMA frag type

Enum cases:

  • a (a)
  • b (b)
  • c (c)

Operands: 

OperandDescription
ptrLLVM pointer type
stride32-bit signless integer

Results: 

ResultDescription
resLLVM structure type

nvvm.wmma.mma (NVVM::WMMAMmaOp) 

Warp synchronous matrix-multiply accumulate using tensor cores.

Syntax:

operation ::= `nvvm.wmma.mma` $args attr-dict `:` functional-type($args, $res)

Attributes: 

AttributeMLIR TypeDescription
m::mlir::IntegerAttr32-bit signless integer attribute
n::mlir::IntegerAttr32-bit signless integer attribute
k::mlir::IntegerAttr32-bit signless integer attribute
layoutA::mlir::NVVM::MMALayoutAttr
NVVM MMA layout

Enum cases:

  • row (row)
  • col (col)
layoutB::mlir::NVVM::MMALayoutAttr
NVVM MMA layout

Enum cases:

  • row (row)
  • col (col)
eltypeA::mlir::NVVM::MMATypesAttr
NVVM MMA types

Enum cases:

  • f16 (f16)
  • f32 (f32)
  • tf32 (tf32)
  • bf16 (bf16)
  • s8 (s8)
  • u8 (u8)
  • s32 (s32)
  • s4 (s4)
  • u4 (u4)
  • b1 (b1)
  • f64 (f64)
eltypeB::mlir::NVVM::MMATypesAttr
NVVM MMA types

Enum cases:

  • f16 (f16)
  • f32 (f32)
  • tf32 (tf32)
  • bf16 (bf16)
  • s8 (s8)
  • u8 (u8)
  • s32 (s32)
  • s4 (s4)
  • u4 (u4)
  • b1 (b1)
  • f64 (f64)

Operands: 

OperandDescription
argsvariadic of LLVM dialect-compatible type

Results: 

ResultDescription
resLLVM structure type

nvvm.wmma.store (NVVM::WMMAStoreOp) 

Warp synchronous matrix store

Syntax:

operation ::= `nvvm.wmma.store` $ptr `,` $stride `,` $args attr-dict `:` qualified(type($ptr)) `,`
              type($args)

Attributes: 

AttributeMLIR TypeDescription
m::mlir::IntegerAttr32-bit signless integer attribute
n::mlir::IntegerAttr32-bit signless integer attribute
k::mlir::IntegerAttr32-bit signless integer attribute
layout::mlir::NVVM::MMALayoutAttr
NVVM MMA layout

Enum cases:

  • row (row)
  • col (col)
eltype::mlir::NVVM::MMATypesAttr
NVVM MMA types

Enum cases:

  • f16 (f16)
  • f32 (f32)
  • tf32 (tf32)
  • bf16 (bf16)
  • s8 (s8)
  • u8 (u8)
  • s32 (s32)
  • s4 (s4)
  • u4 (u4)
  • b1 (b1)
  • f64 (f64)

Operands: 

OperandDescription
ptrLLVM pointer type
argsvariadic of LLVM dialect-compatible type
stride32-bit signless integer

Attributes 

FPRoundingModeAttr 

NVVM FPRoundingMode kind

Syntax:

#nvvm.fp_rnd_mode<
  ::mlir::NVVM::FPRoundingMode   # value
>

Enum cases:

  • none (NONE)
  • rn (RN)
  • rm (RM)
  • rp (RP)
  • rz (RZ)
  • rna (RNA)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::FPRoundingModean enum of type FPRoundingMode

LoadCacheModifierKindAttr 

NVVM load cache modifier kind

Syntax:

#nvvm.load_cache_modifier<
  ::mlir::NVVM::LoadCacheModifierKind   # value
>

Enum attribute of the different kinds of cache operators for load instructions.

For more information, see PTX ISA

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::LoadCacheModifierKindan enum of type LoadCacheModifierKind

MMAB1OpAttr 

MMA binary operations

Syntax:

#nvvm.mma_b1op<
  ::mlir::NVVM::MMAB1Op   # value
>

Enum cases:

  • none (none)
  • xor_popc (xor_popc)
  • and_popc (and_popc)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::MMAB1Opan enum of type MMAB1Op

MMAFragAttr 

NVVM MMA frag type

Syntax:

#nvvm.mma_frag<
  ::mlir::NVVM::MMAFrag   # value
>

Enum cases:

  • a (a)
  • b (b)
  • c (c)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::MMAFragan enum of type MMAFrag

MMAIntOverflowAttr 

MMA overflow options

Syntax:

#nvvm.mma_int_overflow<
  ::mlir::NVVM::MMAIntOverflow   # value
>

Enum cases:

  • satfinite (satfinite)
  • wrapped (wrapped)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::MMAIntOverflowan enum of type MMAIntOverflow

MMALayoutAttr 

NVVM MMA layout

Syntax:

#nvvm.mma_layout<
  ::mlir::NVVM::MMALayout   # value
>

Enum cases:

  • row (row)
  • col (col)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::MMALayoutan enum of type MMALayout

MMATypesAttr 

NVVM MMA types

Syntax:

#nvvm.mma_type<
  ::mlir::NVVM::MMATypes   # value
>

Enum cases:

  • f16 (f16)
  • f32 (f32)
  • tf32 (tf32)
  • bf16 (bf16)
  • s8 (s8)
  • u8 (u8)
  • s32 (s32)
  • s4 (s4)
  • u4 (u4)
  • b1 (b1)
  • f64 (f64)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::MMATypesan enum of type MMATypes

MatchSyncKindAttr 

NVVM match sync kind

Syntax:

#nvvm.match_sync_kind<
  ::mlir::NVVM::MatchSyncKind   # value
>

Enum cases:

  • any (any)
  • all (all)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::MatchSyncKindan enum of type MatchSyncKind

MemScopeKindAttr 

NVVM Memory Scope kind

Syntax:

#nvvm.mem_scope<
  ::mlir::NVVM::MemScopeKind   # value
>

Enum cases:

  • cta (CTA)
  • cluster (CLUSTER)
  • gpu (GPU)
  • sys (SYS)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::MemScopeKindan enum of type MemScopeKind

MMAShapeAttr 

Attribute for MMA operation shape.

Syntax:

#nvvm.shape<
  int,   # m
  int,   # n
  int   # k
>

Parameters: 

ParameterC++ typeDescription
mint
nint
kint

NVVMTargetAttr 

Syntax:

#nvvm.target<
  int,   # O
  ::llvm::StringRef,   # triple
  ::llvm::StringRef,   # chip
  ::llvm::StringRef,   # features
  DictionaryAttr,   # flags
  ArrayAttr   # link
>

GPU target attribute for controlling compilation of NVIDIA targets. All parameters decay into default values if not present.

Examples:

  1. Target with default values.
  gpu.module @mymodule [#nvvm.target] attributes {...} {
    ...
  }
  1. Target with sm_90 chip and fast math.
  gpu.module @mymodule [#nvvm.target<chip = "sm_90", flags = {fast}>] {
    ...
  }

Parameters: 

ParameterC++ typeDescription
OintOptimization level to apply.
triple::llvm::StringRefTarget triple.
chip::llvm::StringRefTarget chip.
features::llvm::StringRefTarget chip features.
flagsDictionaryAttrTarget specific flags.
linkArrayAttrFiles to link to the LLVM module.

ProxyKindAttr 

Proxy kind

Syntax:

#nvvm.proxy_kind<
  ::mlir::NVVM::ProxyKind   # value
>

Enum cases:

  • alias (alias)
  • async (async)
  • async.global (async_global)
  • async.shared (async_shared)
  • tensormap (TENSORMAP)
  • generic (GENERIC)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::ProxyKindan enum of type ProxyKind

ReduxKindAttr 

NVVM redux kind

Syntax:

#nvvm.redux_kind<
  ::mlir::NVVM::ReduxKind   # value
>

Enum cases:

  • add (ADD)
  • and (AND)
  • max (MAX)
  • min (MIN)
  • or (OR)
  • umax (UMAX)
  • umin (UMIN)
  • xor (XOR)
  • fmin (FMIN)
  • fmax (FMAX)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::ReduxKindan enum of type ReduxKind

SaturationModeAttr 

NVVM SaturationMode kind

Syntax:

#nvvm.sat_mode<
  ::mlir::NVVM::SaturationMode   # value
>

Enum cases:

  • none (NONE)
  • satfinite (SATFINITE)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::SaturationModean enum of type SaturationMode

SetMaxRegisterActionAttr 

NVVM set max register action

Syntax:

#nvvm.action<
  ::mlir::NVVM::SetMaxRegisterAction   # value
>

Enum cases:

  • decrease (decrease)
  • increase (increase)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::SetMaxRegisterActionan enum of type SetMaxRegisterAction

SharedSpaceAttr 

Shared memory space

Syntax:

#nvvm.shared_space<
  ::mlir::NVVM::SharedSpace   # value
>

Enum cases:

  • cta (shared_cta)
  • cluster (shared_cluster)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::SharedSpacean enum of type SharedSpace

ShflKindAttr 

NVVM shuffle kind

Syntax:

#nvvm.shfl_kind<
  ::mlir::NVVM::ShflKind   # value
>

Enum cases:

  • bfly (bfly)
  • up (up)
  • down (down)
  • idx (idx)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::ShflKindan enum of type ShflKind

TMAReduxKindAttr 

NVVM TMA redux kind

Syntax:

#nvvm.tma_redux_kind<
  ::mlir::NVVM::TMAReduxKind   # value
>

Enum cases:

  • add (ADD)
  • max (MAX)
  • min (MIN)
  • inc (INC)
  • dec (DEC)
  • and (AND)
  • or (OR)
  • xor (XOR)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::TMAReduxKindan enum of type TMAReduxKind

TMAStoreModeAttr 

NVVM TMA Store Mode

Syntax:

#nvvm.tma_store_mode<
  ::mlir::NVVM::TMAStoreMode   # value
>

Enum cases:

  • tile (TILE)
  • im2col (IM2COL)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::TMAStoreModean enum of type TMAStoreMode

Tcgen05CpMulticastAttr 

Tcgen05 cp multicast

Syntax:

#nvvm.tcgen05_cp_multicast<
  ::mlir::NVVM::Tcgen05CpMulticast   # value
>

Enum cases:

  • none (NONE)
  • warpx2_02_13 (WARPX2_02_13)
  • warpx2_01_23 (WARPX2_01_23)
  • warpx4 (WARPX4)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::Tcgen05CpMulticastan enum of type Tcgen05CpMulticast

Tcgen05CpShapeAttr 

Tcgen05 cp shapes

Syntax:

#nvvm.tcgen05_cp_shape<
  ::mlir::NVVM::Tcgen05CpShape   # value
>

Enum cases:

  • shape_128x256b (SHAPE_128x256b)
  • shape_4x256b (SHAPE_4x256b)
  • shape_128x128b (SHAPE_128x128b)
  • shape_64x128b (SHAPE_64x128b)
  • shape_32x128b (SHAPE_32x128b)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::Tcgen05CpShapean enum of type Tcgen05CpShape

Tcgen05CpSrcFormatAttr 

Tcgen05 cp source format

Syntax:

#nvvm.tcgen05_cp_src_fmt<
  ::mlir::NVVM::Tcgen05CpSrcFormat   # value
>

Enum cases:

  • b6x16_p32 (B6x16_P32)
  • b4x16_p64 (B4x16_P64)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::Tcgen05CpSrcFormatan enum of type Tcgen05CpSrcFormat

Tcgen05FenceKindAttr 

NVVM Tcgen05 fence kind

Syntax:

#nvvm.tcgen05_fence<
  ::mlir::NVVM::Tcgen05FenceKind   # value
>

Enum cases:

  • before (BEFORE_THREAD_SYNC)
  • after (AFTER_THREAD_SYNC)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::Tcgen05FenceKindan enum of type Tcgen05FenceKind

Tcgen05GroupKindAttr 

NVVM Tcgen05 group kind

Syntax:

#nvvm.tcgen05_group<
  ::mlir::NVVM::Tcgen05GroupKind   # value
>

Enum cases:

  • cta_1 (CTA_1)
  • cta_2 (CTA_2)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::Tcgen05GroupKindan enum of type Tcgen05GroupKind

Tcgen05LdStShapeAttr 

Allowed 32-bit signless integer cases: 0, 1, 2, 3, 4

Syntax:

#nvvm.tcgen05_ldst_shape<
  ::mlir::NVVM::Tcgen05LdStShape   # value
>

Enum cases:

  • shape_16x64b (SHAPE_16X64B)
  • shape_16x128b (SHAPE_16X128B)
  • shape_16x256b (SHAPE_16X256B)
  • shape_32x32b (SHAPE_32X32B)
  • shape_16x32bx2 (SHAPE_16X32BX2)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::Tcgen05LdStShapean enum of type Tcgen05LdStShape

Tcgen05WaitKindAttr 

NVVM Tcgen05 wait kind

Syntax:

#nvvm.tcgen05_wait<
  ::mlir::NVVM::Tcgen05WaitKind   # value
>

Enum cases:

  • load (LOAD)
  • store (STORE)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::Tcgen05WaitKindan enum of type Tcgen05WaitKind

WGMMAScaleInAttr 

WGMMA overflow options

Syntax:

#nvvm.wgmma_scale_in<
  ::mlir::NVVM::WGMMAScaleIn   # value
>

Enum cases:

  • one (one)
  • neg (neg)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::WGMMAScaleInan enum of type WGMMAScaleIn

WGMMAScaleOutAttr 

WGMMA input predicate

Syntax:

#nvvm.wgmma_scale_out<
  ::mlir::NVVM::WGMMAScaleOut   # value
>

Enum cases:

  • zero (zero)
  • one (one)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::WGMMAScaleOutan enum of type WGMMAScaleOut

WGMMATypesAttr 

NVVM WGMMA types

Syntax:

#nvvm.wgmma_type<
  ::mlir::NVVM::WGMMATypes   # value
>

Enum cases:

  • f16 (f16)
  • tf32 (tf32)
  • u8 (u8)
  • s8 (s8)
  • b1 (b1)
  • bf16 (bf16)
  • e4m3 (e4m3)
  • e5m2 (e5m2)
  • f32 (f32)
  • s32 (s32)

Parameters: 

ParameterC++ typeDescription
value::mlir::NVVM::WGMMATypesan enum of type WGMMATypes

Enums 

FPRoundingMode 

NVVM FPRoundingMode kind

Cases: 

SymbolValueString
NONE0none
RN1rn
RM2rm
RP3rp
RZ4rz
RNA5rna

LoadCacheModifierKind 

NVVM load cache modifier kind

Cases: 

SymbolValueString
CA0ca
CG1cg
CS2cs
LU3lu
CV4cv

MMAB1Op 

MMA binary operations

Cases: 

SymbolValueString
none0none
xor_popc1xor_popc
and_popc2and_popc

MMAFrag 

NVVM MMA frag type

Cases: 

SymbolValueString
a0a
b1b
c2c

MMAIntOverflow 

MMA overflow options

Cases: 

SymbolValueString
satfinite1satfinite
wrapped0wrapped

MMALayout 

NVVM MMA layout

Cases: 

SymbolValueString
row0row
col1col

MMATypes 

NVVM MMA types

Cases: 

SymbolValueString
f160f16
f321f32
tf322tf32
bf169bf16
s84s8
u83u8
s325s32
s48s4
u47u4
b16b1
f6410f64

MatchSyncKind 

NVVM match sync kind

Cases: 

SymbolValueString
any0any
all1all

MemScopeKind 

NVVM Memory Scope kind

Cases: 

SymbolValueString
CTA0cta
CLUSTER1cluster
GPU2gpu
SYS3sys

ProxyKind 

Proxy kind

Cases: 

SymbolValueString
alias0alias
async1async
async_global2async.global
async_shared3async.shared
TENSORMAP4tensormap
GENERIC5generic

ReduxKind 

NVVM redux kind

Cases: 

SymbolValueString
ADD1add
AND2and
MAX3max
MIN4min
OR5or
UMAX6umax
UMIN7umin
XOR8xor
FMIN9fmin
FMAX10fmax

SaturationMode 

NVVM SaturationMode kind

Cases: 

SymbolValueString
NONE0none
SATFINITE1satfinite

SetMaxRegisterAction 

NVVM set max register action

Cases: 

SymbolValueString
decrease1decrease
increase0increase

SharedSpace 

Shared memory space

Cases: 

SymbolValueString
shared_cta0cta
shared_cluster1cluster

ShflKind 

NVVM shuffle kind

Cases: 

SymbolValueString
bfly0bfly
up1up
down2down
idx3idx

TMAReduxKind 

NVVM TMA redux kind

Cases: 

SymbolValueString
ADD0add
MAX2max
MIN1min
INC3inc
DEC4dec
AND5and
OR6or
XOR7xor

TMAStoreMode 

NVVM TMA Store Mode

Cases: 

SymbolValueString
TILE0tile
IM2COL1im2col

Tcgen05CpMulticast 

Tcgen05 cp multicast

Cases: 

SymbolValueString
NONE0none
WARPX2_02_131warpx2_02_13
WARPX2_01_232warpx2_01_23
WARPX43warpx4

Tcgen05CpShape 

Tcgen05 cp shapes

Cases: 

SymbolValueString
SHAPE_128x256b0shape_128x256b
SHAPE_4x256b1shape_4x256b
SHAPE_128x128b2shape_128x128b
SHAPE_64x128b3shape_64x128b
SHAPE_32x128b4shape_32x128b

Tcgen05CpSrcFormat 

Tcgen05 cp source format

Cases: 

SymbolValueString
B6x16_P320b6x16_p32
B4x16_P641b4x16_p64

Tcgen05FenceKind 

NVVM Tcgen05 fence kind

Cases: 

SymbolValueString
BEFORE_THREAD_SYNC0before
AFTER_THREAD_SYNC1after

Tcgen05GroupKind 

NVVM Tcgen05 group kind

Cases: 

SymbolValueString
CTA_10cta_1
CTA_21cta_2

Tcgen05LdStShape 

Allowed 32-bit signless integer cases: 0, 1, 2, 3, 4

Cases: 

SymbolValueString
SHAPE_16X64B0shape_16x64b
SHAPE_16X128B1shape_16x128b
SHAPE_16X256B2shape_16x256b
SHAPE_32X32B3shape_32x32b
SHAPE_16X32BX24shape_16x32bx2

Tcgen05WaitKind 

NVVM Tcgen05 wait kind

Cases: 

SymbolValueString
LOAD0load
STORE1store

WGMMAScaleIn 

WGMMA overflow options

Cases: 

SymbolValueString
one1one
neg-1neg

WGMMAScaleOut 

WGMMA input predicate

Cases: 

SymbolValueString
zero0zero
one1one

WGMMATypes 

NVVM WGMMA types

Cases: 

SymbolValueString
f160f16
tf321tf32
u82u8
s83s8
b14b1
bf165bf16
e4m36e4m3
e5m27e5m2
f328f32
s329s32