'nvvm' Dialect

Operations ¶

`nvvm.bar.warp.sync` (NVVM::SyncWarpOp) ¶

Syntax:

operation ::= `nvvm.bar.warp.sync` $mask attr-dict `:` type($mask)

Operands: ¶

Operand	Description
`mask`	LLVM dialect-compatible type

`nvvm.barrier` (NVVM::BarrierOp) ¶

Syntax:

operation ::= `nvvm.barrier` (`id` `=` $barrierId^)? (`number_of_threads` `=` $numberOfThreads^)? attr-dict

Traits: AttrSizedOperandSegments

Operands: ¶

Operand	Description
`barrierId`	32-bit signless integer
`numberOfThreads`	32-bit signless integer

`nvvm.barrier.arrive` (NVVM::BarrierArriveOp) ¶

Syntax:

operation ::= `nvvm.barrier.arrive` (`id` `=` $barrierId^)? `number_of_threads` `=` $numberOfThreads attr-dict

Thread that executes this op announces their arrival at the barrier with given id and continue their execution.

The default barrier id is 0 that is similar to nvvm.barrier Op. When barrierId is not present, the default barrier id is used.

For more information, see PTX ISA

Interfaces: BasicPtxBuilderInterface

Operands: ¶

Operand	Description
`barrierId`	32-bit signless integer
`numberOfThreads`	32-bit signless integer

`nvvm.barrier0` (NVVM::Barrier0Op) ¶

Syntax:

operation ::= `nvvm.barrier0` attr-dict

`nvvm.breakpoint` (NVVM::Breakpoint) ¶

Breakpoint Op

Syntax:

operation ::= `nvvm.breakpoint` attr-dict

Breakpoint suspends execution of the program for debugging. For more information, see PTX ISA

`nvvm.cluster.arrive` (NVVM::ClusterArriveOp) ¶

Cluster Barrier Arrive Op

Syntax:

operation ::= `nvvm.cluster.arrive` attr-dict

The cluster.arrive can be used by the threads within the cluster for synchronization and communication. The cluster.arrive instruction marks the warps’ arrival at the barrier without causing the executing thread to wait for other participating threads.

The aligned attribute, when provided, generates the .aligned version of the PTX instruction.

For more information, see PTX ISA

Attributes: ¶

Attribute	MLIR Type	Description
`aligned`	::mlir::UnitAttr	unit attribute

`nvvm.cluster.arrive.relaxed` (NVVM::ClusterArriveRelaxedOp) ¶

Cluster Barrier Relaxed Arrive Op

Syntax:

operation ::= `nvvm.cluster.arrive.relaxed` attr-dict

The aligned attribute, when provided, generates the .aligned version of the PTX instruction. The .relaxed qualifier on cluster.arrive specifies that there are no memory ordering and visibility guarantees provided for the memory accesses performed prior to cluster.arrive.

For more information, see PTX ISA

Traits: NVVMRequiresSM<90>

Attributes: ¶

Attribute	MLIR Type	Description
`aligned`	::mlir::UnitAttr	unit attribute

`nvvm.cluster.wait` (NVVM::ClusterWaitOp) ¶

Cluster Barrier Wait Op

Syntax:

operation ::= `nvvm.cluster.wait` attr-dict

The cluster.wait causes the executing thread to wait for all non-exited threads of the cluster to perform cluster.arrive. The aligned attribute, when provided, generates the .aligned version of the PTX instruction.

For more information, see PTX ISA

Traits: NVVMRequiresSM<90>

Attributes: ¶

Attribute	MLIR Type	Description
`aligned`	::mlir::UnitAttr	unit attribute

`nvvm.convert.bf16x2.to.f8x2` (NVVM::ConvertBF16x2ToF8x2Op) ¶

Convert a pair of bf16 inputs to f8x2

Syntax:

operation ::= `nvvm.convert.bf16x2.to.f8x2` $type $a attr-dict `:` type($a) `->` type($dst)

This Op converts the given bf16 inputs in a bf16x2 vector to the specified f8 type. The result dst is represented as an i16 type or as a vector of two i8 types. If dst is returned as an i16 type, the converted values from a are packed such that the value converted from the first element of a is stored in the upper 8 bits of dst and the value converted from the second element of a is stored in the lower 8 bits of dst. If dst is returned as a vector type, each converted value is stored as an i8 element in the vector. The rnd and sat attributes specify the rounding and saturation modes respectively.

For more information, see PTX ISA

Attributes: ¶

Attribute	MLIR Type	Description
`type`	::mlir::NVVM::ConvertFP8TypeAttr	NVVM ConvertFP8Type kind
`rnd`	::mlir::NVVM::FPRoundingModeAttr	NVVM FPRoundingMode kind
`sat`	::mlir::NVVM::SaturationModeAttr	NVVM SaturationMode kind

Operands: ¶

Operand	Description
`a`	vector of bfloat16 type values of length 2

Results: ¶

Result	Description
`dst`	16-bit signless integer or vector of 8-bit signless integer values of length 2

`nvvm.convert.f16x2.to.f8x2` (NVVM::ConvertF16x2ToF8x2Op) ¶

Convert an f16x2 input to f8x2

Syntax:

operation ::= `nvvm.convert.f16x2.to.f8x2` $type $a attr-dict `:` type($a) `->` type($dst)

This Op converts the given f16 inputs in an f16x2 vector to the specified f8 type. The result dst is represented as an i16 type or as a vector of two i8 types. If dst is returned as an i16 type, the converted values from a are packed such that the value converted from the first element of a is stored in the upper 8 bits of dst and the value converted from the second element of a is stored in the lower 8 bits of dst. If dst is returned as a vector type, each converted value is stored as an i8 element in the vector. The relu attribute, when set, lowers to the ‘.relu’ variant of the cvt instruction.

For more information, see PTX ISA

Attributes: ¶

Attribute	MLIR Type	Description
`type`	::mlir::NVVM::ConvertFP8TypeAttr	NVVM ConvertFP8Type kind
`relu`	::mlir::BoolAttr	bool attribute

Operands: ¶

Operand	Description
`a`	vector of 16-bit float values of length 2

Results: ¶

Result	Description
`dst`	16-bit signless integer or vector of 8-bit signless integer values of length 2

`nvvm.convert.f32x2.to.f6x2` (NVVM::ConvertF32x2ToF6x2Op) ¶

Convert a pair of float inputs to f6x2

Syntax:

operation ::= `nvvm.convert.f32x2.to.f6x2` $type $a `,` $b attr-dict `:` type($dst)

This Op converts each of the given float inputs to the specified fp6 type. The result dst is represented either as an i16 type or as a vector of two i8 types. If dst is returned as an i16 type, the converted values are packed such that the value converted from a is stored in the upper 8 bits of dst with 2 MSB bits padded with zeros and the value converted from b is stored in the lower 8 bits of dst with 2 MSB bits padded with zeros. If dst is returned as a vector type, each converted value is stored as an i8 element in the vector. The relu attribute, when set, lowers to the ‘.relu’ variant of the cvt instruction.

For more information, see PTX ISA

Attributes: ¶

Attribute	MLIR Type	Description
`type`	::mlir::NVVM::ConvertFP6TypeAttr	NVVM ConvertFP6Type kind
`relu`	::mlir::BoolAttr	bool attribute

Operands: ¶

Operand	Description
`a`	32-bit float
`b`	32-bit float

Results: ¶

Result	Description
`dst`	16-bit signless integer or vector of 8-bit signless integer values of length 2

`nvvm.convert.f32x2.to.f8x2` (NVVM::ConvertF32x2ToF8x2Op) ¶

Convert a pair of float inputs to f8x2

Syntax:

operation ::= `nvvm.convert.f32x2.to.f8x2` $type $a `,` $b attr-dict `:` type($dst)

This Op converts each of the given float inputs to the specified fp8 type. The result dst is represented as an i16 type or as a vector of two i8 types. If dst is returned as an i16 type, the converted values are packed such that the value converted from a is stored in the upper 8 bits of dst and the value converted from b is stored in the lower 8 bits of dst. If dst is returned as a vector type, each converted value is stored as an i8 element in the vector. The rnd and sat attributes specify the rounding and saturation modes respectively. The relu attribute, when set, lowers to the ‘.relu’ variant of the cvt instruction.

For more information, see PTX ISA

Attributes: ¶

Attribute	MLIR Type	Description
`type`	::mlir::NVVM::ConvertFP8TypeAttr	NVVM ConvertFP8Type kind
`rnd`	::mlir::NVVM::FPRoundingModeAttr	NVVM FPRoundingMode kind
`sat`	::mlir::NVVM::SaturationModeAttr	NVVM SaturationMode kind
`relu`	::mlir::BoolAttr	bool attribute

Operands: ¶

Operand	Description
`a`	32-bit float
`b`	32-bit float

Results: ¶

Result	Description
`dst`	16-bit signless integer or vector of 8-bit signless integer values of length 2

`nvvm.convert.float.to.tf32` (NVVM::ConvertFloatToTF32Op) ¶

Convert the given float input to TF32

Syntax:

operation ::= `nvvm.convert.float.to.tf32` $src attr-dict

This Op converts the given f32 input to tf32. The result res is represented as an i32 type. The relu attribute, when set, lowers to the ‘.relu’ variant of the cvt instruction. The rnd and sat attributes specify the the rounding and saturation modes respectively.

For more information, see PTX ISA

Attributes: ¶

Attribute	MLIR Type	Description
`rnd`	::mlir::NVVM::FPRoundingModeAttr	NVVM FPRoundingMode kind
`sat`	::mlir::NVVM::SaturationModeAttr	NVVM SaturationMode kind
`relu`	::mlir::BoolAttr	bool attribute

Operands: ¶

Operand	Description
`src`	32-bit float

Results: ¶

Result	Description
`res`	32-bit signless integer

`nvvm.cp.async.bulk.commit.group` (NVVM::CpAsyncBulkCommitGroupOp) ¶

Syntax:

operation ::= `nvvm.cp.async.bulk.commit.group` attr-dict

This Op commits all prior initiated but uncommitted cp.async.bulk instructions into a cp.async.bulk-group.

For more information, see PTX ISA

`nvvm.cp.async.bulk.global.shared.cta` (NVVM::CpAsyncBulkSharedCTAToGlobalOp) ¶

Async bulk copy from Shared CTA memory to Global memory

Syntax:

operation ::= `nvvm.cp.async.bulk.global.shared.cta` $dstMem `,` $srcMem `,` $size
              (`l2_cache_hint` `=` $l2CacheHint^ )?
              (`byte_mask` `=` $byteMask^ )?
              attr-dict `:` type($dstMem) `,` type($srcMem)

Initiates an asynchronous copy operation from Shared CTA memory to global memory. The 32-bit operand size specifies the amount of memory to be copied, in terms of number of bytes. size must be a multiple of 16. The l2CacheHint operand is optional, and it is used to specify cache eviction policy that may be used during the memory access. The byteMask operand is optional. The i-th bit in the 16-bit wide byteMask specifies whether the i-th byte of each 16-byte wide chunk of source data is copied to the destination. If the bit is set, the byte is copied.

Example:

  nvvm.cp.async.bulk.global.shared.cta %dst, %src, %size
      : !llvm.ptr<1>, !llvm.ptr<3>

  // with l2_cache_hint
  nvvm.cp.async.bulk.global.shared.cta %dst, %src, %size l2_cache_hint = %ch
      : !llvm.ptr<1>, !llvm.ptr<3>

  // with byte_mask
  nvvm.cp.async.bulk.global.shared.cta %dst, %src, %size byte_mask = %mask
      : !llvm.ptr<1>, !llvm.ptr<3>

  // with both l2_cache_hint and byte_mask
  nvvm.cp.async.bulk.global.shared.cta %dst, %src, %size l2_cache_hint = %ch byte_mask = %mask
      : !llvm.ptr<1>, !llvm.ptr<3>

For more information, see PTX ISA

Traits: AttrSizedOperandSegments

Operands: ¶

Operand	Description
`dstMem`	LLVM pointer in address space 1
`srcMem`	LLVM pointer in address space 3
`size`	32-bit signless integer
`l2CacheHint`	64-bit signless integer
`byteMask`	16-bit signless integer

`nvvm.cp.async.bulk.prefetch` (NVVM::CpAsyncBulkPrefetchOp) ¶

Async bulk prefetch from global memory to L2 cache

Syntax:

operation ::= `nvvm.cp.async.bulk.prefetch` $srcMem `,` $size (`l2_cache_hint` `=` $l2CacheHint^ )?
              attr-dict  `:` type($srcMem)

Initiates an asynchronous prefetch of data from the location specified by srcMem to the L2 cache.

The l2CacheHint operand is optional, and it is used to specify cache eviction policy that may be used during the memory access.

Example:

  nvvm.cp.async.bulk.prefetch %src, %size : !llvm.ptr<1>

  // with l2_cache_hint
  nvvm.cp.async.bulk.prefetch %src, %size l2_cache_hint = %ch : !llvm.ptr<1>

For more information, see PTX ISA

Operands: ¶

Operand	Description
`srcMem`	LLVM pointer in address space 1
`size`	32-bit signless integer
`l2CacheHint`	64-bit signless integer

`nvvm.cp.async.bulk.shared.cluster.global` (NVVM::CpAsyncBulkGlobalToSharedClusterOp) ¶

Async bulk copy from global memory to Shared cluster memory

Syntax:

operation ::= `nvvm.cp.async.bulk.shared.cluster.global` $dstMem `,` $srcMem `,` $mbar `,` $size
              (`multicast_mask` `=` $multicastMask^ )?
              (`l2_cache_hint` `=` $l2CacheHint^ )?
              attr-dict  `:` type($dstMem) `,` type($srcMem)

Initiates an asynchronous copy operation from global memory to cluster’s shared memory.

The multicastMask operand is optional. When it is present, the Op copies data from global memory to shared memory of multiple CTAs in the cluster. Operand multicastMask specifies the destination CTAs in the cluster such that each bit position in the 16-bit multicastMask operand corresponds to the nvvm.read.ptx.sreg.ctaid of the destination CTA.

The l2CacheHint operand is optional, and it is used to specify cache eviction policy that may be used during the memory access.

For more information, see PTX ISA

Traits: AttrSizedOperandSegments

Operands: ¶

Operand	Description
`dstMem`	LLVM pointer in address space 7
`srcMem`	LLVM pointer in address space 1
`mbar`	LLVM pointer in address space 3
`size`	32-bit signless integer
`multicastMask`	16-bit signless integer
`l2CacheHint`	64-bit signless integer

`nvvm.cp.async.bulk.shared.cluster.shared.cta` (NVVM::CpAsyncBulkSharedCTAToSharedClusterOp) ¶

Async bulk copy from Shared CTA memory to Shared cluster memory

Syntax:

operation ::= `nvvm.cp.async.bulk.shared.cluster.shared.cta` $dstMem `,` $srcMem `,` $mbar `,` $size
              attr-dict  `:` type($dstMem) `,` type($srcMem)

Initiates an asynchronous copy operation from Shared CTA memory to Shared cluster memory.

For more information, see PTX ISA

Operands: ¶

Operand	Description
`dstMem`	LLVM pointer in address space 7
`srcMem`	LLVM pointer in address space 3
`mbar`	LLVM pointer in address space 3
`size`	32-bit signless integer

`nvvm.cp.async.bulk.tensor.global.shared.cta` (NVVM::CpAsyncBulkTensorSharedCTAToGlobalOp) ¶

Syntax:

operation ::= `nvvm.cp.async.bulk.tensor.global.shared.cta` $tmaDescriptor `,`
              $srcMem `,`
              `box` `[`$coordinates `]`
              (`,` `predicate` `=` $predicate^)?
              attr-dict  `:` type(operands)

Traits: AttrSizedOperandSegments

Interfaces: BasicPtxBuilderInterface

Operands: ¶

Operand	Description
`tmaDescriptor`	LLVM pointer type
`srcMem`	LLVM pointer in address space 3
`coordinates`	variadic of 32-bit signless integer
`predicate`	1-bit signless integer

`nvvm.cp.async.bulk.tensor.prefetch` (NVVM::CpAsyncBulkTensorPrefetchOp) ¶

Syntax:

operation ::= `nvvm.cp.async.bulk.tensor.prefetch` $tmaDescriptor `,`
              `box` `[`$coordinates `]`
              (`im2col` `[` $im2colOffsets^ `]` )?
              (`l2_cache_hint` `=` $l2CacheHint^ )?
              attr-dict  `:` type($tmaDescriptor)

Initiates an asynchronous prefetch operation on the tensor data from global memory to L2 cache.

The Op has two modes:

Tiled Mode: It’s the default mode. The source multi-dimensional tensor layout is preserved at the destination.
Im2col Mode: This mode is used when im2colOffsets operands are present. the elements in the Bounding Box of the source tensor are rearranged into columns at the destination. In this mode, the tensor has to be at least 3-dimensional.

The l2CacheHint operand is optional, and it is used to specify cache eviction policy that may be used during the memory access.

For more information, see PTX ISA

Traits: AttrSizedOperandSegments

Operands: ¶

Operand	Description
`tmaDescriptor`	LLVM pointer type
`coordinates`	variadic of 32-bit signless integer
`im2colOffsets`	variadic of 16-bit signless integer
`l2CacheHint`	64-bit signless integer

`nvvm.cp.async.bulk.tensor.reduce` (NVVM::CpAsyncBulkTensorReduceOp) ¶

Syntax:

operation ::= `nvvm.cp.async.bulk.tensor.reduce` $tmaDescriptor `,`
              $srcMem `,`
              `box` `[`$coordinates `]`
              (`l2_cache_hint` `=` $l2CacheHint^ )?
              attr-dict  `:` type($tmaDescriptor) `,` type($srcMem)

Initiates an asynchronous reduction operation of tensor data in global memory with tensor data in shared memory.

The mode attribute indicates whether the copy mode is tile or im2col. The redOp attribute specifies the reduction operations applied. The supported reduction operations are: {add, min, max, inc, dec, and, or, xor}

The l2CacheHint operand is optional, and it is used to specify cache eviction policy that may be used during the memory access.

For more information, see PTX ISA

Traits: AttrSizedOperandSegments

Attributes: ¶

Attribute	MLIR Type	Description
`redKind`	::mlir::NVVM::TMAReduxKindAttr	NVVM TMA redux kind
`mode`	::mlir::NVVM::TMAStoreModeAttr	NVVM TMA Store Mode

Operands: ¶

Operand	Description
`tmaDescriptor`	LLVM pointer type
`srcMem`	LLVM pointer in address space 3
`coordinates`	variadic of 32-bit signless integer
`l2CacheHint`	64-bit signless integer

`nvvm.cp.async.bulk.tensor.shared.cluster.global` (NVVM::CpAsyncBulkTensorGlobalToSharedClusterOp) ¶

Syntax:

operation ::= `nvvm.cp.async.bulk.tensor.shared.cluster.global` $dstMem `,`
              $tmaDescriptor `,`
              $mbar `,`
              `box` `[`$coordinates `]`
              (`im2col` `[` $im2colOffsets^ `]` )?
              (`multicast_mask` `=` $multicastMask^ )?
              (`l2_cache_hint` `=` $l2CacheHint^ )?
              (`predicate` `=` $predicate^)?
              attr-dict  `:` type($dstMem) `,` type($tmaDescriptor)

Initiates an asynchronous copy operation on the tensor data from global memory to shared memory.

The Op operates has two load modes:

Tiled Mode: It’s the default mode. The source multi-dimensional tensor layout is preserved at the destination.
Im2col Mode: This mode is used when im2colOffsets operands are present. the elements in the Bounding Box of the source tensor are rearranged into columns at the destination. In this mode, the tensor has to be at least 3-dimensional.

The l2CacheHint operand is optional, and it is used to specify cache eviction policy that may be used during the memory access.

For more information, see PTX ISA

Traits: AttrSizedOperandSegments, NVVMRequiresSM<90>

Interfaces: BasicPtxBuilderInterface

Operands: ¶

Operand	Description
`dstMem`	LLVM pointer in address space 3
`tmaDescriptor`	LLVM pointer type
`coordinates`	variadic of 32-bit signless integer
`mbar`	LLVM pointer in address space 3
`im2colOffsets`	variadic of 16-bit signless integer
`multicastMask`	16-bit signless integer
`l2CacheHint`	64-bit signless integer
`predicate`	1-bit signless integer

`nvvm.cp.async.bulk.wait_group` (NVVM::CpAsyncBulkWaitGroupOp) ¶

Syntax:

operation ::= `nvvm.cp.async.bulk.wait_group` $group attr-dict

Op waits for completion of the most recent bulk async-groups.

The $group operand tells waiting has to be done until for $group or fewer of the most recent bulk async-groups. If $group is 0, the op wait until all the most recent bulk async-groups have completed.

The $read indicates that the waiting has to be done until all the bulk async operations in the specified bulk async-group have completed reading from their source locations.

For more information, see PTX ISA

Traits: NVVMRequiresSM<90>

Attributes: ¶

Attribute	MLIR Type	Description
`group`	::mlir::IntegerAttr	32-bit signless integer attribute whose minimum value is 0
`read`	::mlir::UnitAttr	unit attribute

`nvvm.cp.async.commit.group` (NVVM::CpAsyncCommitGroupOp) ¶

Syntax:

operation ::= `nvvm.cp.async.commit.group` attr-dict

`nvvm.cp.async.mbarrier.arrive` (NVVM::CpAsyncMBarrierArriveOp) ¶

NVVM Dialect Op for cp.async.mbarrier.arrive

Syntax:

operation ::= `nvvm.cp.async.mbarrier.arrive` $addr attr-dict `:` type(operands)

The cp.async.mbarrier.arrive Op makes the mbarrier object track all prior cp.async operations initiated by the executing thread. The addr operand specifies the address of the mbarrier object in generic address space. The noinc attr impacts how the mbarrier’s state is updated.

For more information, see PTX ISA

Attributes: ¶

Attribute	MLIR Type	Description
`noinc`	::mlir::IntegerAttr	1-bit signless integer attribute

Operands: ¶

Operand	Description
`addr`	LLVM pointer type

`nvvm.cp.async.mbarrier.arrive.shared` (NVVM::CpAsyncMBarrierArriveSharedOp) ¶

NVVM Dialect Op for cp.async.mbarrier.arrive.shared

Syntax:

operation ::= `nvvm.cp.async.mbarrier.arrive.shared` $addr attr-dict `:` type(operands)

The cp.async.mbarrier.arrive.shared Op makes the mbarrier object track all prior cp.async operations initiated by the executing thread. The addr operand specifies the address of the mbarrier object in shared memory. The noinc attr impacts how the mbarrier’s state is updated.

For more information, see PTX ISA

Attributes: ¶

Attribute	MLIR Type	Description
`noinc`	::mlir::IntegerAttr	1-bit signless integer attribute

Operands: ¶

Operand	Description
`addr`	LLVM pointer in address space 3

`nvvm.cp.async.shared.global` (NVVM::CpAsyncOp) ¶

Syntax:

operation ::= `nvvm.cp.async.shared.global` $dst `,` $src `,` $size `,` `cache` `=` $modifier (`,` $cpSize^)? attr-dict `:` type(operands)

Attributes: ¶

Attribute	MLIR Type	Description
`size`	::mlir::IntegerAttr	32-bit signless integer attribute
`modifier`	::mlir::NVVM::LoadCacheModifierKindAttr	NVVM load cache modifier kind `Enum attribute of the different kinds of cache operators for load instructions.` `For more information, see PTX ISA`

Operands: ¶

Operand	Description
`dst`	LLVM pointer in address space 3
`src`	LLVM pointer in address space 1
`cpSize`	LLVM dialect-compatible type

`nvvm.cp.async.wait.group` (NVVM::CpAsyncWaitGroupOp) ¶

Syntax:

operation ::= `nvvm.cp.async.wait.group` $n attr-dict

Attributes: ¶

Attribute	MLIR Type	Description
`n`	::mlir::IntegerAttr	32-bit signless integer attribute

`nvvm.dot.accumulate.2way` (NVVM::DotAccumulate2WayOp) ¶

Two-way 16-bit to 8-bit dot product-accumulate instruction

Syntax:

operation ::= `nvvm.dot.accumulate.2way` $a $a_type `,` $b $b_type `,` $c attr-dict `:` type($a) `,` type($b)

Performs a two-way 16-bit to 8-bit dot-product which is accumulated in a 32-bit result. Operand a is a vector of two 16-bit elements and operand b a vector of four 8-bit elements between which the dot product is computed.

The a_type and b_type attributes specify the type of the elements in a and b respectively. If a_type or b_type is s, then the elements in the corresponding vector are sign-extended to 32-bit before the dot product is computed. If a_type or b_type is u, then the elements in the corresponding vector are zero-extended to 32-bit instead.

The b_hi boolean attribute specifies which two bytes of b are used for the dot product. If b_hi is true, then the dot product is computed between a and elements at indices 2 and 3 of b. If b_hi is false, then the dot product is computed between a and elements at indices 0 and 1 of b.

Operand c is a 32-bit integer to which the result is accumulated. It is treated as holding a signed integer if any of a_type or b_type is signed.

For more information, see PTX ISA

Attributes: ¶

Attribute	MLIR Type	Description
`a_type`	::mlir::NVVM::DotAccumulateTypeAttr	NVVM DotAccumulateType
`b_type`	::mlir::NVVM::DotAccumulateTypeAttr	NVVM DotAccumulateType
`b_hi`	::mlir::BoolAttr	bool attribute

Operands: ¶

Operand	Description
`a`	vector of 16-bit signless integer values of length 2
`b`	vector of 8-bit signless integer values of length 4
`c`	32-bit signless integer

Results: ¶

Result	Description
`res`	32-bit signless integer

`nvvm.dot.accumulate.4way` (NVVM::DotAccumulate4WayOp) ¶

Four-way byte dot product-accumulate instruction

Syntax:

operation ::= `nvvm.dot.accumulate.4way` $a $a_type `,` $b $b_type `,` $c attr-dict `:` type($a) `,` type($b)

Performs a four-way byte dot-product which is accumulated in a 32-bit result. Operand a and b are vectors of 4 bytes between which the dot product is computed.

The a_type and b_type attributes specify the type of the elements in a and b respectively. If a_type or b_type is signed, then the elements in the corresponding vector are sign-extended to 32-bit before the dot product is computed. If a_type or b_type is unsigned, then the elements in the corresponding vector are zero-extended to 32-bit instead.

Operand c is a 32-bit integer to which the result is accumulated. It is treated as holding a signed integer if any of a_type or b_type is s8.

For more information, see PTX ISA

Attributes: ¶

Attribute	MLIR Type	Description
`a_type`	::mlir::NVVM::DotAccumulateTypeAttr	NVVM DotAccumulateType
`b_type`	::mlir::NVVM::DotAccumulateTypeAttr	NVVM DotAccumulateType

Operands: ¶

Operand	Description
`a`	vector of 8-bit signless integer values of length 4
`b`	vector of 8-bit signless integer values of length 4
`c`	32-bit signless integer

Results: ¶

Result	Description
`res`	32-bit signless integer

`nvvm.elect.sync` (NVVM::ElectSyncOp) ¶

Elect one leader thread

Syntax:

operation ::= `nvvm.elect.sync` attr-dict `->` type(results)

The elect.sync instruction elects one predicated active leader thread from among a set of threads specified in membermask. The membermask is set to 0xFFFFFFFF for the current version of this Op. The predicate result is set to True for the leader thread, and False for all other threads.

For more information, see PTX ISA

Results: ¶

Result	Description
`pred`	1-bit signless integer

`nvvm.exit` (NVVM::Exit) ¶

Exit Op

Syntax:

operation ::= `nvvm.exit` attr-dict

Ends execution of a thread. For more information, see PTX ISA

`nvvm.fence.mbarrier.init` (NVVM::FenceMbarrierInitOp) ¶

Syntax:

operation ::= `nvvm.fence.mbarrier.init` attr-dict

Fence operation that applies on the prior nvvm.mbarrier.init

For more information, see PTX ISA

Interfaces: BasicPtxBuilderInterface

`nvvm.fence.proxy` (NVVM::FenceProxyOp) ¶

Syntax:

operation ::= `nvvm.fence.proxy` attr-dict

Fence operation with proxy to establish an ordering between memory accesses that may happen through different proxies.

For more information, see PTX ISA

Interfaces: BasicPtxBuilderInterface

Attributes: ¶

Attribute	MLIR Type	Description
`kind`	::mlir::NVVM::ProxyKindAttr	Proxy kind
`space`	::mlir::NVVM::SharedSpaceAttr	Shared memory space

`nvvm.fence.proxy.acquire` (NVVM::FenceProxyAcquireOp) ¶

Uni-directional proxy fence operation with acquire semantics

Syntax:

operation ::= `nvvm.fence.proxy.acquire` $scope $addr `,` $size (`from_proxy` `=` $fromProxy^)? (`to_proxy` `=` $toProxy^)? attr-dict

fence.proxy.acquire is a uni-directional fence used to establish ordering between a prior memory access performed via the generic proxy and a subsequent memory access performed via the tensormap proxy

The address operand addr and the operand size together specify the memory range [addr, addr+size) on which the ordering guarantees on the memory accesses across the proxies is to be provided. The only supported value for the size operand is 128 and must be an immediate. Generic Addressing is used unconditionally, and the address specified by the operand addr must fall within the .global state space. Otherwise, the behavior is undefined

For more information, see PTX ISA

Attributes: ¶

Attribute	MLIR Type	Description
`scope`	::mlir::NVVM::MemScopeKindAttr	NVVM Memory Scope kind
`fromProxy`	::mlir::NVVM::ProxyKindAttr	Proxy kind
`toProxy`	::mlir::NVVM::ProxyKindAttr	Proxy kind

Operands: ¶

Operand	Description
`addr`	LLVM pointer in address space 0
`size`	32-bit signless integer

`nvvm.fence.proxy.release` (NVVM::FenceProxyReleaseOp) ¶

Uni-directional proxy fence operation with release semantics

Syntax:

operation ::= `nvvm.fence.proxy.release` $scope (`from_proxy` `=` $fromProxy^)? (`to_proxy` `=` $toProxy^)? attr-dict

fence.proxy.release is a uni-directional fence used to establish ordering between a prior memory access performed via the generic proxy and a subsequent memory access performed via the tensormap proxy. fence.proxy.release operation can form a release sequence that synchronizes with an acquire sequence that contains the fence.proxy.acquire proxy fence operation

For more information, see PTX ISA

Attributes: ¶

Attribute	MLIR Type	Description
`scope`	::mlir::NVVM::MemScopeKindAttr	NVVM Memory Scope kind
`fromProxy`	::mlir::NVVM::ProxyKindAttr	Proxy kind
`toProxy`	::mlir::NVVM::ProxyKindAttr	Proxy kind

`nvvm.fence.sc.cluster` (NVVM::FenceScClusterOp) ¶

Syntax:

operation ::= `nvvm.fence.sc.cluster` attr-dict

`nvvm.griddepcontrol.launch.dependents` (NVVM::GriddepcontrolLaunchDependentsOp) ¶

Syntax:

operation ::= `nvvm.griddepcontrol.launch.dependents` attr-dict

Signals that specific dependents the runtime system designated to react to this instruction can be scheduled as soon as all other CTAs in the grid issue the same instruction or have completed.

For more information, see PTX ISA

`nvvm.griddepcontrol.wait` (NVVM::GriddepcontrolWaitOp) ¶

Syntax:

operation ::= `nvvm.griddepcontrol.wait` attr-dict

Causes the executing thread to wait until all prerequisite grids in flight have completed and all the memory operations from the prerequisite grids are performed and made visible to the current grid.

For more information, see PTX ISA

`nvvm.inline_ptx` (NVVM::InlinePtxOp) ¶

Inline PTX Op

Syntax:

operation ::= `nvvm.inline_ptx` $ptxCode `(` $readOnlyArgs `)`
              (`,` `predicate` `=` $predicate^)? attr-dict
              `:` type(operands)
              (`->` type($writeOnlyArgs)^)?

This op allows using PTX directly within the NVVM dialect, while greatly simplifying llvm.inline_asm generation. It automatically handles register size selection and sets the correct read/write access for each operand. The operation leverages the BasicPtxBuilderInterface to abstract away low-level details of PTX assembly formatting.

The `predicate` attribute is used to specify a predicate for the 
PTX instruction.

Example 1: Read-only Parameters
```mlir
nvvm.inline_ptx "mbarrier.init.b64 [$0], $1;" (%barrier_gen, %count) : !llvm.ptr, i32

// Lowers to:
llvm.inline_asm has_side_effects asm_dialect = att 
  "mbarrier.init.b64 [$0], $1;", "l,r" %arg0, %arg2 : (!llvm.ptr, i32) -> ()
```

Example 2: Read-only and Write-only Parameters
```mlir
%0 = nvvm.inline_ptx "ex2.approx.ftz.f32 $0, $1;" (%input) : f32 -> f32

// Lowers to:
%0 = llvm.inline_asm has_side_effects asm_dialect = att 
  "ex2.approx.ftz.f32 $0, $1;", "=f,f" %arg0 : (f32) -> f32
```

Example 3: Predicate Usage
```mlir
nvvm.inline_ptx "mbarrier.init.b64 [$0], $1;" (%barrier_gen, %count), 
  predicate = %pred : !llvm.ptr, i32, i1

// Lowers to:
llvm.inline_asm has_side_effects asm_dialect = att 
  "@$2 mbarrier.init.b64 [$0], $1;", "l,r,b" %arg0, %arg2, %arg3 
  : (!llvm.ptr, i32, i1) -> ()
```

Traits: AttrSizedOperandSegments

Interfaces: BasicPtxBuilderInterface

Attributes: ¶

Attribute	MLIR Type	Description
`ptxCode`	::mlir::StringAttr	string attribute

Operands: ¶

Operand	Description
`readOnlyArgs`	variadic of any type
`predicate`	1-bit signless integer

Results: ¶

Result	Description
`writeOnlyArgs`	variadic of any type

`nvvm.ldmatrix` (NVVM::LdMatrixOp) ¶

Cooperative matrix load

Syntax:

operation ::= `nvvm.ldmatrix` $ptr attr-dict `:` functional-type($ptr, $res)

Attributes: ¶

Attribute	MLIR Type	Description
`num`	::mlir::IntegerAttr	32-bit signless integer attribute
`layout`	::mlir::NVVM::MMALayoutAttr	NVVM MMA layout

Operands: ¶

Operand	Description
`ptr`	LLVM pointer type

Results: ¶

Result	Description
`res`	any type

`nvvm.mapa` (NVVM::MapaOp) ¶

Syntax:

operation ::= `nvvm.mapa` $a`,` $b attr-dict `:` type($a) `->` type($res)

Traits: NVVMRequiresSM<90>

Operands: ¶

Operand	Description
`a`	LLVM pointer in address space 0 or LLVM pointer in address space 3
`b`	32-bit signless integer

Results: ¶

Result	Description
`res`	LLVM pointer in address space 0 or LLVM pointer in address space 3

`nvvm.match.sync` (NVVM::MatchSyncOp) ¶

Broadcast and compare a value across threads in warp

Syntax:

operation ::= `nvvm.match.sync` $kind $thread_mask `,` $val attr-dict `:` type($val) `->` type($res)

The match.sync op performs broadcast and compare of operand val across all non-exited threads in thread_mask and returns a mask depending on the kind and an optional predicate.

The matching operation kinds are:

any: Returns a mask corresponding to the non-exited threads in the thread_mask that have the same value of operand val.
all: Returns a mask and a predicate. If all non-exited threads in the thread_mask have the same value of operand val, the predicate is set to true and the mask corresponds to the non-exited threads in the thread_mask. Otherwise, the predicate is set to false and the mask is 0.

For more information, see PTX ISA

Attributes: ¶

Attribute	MLIR Type	Description
`kind`	::mlir::NVVM::MatchSyncKindAttr	NVVM match sync kind

Operands: ¶

Operand	Description
`thread_mask`	32-bit signless integer
`val`	32-bit signless integer or 64-bit signless integer

Results: ¶

Result	Description
`res`	32-bit signless integer or LLVM struct type

`nvvm.mbarrier.arrive` (NVVM::MBarrierArriveOp) ¶

Syntax:

operation ::= `nvvm.mbarrier.arrive` $addr attr-dict `:` type($addr) `->` type($res)

Operands: ¶

Operand	Description
`addr`	LLVM pointer type

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.mbarrier.arrive.expect_tx` (NVVM::MBarrierArriveExpectTxOp) ¶

Syntax:

operation ::= `nvvm.mbarrier.arrive.expect_tx` $addr `,` $txcount (`,` `predicate` `=` $predicate^)? attr-dict `:` type(operands)

Interfaces: BasicPtxBuilderInterface

Operands: ¶

Operand	Description
`addr`	LLVM pointer type
`txcount`	32-bit signless integer
`predicate`	1-bit signless integer

`nvvm.mbarrier.arrive.expect_tx.shared` (NVVM::MBarrierArriveExpectTxSharedOp) ¶

Syntax:

operation ::= `nvvm.mbarrier.arrive.expect_tx.shared` $addr `,` $txcount (`,` `predicate` `=` $predicate^)? attr-dict `:` type(operands)

Interfaces: BasicPtxBuilderInterface

Operands: ¶

Operand	Description
`addr`	LLVM pointer in address space 3
`txcount`	32-bit signless integer
`predicate`	1-bit signless integer

`nvvm.mbarrier.arrive.nocomplete` (NVVM::MBarrierArriveNocompleteOp) ¶

Syntax:

operation ::= `nvvm.mbarrier.arrive.nocomplete` $addr `,` $count attr-dict `:` type(operands) `->` type($res)

Operands: ¶

Operand	Description
`addr`	LLVM pointer type
`count`	32-bit signless integer

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.mbarrier.arrive.nocomplete.shared` (NVVM::MBarrierArriveNocompleteSharedOp) ¶

Syntax:

operation ::= `nvvm.mbarrier.arrive.nocomplete.shared` $addr `,` $count attr-dict `:` type(operands) `->` type($res)

Operands: ¶

Operand	Description
`addr`	LLVM pointer in address space 3
`count`	32-bit signless integer

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.mbarrier.arrive.shared` (NVVM::MBarrierArriveSharedOp) ¶

Syntax:

operation ::= `nvvm.mbarrier.arrive.shared` $addr attr-dict `:` qualified(type($addr)) `->` type($res)

Operands: ¶

Operand	Description
`addr`	LLVM pointer in address space 3

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.mbarrier.init` (NVVM::MBarrierInitOp) ¶

Syntax:

operation ::= `nvvm.mbarrier.init` $addr `,` $count (`,` `predicate` `=` $predicate^)? attr-dict `:` type(operands)

Interfaces: BasicPtxBuilderInterface

Operands: ¶

Operand	Description
`addr`	LLVM pointer type
`count`	32-bit signless integer
`predicate`	1-bit signless integer

`nvvm.mbarrier.init.shared` (NVVM::MBarrierInitSharedOp) ¶

Syntax:

operation ::= `nvvm.mbarrier.init.shared` $addr `,` $count (`,` `predicate` `=` $predicate^)? attr-dict `:` type(operands)

Traits: NVVMRequiresSM<80>

Interfaces: BasicPtxBuilderInterface

Operands: ¶

Operand	Description
`addr`	LLVM pointer in address space 3
`count`	32-bit signless integer
`predicate`	1-bit signless integer

`nvvm.mbarrier.inval` (NVVM::MBarrierInvalOp) ¶

Syntax:

operation ::= `nvvm.mbarrier.inval` $addr attr-dict `:` type(operands)

Operands: ¶

Operand	Description
`addr`	LLVM pointer type

`nvvm.mbarrier.inval.shared` (NVVM::MBarrierInvalSharedOp) ¶

Syntax:

operation ::= `nvvm.mbarrier.inval.shared` $addr attr-dict `:` type(operands)

Operands: ¶

Operand	Description
`addr`	LLVM pointer in address space 3

`nvvm.mbarrier.test.wait` (NVVM::MBarrierTestWaitOp) ¶

Syntax:

operation ::= `nvvm.mbarrier.test.wait` $addr `,` $state attr-dict `:` type(operands) `->` type($res)

Operands: ¶

Operand	Description
`addr`	LLVM pointer type
`state`	LLVM dialect-compatible type

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.mbarrier.test.wait.shared` (NVVM::MBarrierTestWaitSharedOp) ¶

Syntax:

operation ::= `nvvm.mbarrier.test.wait.shared` $addr `,` $state attr-dict `:` type(operands) `->` type($res)

Operands: ¶

Operand	Description
`addr`	LLVM pointer in address space 3
`state`	LLVM dialect-compatible type

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.mbarrier.try_wait.parity` (NVVM::MBarrierTryWaitParityOp) ¶

Syntax:

operation ::= `nvvm.mbarrier.try_wait.parity` $addr `,` $phase `,` $ticks attr-dict `:` type(operands)

Interfaces: BasicPtxBuilderInterface

Operands: ¶

Operand	Description
`addr`	LLVM pointer type
`phase`	32-bit signless integer
`ticks`	32-bit signless integer

`nvvm.mbarrier.try_wait.parity.shared` (NVVM::MBarrierTryWaitParitySharedOp) ¶

Syntax:

operation ::= `nvvm.mbarrier.try_wait.parity.shared` $addr `,` $phase `,` $ticks attr-dict `:` type(operands)

Interfaces: BasicPtxBuilderInterface

Operands: ¶

Operand	Description
`addr`	LLVM pointer in address space 3
`phase`	32-bit signless integer
`ticks`	32-bit signless integer

`nvvm.mma.sync` (NVVM::MmaOp) ¶

Cooperative matrix-multiply and accumulate

The nvvm.mma.sync operation collectively performs the operation D = matmul(A, B) + C using all threads in a warp.

All the threads in the warp must execute the same mma.sync operation.

For each possible multiplicand PTX data type, there are one or more possible instruction shapes given as “mMnNkK”. The below table describes the posssibilities as well as the types required for the operands. Note that the data type for C (the accumulator) and D (the result) can vary independently when there are multiple possibilities in the “C/D Type” column.

When an optional attribute cannot be immediately inferred from the types of the operands and the result during parsing or validation, an error will be raised.

b1Op is only relevant when the binary (b1) type is given to multiplicandDataType. It specifies how the multiply-and-acumulate is performed and is either xor_popc or and_poc. The default is xor_popc.

intOverflowBehavior is only relevant when the multiplicandType attribute is one of u8, s8, u4, s4, this attribute describes how overflow is handled in the accumulator. When the attribute is satfinite, the accumulator values are clamped in the int32 range on overflow. This is the default behavior. Alternatively, accumulator behavior wrapped can also be specified, in which case overflow wraps from one end of the range to the other.

layoutA and layoutB are required and should generally be set to #nvvm.mma_layout<row> and #nvvm.mma_layout<col> respectively, but other combinations are possible for certain layouts according to the table below.

| A/B Type | Shape     | ALayout | BLayout | A Type   | B Type   | C/D Type          |
|----------|-----------|---------|---------|----------|----------|-------------------|
| f64      | .m8n8k4   | row     | col     | 1x f64   | 1x f64   | 2x f64            |
| f16      | .m8n8k4   | row/col | row/col | 2x f16x2 | 2x f16x2 | 4x f16x2 or 8xf32 |
|          | .m16n8k8  | row     | col     | 2x f16x2 | 1x f16x2 | 2x f16x2 or 4 f32 |
|          | .m16n8k16 | row     | col     | 4x f16x2 | 2x f16x2 | 2x f16x2 or 4 f32 |
| bf16     | .m16n8k8  | row     | col     | 2x i32   | 1x i32   | 4x f32            |
|          | .m16n8k16 | row     | col     | 4x i32   | 2x i32   | 4x f32            |
| tf32     | .m16n8k4  | row     | col     | 2x i32   | 1x i32   | 4x f32            |
|          | .m16n8k8  | row     | col     | 4x i32   | 2x i32   | 2x f16x2 or 4 f32 |
| u8/s8    | .m8n8k16  | row     | col     | 1x i32   | 1x i32   | 2x i32            |
|          | .m16n8k16 | row     | col     | 2x i32   | 1x i32   | 4x i32            |
|          | .m16n8k32 | row     | col     | 4x i32   | 2x i32   | 4x i32            |
| u4/s4    | .m8n8k32  | row     | col     | 1x i32   | 1x i32   | 2x i32            |
|          | m16n8k32  | row     | col     | 2x i32   | 1x i32   | 4x i32            |
|          | m16n8k64  | row     | col     | 4x i32   | 2x i32   | 4x i32            |
| b1       | m8n8k128  | row     | col     | 1x i32   | 1x i32   | 2x i32            |
|          | m16n8k128 | row     | col     | 2x i32   | 1x i32   | 4x i32            |

Example:


%128 = nvvm.mma.sync A[%120, %121, %122, %123]
                     B[%124, %125]
                     C[%126, %127]
                     {layoutA = #nvvm.mma_layout<row>,
                      layoutB = #nvvm.mma_layout<col>,
                      shape = {k = 16 : i32, m = 16 : i32, n = 8 : i32}}
    : (vector<2xf16>, vector<2xf16>, vector<2xf16>)
       -> !llvm.struct<(vector<2xf16>, vector<2xf16>)>

Traits: AttrSizedOperandSegments

Attributes: ¶

Attribute	MLIR Type	Description
`shape`	::mlir::NVVM::MMAShapeAttr	Attribute for MMA operation shape.
`b1Op`	::mlir::NVVM::MMAB1OpAttr	MMA binary operations
`intOverflowBehavior`	::mlir::NVVM::MMAIntOverflowAttr	MMA overflow options
`layoutA`	::mlir::NVVM::MMALayoutAttr	NVVM MMA layout
`layoutB`	::mlir::NVVM::MMALayoutAttr	NVVM MMA layout
`multiplicandAPtxType`	::mlir::NVVM::MMATypesAttr	NVVM MMA types
`multiplicandBPtxType`	::mlir::NVVM::MMATypesAttr	NVVM MMA types

Operands: ¶

Operand	Description
`operandA`	variadic of LLVM dialect-compatible type
`operandB`	variadic of LLVM dialect-compatible type
`operandC`	variadic of LLVM dialect-compatible type

Results: ¶

Result	Description
`res`	LLVM structure type

`nvvm.prefetch` (NVVM::PrefetchOp) ¶

Brings the cache line containing an address into the specified cache level

Syntax:

operation ::= `nvvm.prefetch` `level` `=` $cacheLevel (`uniform` $uniform^)? `,` $addr (`,` `evict_priority` `=` $evictPriority^)? attr-dict `:` type($addr)

Operand addr can be a global, local or generic address pointer. No operation is performed if addr maps to a shared memory location.

The cacheLevel attribute specifies the cache level to which the cache line containing the specified address is brought.

uniform can be specified after the cacheLevel to indicate that the prefetch is performed to the specified uniform cache level. If uniform is specified, addr must be a generic address pointer and no operation is performed if addr maps to a const, local, or shared memory location.

The evictPriority attribute is optional and specifies the cache eviction priority when cacheLevel is L2.

For more information, see PTX ISA

Attributes: ¶

Attribute	MLIR Type	Description
`cacheLevel`	::mlir::NVVM::PrefetchCacheLevelAttr	NVVM Prefetch Cache Level Enum cases: L1 (`L1`) L2 (`L2`)
`uniform`	::mlir::UnitAttr	unit attribute
`evictPriority`	::mlir::NVVM::CacheEvictionPriorityAttr	NVVM Cache Eviction Priority Enum cases: evict_normal (`EvictNormal`) evict_first (`EvictFirst`) evict_last (`EvictLast`) evict_unchanged (`EvictUnchanged`) no_allocate (`NoAllocate`)

Operands: ¶

Operand	Description
`addr`	LLVM pointer in address space 1 or LLVM pointer in address space 5 or LLVM pointer in address space 0

`nvvm.prefetch.tensormap` (NVVM::PrefetchTensorMapOp) ¶

Syntax:

operation ::= `nvvm.prefetch.tensormap` $tmaDescriptor (`,` `predicate` `=` $predicate^)? attr-dict `:` type(operands)

Interfaces: BasicPtxBuilderInterface

Operands: ¶

Operand	Description
`tmaDescriptor`	LLVM pointer type
`predicate`	1-bit signless integer

`nvvm.rcp.approx.ftz.f` (NVVM::RcpApproxFtzF32Op) ¶

Syntax:

operation ::= `nvvm.rcp.approx.ftz.f` $arg attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: ¶

Operand	Description
`arg`	32-bit float

Results: ¶

Result	Description
`res`	32-bit float

`nvvm.read.ptx.sreg.clock` (NVVM::ClockOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.clock` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.clock64` (NVVM::Clock64Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.clock64` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.cluster.ctaid.x` (NVVM::BlockInClusterIdXOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.ctaid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait, NVVMRequiresSM<90>

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.cluster.ctaid.y` (NVVM::BlockInClusterIdYOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.ctaid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait, NVVMRequiresSM<90>

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.cluster.ctaid.z` (NVVM::BlockInClusterIdZOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.ctaid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait, NVVMRequiresSM<90>

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.cluster.ctarank` (NVVM::ClusterId) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.ctarank` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait, NVVMRequiresSM<90>

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.cluster.nctaid.x` (NVVM::ClusterDimBlocksXOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.nctaid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait, NVVMRequiresSM<90>

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.cluster.nctaid.y` (NVVM::ClusterDimBlocksYOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.nctaid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait, NVVMRequiresSM<90>

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.cluster.nctaid.z` (NVVM::ClusterDimBlocksZOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.nctaid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.cluster.nctarank` (NVVM::ClusterDim) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.cluster.nctarank` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.clusterid.x` (NVVM::ClusterIdXOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.clusterid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait, NVVMRequiresSM<90>

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.clusterid.y` (NVVM::ClusterIdYOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.clusterid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.clusterid.z` (NVVM::ClusterIdZOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.clusterid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.ctaid.x` (NVVM::BlockIdXOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.ctaid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.ctaid.y` (NVVM::BlockIdYOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.ctaid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.ctaid.z` (NVVM::BlockIdZOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.ctaid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg0` (NVVM::EnvReg0Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg0` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg1` (NVVM::EnvReg1Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg1` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg10` (NVVM::EnvReg10Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg10` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg11` (NVVM::EnvReg11Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg11` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg12` (NVVM::EnvReg12Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg12` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg13` (NVVM::EnvReg13Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg13` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg14` (NVVM::EnvReg14Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg14` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg15` (NVVM::EnvReg15Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg15` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg16` (NVVM::EnvReg16Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg16` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg17` (NVVM::EnvReg17Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg17` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg18` (NVVM::EnvReg18Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg18` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg19` (NVVM::EnvReg19Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg19` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg2` (NVVM::EnvReg2Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg2` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg20` (NVVM::EnvReg20Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg20` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg21` (NVVM::EnvReg21Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg21` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg22` (NVVM::EnvReg22Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg22` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg23` (NVVM::EnvReg23Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg23` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg24` (NVVM::EnvReg24Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg24` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg25` (NVVM::EnvReg25Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg25` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg26` (NVVM::EnvReg26Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg26` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg27` (NVVM::EnvReg27Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg27` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg28` (NVVM::EnvReg28Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg28` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg29` (NVVM::EnvReg29Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg29` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg3` (NVVM::EnvReg3Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg3` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg30` (NVVM::EnvReg30Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg30` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg31` (NVVM::EnvReg31Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg31` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg4` (NVVM::EnvReg4Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg4` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg5` (NVVM::EnvReg5Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg5` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg6` (NVVM::EnvReg6Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg6` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg7` (NVVM::EnvReg7Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg7` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg8` (NVVM::EnvReg8Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg8` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.envreg9` (NVVM::EnvReg9Op) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.envreg9` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.globaltimer` (NVVM::GlobalTimerOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.globaltimer` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.gridid` (NVVM::GridIdOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.gridid` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.laneid` (NVVM::LaneIdOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.laneid` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.lanemask.eq` (NVVM::LaneMaskEqOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.lanemask.eq` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.lanemask.ge` (NVVM::LaneMaskGeOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.lanemask.ge` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.lanemask.gt` (NVVM::LaneMaskGtOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.lanemask.gt` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.lanemask.le` (NVVM::LaneMaskLeOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.lanemask.le` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.lanemask.lt` (NVVM::LaneMaskLtOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.lanemask.lt` attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.nclusterid.x` (NVVM::ClusterDimXOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.nclusterid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.nclusterid.y` (NVVM::ClusterDimYOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.nclusterid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.nclusterid.z` (NVVM::ClusterDimZOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.nclusterid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.nctaid.x` (NVVM::GridDimXOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.nctaid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.nctaid.y` (NVVM::GridDimYOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.nctaid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.nctaid.z` (NVVM::GridDimZOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.nctaid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.nsmid` (NVVM::SmDimOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.nsmid` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.ntid.x` (NVVM::BlockDimXOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.ntid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.ntid.y` (NVVM::BlockDimYOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.ntid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.ntid.z` (NVVM::BlockDimZOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.ntid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.nwarpid` (NVVM::WarpDimOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.nwarpid` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.smid` (NVVM::SmIdOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.smid` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.tid.x` (NVVM::ThreadIdXOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.tid.x` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.tid.y` (NVVM::ThreadIdYOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.tid.y` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.tid.z` (NVVM::ThreadIdZOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.tid.z` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.warpid` (NVVM::WarpIdOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.warpid` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.read.ptx.sreg.warpsize` (NVVM::WarpSizeOp) ¶

Syntax:

operation ::= `nvvm.read.ptx.sreg.warpsize` (`range` $range^)? attr-dict `:` type($res)

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferIntRangeInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute MLIR Type Description

range

::mlir::LLVM::ConstantRangeAttr

A range of two integers, corresponding to LLVM's ConstantRange

A pair of two integers, mapping to the ConstantRange structure in LLVM IR,
which is allowed to wrap or be empty.
The range represented is [Lower, Upper), and is either signed or unsigned
depending on context.
lower and upper must have the same width.
Syntax:
`&lt;` `i`(width($lower)) $lower `,` $upper `&gt;`

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.redux.sync` (NVVM::ReduxOp) ¶

Redux Sync Op

Syntax:

operation ::= `nvvm.redux.sync` $kind $val `,` $mask_and_clamp  attr-dict `:` type($val) `->` type($res)

redux.sync performs a reduction operation kind of the 32 bit source register across all non-exited threads in the membermask.

The abs and nan attributes can be used in the case of f32 input type, where the abs attribute causes the absolute value of the input to be used in the reduction operation, and the nan attribute causes the reduction operation to return NaN if any of the inputs to participating threads are NaN.

For more information, see PTX ISA

Traits: NVVMRequiresSM<80>

Attributes: ¶

Attribute	MLIR Type	Description
`kind`	::mlir::NVVM::ReduxKindAttr	NVVM redux kind
`abs`	::mlir::BoolAttr	bool attribute
`nan`	::mlir::BoolAttr	bool attribute

Operands: ¶

Operand	Description
`val`	LLVM dialect-compatible type
`mask_and_clamp`	32-bit signless integer

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.setmaxregister` (NVVM::SetMaxRegisterOp) ¶

Syntax:

operation ::= `nvvm.setmaxregister` $action $regCount attr-dict

Attributes: ¶

Attribute	MLIR Type	Description
`regCount`	::mlir::IntegerAttr	32-bit signless integer attribute
`action`	::mlir::NVVM::SetMaxRegisterActionAttr	NVVM set max register action

`nvvm.shfl.sync` (NVVM::ShflOp) ¶

NVVM Dialect Op for shfl.sync

Syntax:

operation ::= `nvvm.shfl.sync` $kind $thread_mask `,` $val `,` $offset `,` $mask_and_clamp  attr-dict
              `:` type($val) `->` type($res)

The shfl.sync Op implements data shuffle within threads of a warp. The thread_mask denotes the threads participating in the Op where the bit position corresponds to a particular thread’s laneid. The offset specifies a source lane or source lane offset (depending on kind). The val is the input value to be copied from the source. The mask_and_clamp contains two packed values specifying a mask for logically splitting warps into sub-segments and an upper bound for clamping the source lane index.

For more information, see PTX ISA

Traits: NVVMRequiresSM<30>

Attributes: ¶

Attribute	MLIR Type	Description
`kind`	::mlir::NVVM::ShflKindAttr	NVVM shuffle kind
`return_value_and_is_valid`	::mlir::UnitAttr	unit attribute

Operands: ¶

Operand	Description
`thread_mask`	32-bit signless integer
`val`	LLVM dialect-compatible type
`offset`	32-bit signless integer
`mask_and_clamp`	32-bit signless integer

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`nvvm.st.bulk` (NVVM::BulkStoreOp) ¶

Bulk Store Op

Syntax:

operation ::= `nvvm.st.bulk` $addr `,` `size` `=` $size (`,` `init` `=` $initVal^)? attr-dict `:` type($addr)

Initializes a region of shared memory at the address given by addr. The size operand specifies the number of bytes to initialize and must be a multiple of 8. The initVal operand specifies the value to initialize the memory to. The only supported value is 0.

For more information, see PTX ISA

Attributes: ¶

Attribute	MLIR Type	Description
`initVal`	::mlir::IntegerAttr	64-bit signless integer attribute

Operands: ¶

Operand	Description
`addr`	LLVM pointer in address space 0 or LLVM pointer in address space 3
`size`	64-bit signless integer

`nvvm.stmatrix` (NVVM::StMatrixOp) ¶

Cooperative matrix store

Syntax:

operation ::= `nvvm.stmatrix` $ptr `,` $sources attr-dict `:` type(operands)

Collectively store one or more matrices across all threads in a warp to the location indicated by the address operand $ptr in shared memory.

For more information, see PTX ISA

Interfaces: BasicPtxBuilderInterface

Attributes: ¶

Attribute	MLIR Type	Description
`layout`	::mlir::NVVM::MMALayoutAttr	NVVM MMA layout

Operands: ¶

Operand	Description
`ptr`	LLVM pointer in address space 3
`sources`	variadic of 32-bit signless integer

`nvvm.tcgen05.alloc` (NVVM::Tcgen05AllocOp) ¶

Tcgen05 alloc operation

Syntax:

operation ::= `nvvm.tcgen05.alloc` $addr `,` $nCols attr-dict `:` type(operands)

The tcgen05.alloc Op allocates tensor core memory for the amount specified by nCols and writes the destination address to the addr argument. The nCols operand specifies the number of columns to be allocated and it must be a power-of-two. For more information, see PTX ISA

Traits: NVVMRequiresSMa<100,101>

Attributes: ¶

Attribute	MLIR Type	Description
`group`	::mlir::NVVM::Tcgen05GroupKindAttr	NVVM Tcgen05 group kind

Operands: ¶

Operand	Description
`addr`	LLVM pointer type or LLVM pointer in address space 3
`nCols`	32-bit signless integer

`nvvm.tcgen05.commit` (NVVM::Tcgen05CommitOp) ¶

Tcgen05 commit operations

Syntax:

operation ::= `nvvm.tcgen05.commit` $addr (`,` `multicast_mask` `=` $multicastMask^)?
              attr-dict `:` type(operands)

The tcgen05.commit makes the mbarrier object, specified by the operand addr, track the completion of all the prior async-tcgen05 operations initiated by the executing thread. The multicast variants allow signaling on the mbarrier objects of multiple CTAs within the cluster. Operand multicastMask, when present, specifies the destination CTAs in the cluster such that each bit position in the 16-bit multicastMask operand corresponds to the nvvm.read.ptx.sreg.ctaid of the destination CTA. For more information, see PTX ISA

Traits: NVVMRequiresSMa<100,101>

Attributes: ¶

Attribute	MLIR Type	Description
`group`	::mlir::NVVM::Tcgen05GroupKindAttr	NVVM Tcgen05 group kind

Operands: ¶

Operand	Description
`addr`	LLVM pointer type or LLVM pointer in address space 3
`multicastMask`	16-bit signless integer

`nvvm.tcgen05.cp` (NVVM::Tcgen05CpOp) ¶

Tcgen05 copy operation

Syntax:

operation ::= `nvvm.tcgen05.cp` $taddr`,` $smem_desc attr-dict

Instruction tcgen05.cp initiates an asynchronous copy operation from shared memory to the location specified by the address operand taddr in the Tensor Memory. The 64-bit register operand smem_desc specifies the matrix descriptor representing the source matrix in the shared memory that needs to be copied.

Example:

  nvvm.tcgen05.cp %taddr, %smem_desc {
    group = #nvvm.tcgen05_group<cta_2>,
    shape = #nvvm.tcgen05_cp_shape<shape_64x128b>,
    multicast = #nvvm.tcgen05_cp_multicast<warpx2_01_23>,
    srcFormat = #nvvm.tcgen05_cp_src_fmt<b6x16_p32>
  }

For more information, see PTX ISA

Traits: NVVMRequiresSMa<100,101>

Attributes: ¶

Attribute	MLIR Type	Description
`shape`	::mlir::NVVM::Tcgen05CpShapeAttr	tcgen05 cp shapes
`group`	::mlir::NVVM::Tcgen05GroupKindAttr	NVVM Tcgen05 group kind
`multicast`	::mlir::NVVM::Tcgen05CpMulticastAttr	tcgen05 cp multicast
`srcFormat`	::mlir::NVVM::Tcgen05CpSrcFormatAttr	tcgen05 cp source format

Operands: ¶

Operand	Description
`taddr`	LLVM pointer in address space 6
`smem_desc`	64-bit signless integer

`nvvm.tcgen05.dealloc` (NVVM::Tcgen05DeallocOp) ¶

Tcgen05 dealloc operation

Syntax:

operation ::= `nvvm.tcgen05.dealloc` $taddr `,` $nCols attr-dict `:` type(operands)

The tcgen05.dealloc Op de-allocates the tensor core memory specified by tmemAddr, which must be from a previous tensor memory allocation. The nCols operand specifies the number of columns to be de-allocated, and it must be a power-of-two. For more information, see PTX ISA

Traits: NVVMRequiresSMa<100,101>

Attributes: ¶

Attribute	MLIR Type	Description
`group`	::mlir::NVVM::Tcgen05GroupKindAttr	NVVM Tcgen05 group kind

Operands: ¶

Operand	Description
`taddr`	LLVM pointer in address space 6
`nCols`	32-bit signless integer

`nvvm.tcgen05.fence` (NVVM::Tcgen05FenceOp) ¶

Tcgen05 fence operations

Syntax:

operation ::= `nvvm.tcgen05.fence` $kind attr-dict

The tcgen05.fence<before> orders all prior async tcgen05 operations with respect to the subsequent tcgen05 and execution ordering operations. The tcgen05.fence<after> orders all subsequent async tcgen05 operations with respect to the prior tcgen05 and execution ordering operations.

For more information, see PTX ISA

Traits: NVVMRequiresSMa<100,101>

Attributes: ¶

Attribute	MLIR Type	Description
`kind`	::mlir::NVVM::Tcgen05FenceKindAttr	NVVM Tcgen05 fence kind

`nvvm.tcgen05.ld` (NVVM::Tcgen05LdOp) ¶

Tensor memory load instructions

Syntax:

operation ::= `nvvm.tcgen05.ld` $tmemAddr (`,` $offset^)? (`pack` $pack^)? attr-dict `:` type($res)

Instruction tcgen05.ld asynchronously loads data from the Tensor Memory at the location specified by the 32-bit address operand tmemAddr into the destination register res, collectively across all threads of the warps.

The shape and the num attribute together determines the total dimension of the data which is loaded from the Tensor Memory. The shape attribute indicates the base dimension of data to be accessed as described in the Data Movement Shape. The num attribute indicates the repeat factor on the base dimension resulting in the total dimension of the data that is accessed.

The shape 16x32bx2 performs two accesses into Tensor Memory of the shape 16x32b. The base address of the first access is specified by tmemAddr and the base address of the second access is specified by tmemAddr + offset, where offset is an immediate argument.

The unit attribute pack can be used to pack two 16-bit elements from adjacent columns into a single 32-bit element during the load.

The following table describes the size of the vector for various combinations of num and shape attributes:

|=====================================================================|
| num/shape      |     16x32bx2/16x64b/32x32b |  16x128b   | 16x256b  |
|=====================================================================|
| x1             |          1                 |    2       |    4     |
| x2             |          2                 |    4       |    8     |
| x4             |          4                 |    8       |    16    |
| x8             |          8                 |    16      |    32    |
| x16            |          16                |    32      |    64    |
| x32            |          32                |    64      |    128   |
| x64            |          64                |    128     |    NA    |
| x128           |          128               |    NA      |    NA    |
|=====================================================================|

Example:

  nvvm.tcgen05.ld %tmemAddr, %offset pack {
    shape = #nvvm.tcgen05_ldst_shape<shape_16x32bx2>,
  } : <2xi32>

For more information, see PTX ISA

Traits: NVVMRequiresSMa<100,101>

Attributes: ¶

Attribute	MLIR Type	Description
`pack`	::mlir::UnitAttr	unit attribute
`shape`	::mlir::NVVM::Tcgen05LdStShapeAttr	allowed 32-bit signless integer cases: 0, 1, 2, 3, 4

Operands: ¶

Operand	Description
`tmemAddr`	LLVM pointer in address space 6
`offset`	64-bit signless integer

Results: ¶

Result	Description
`res`	32-bit signless integer or vector of 32-bit signless integer values of length 2/4/8/16/32/64/128

`nvvm.tcgen05.mma_smem_desc` (NVVM::Tcgen05MmaSmemDescOp) ¶

Constructs a Shared Memory descriptor for MMA Operands A or B

Syntax:

operation ::= `nvvm.tcgen05.mma_smem_desc` `(` operands `)` attr-dict `:` `(` type(operands) `)` `->` type($res)

The nvvm.tcgen05_mma_smem_desc constructs a Shared Memory descriptor for tcgen05.mma. This descriptor is a 64-bit value which describes the properties of multiplicand matrix in shared memory including its location in the shared memory of the current CTA.

+-----------+------+------------------------------------------------------+
| Bit-field | Size | Description                                          |
+-----------+------+------------------------------------------------------+
| 0-13      | 14   | Matrix start address                                 |
| 14-15     | 2    | Reserved                                             |
| 16-29     | 14   | Leading dim relative-offset (or) absolute-address    |
| 30-31     | 2    | Reserved                                             |
| 32-45     | 14   | Stride dimension byte offset                         |
| 46-48     | 3    | Fixed constant value of 0b001                        |
| 49-51     | 3    | Matrix base offset                                   |
| 52        | 1    | Leading dimension stride mode:                       |
|           |      |   0: byte offset relative                            |
|           |      |   1: byte address absolute                           |
| 53-60     | 8    | Fixed constant value of 0xb00000000                  |
| 61-63     | 3    | Swizzling mode:                                      |
|           |      |   0: No swizzling                                    |
|           |      |   1: 128-Byte with 32B atomic swizzling              |
|           |      |   2: 128-Byte swizzling                              |
|           |      |   4: 64-Byte swizzling                               |
|           |      |   6: 32-Byte swizzling                               |
|           |      |   (Values 3, 5 and 7 are invalid)                    |
+-----------+------+------------------------------------------------------+

Example:

  %desc = nvvm.tcgen05.mma_smem_desc (%startAddr, %leadingDimOffset, %strideDimOffset,
                                      %baseOffset, %leadingDimMode, %swizzleMode) : (i32, i32, i32, i8, i1, i8) -> i64

For more information, see PTX ISA

Operands: ¶

Operand	Description
`startAddr`	32-bit signless integer
`leadingDimOffset`	32-bit signless integer
`strideDimOffset`	32-bit signless integer
`baseOffset`	8-bit signless integer
`leadingDimMode`	1-bit signless integer
`swizzleMode`	8-bit signless integer

Results: ¶

Result	Description
`res`	64-bit signless integer

`nvvm.tcgen05.relinquish_alloc_permit` (NVVM::Tcgen05RelinquishAllocPermitOp) ¶

Tcgen05 Op to relinquish the right to allocate

Syntax:

operation ::= `nvvm.tcgen05.relinquish_alloc_permit` attr-dict

The tcgen05.relinquish_alloc_permit Op specifies that the CTA of the executing thread is relinquishing the right to allocate Tensor Memory. So, it is illegal for a CTA to perform tcgen05.alloc after any of its constituent threads execute tcgen05.relinquish_alloc_permit. For more information, see PTX ISA

Traits: NVVMRequiresSMa<100,101>

Attributes: ¶

Attribute	MLIR Type	Description
`group`	::mlir::NVVM::Tcgen05GroupKindAttr	NVVM Tcgen05 group kind

`nvvm.tcgen05.shift` (NVVM::Tcgen05ShiftOp) ¶

Tcgen05 shift operation

Syntax:

operation ::= `nvvm.tcgen05.shift` $taddr attr-dict `:` type(operands)

The tcgen05.shift is an asynchronous instruction which initiates the shifting of 32-byte elements downwards across all the rows, except the last, by one row. The operand taddr specifies the base address of the matrix in Tensor Memory whose rows must be down shifted.

For more information, see PTX ISA

Traits: NVVMRequiresSMa<100,101,103>

Attributes: ¶

Attribute	MLIR Type	Description
`group`	::mlir::NVVM::Tcgen05GroupKindAttr	NVVM Tcgen05 group kind

Operands: ¶

Operand	Description
`taddr`	LLVM pointer in address space 6

`nvvm.tcgen05.st` (NVVM::Tcgen05StOp) ¶

Tensor memory store instructions

Syntax:

operation ::= `nvvm.tcgen05.st` $tmemAddr `,` $val (`,` $offset^)? (`unpack` $unpack^)? attr-dict `:` type($val)

Instruction tcgen05.st asynchronously stores data from the source register r into the Tensor Memory at the location specified by the 32-bit address operand tmemAddr, collectively across all threads of the warps.

The shape and the num attribute together determines the total dimension of the data which is stored to the Tensor Memory. The shape indicates the base dimension of data to be accessed. The num attribute indicates the repeat factor on the base dimension resulting in the total dimension of the data that is accessed.

The unit attribute unpack can be used to unpack a 32-bit element in the register into two 16-bit elements and store them in adjacent columns.

The following table describes the size of the vector for various combinations of num and shape attributes:

|=====================================================================|
| num/shape      |     16x32bx2/16x64b/32x32b |  16x128b   | 16x256b  |
|=====================================================================|
| x1             |          1                 |    2       |    4     |
| x2             |          2                 |    4       |    8     |
| x4             |          4                 |    8       |    16    |
| x8             |          8                 |    16      |    32    |
| x16            |          16                |    32      |    64    |
| x32            |          32                |    64      |    128   |
| x64            |          64                |    128     |    NA    |
| x128           |          128               |    NA      |    NA    |
|=====================================================================|

Example:

  nvvm.tcgen05.st %tmemAddr, %val, %offset unpack {
    shape = #nvvm.tcgen05_ldst_shape<shape_16x32bx2>,
  } : <2xi32>

For more information, see PTX ISA

Traits: NVVMRequiresSMa<100,101>

Attributes: ¶

Attribute	MLIR Type	Description
`unpack`	::mlir::UnitAttr	unit attribute
`shape`	::mlir::NVVM::Tcgen05LdStShapeAttr	allowed 32-bit signless integer cases: 0, 1, 2, 3, 4

Operands: ¶

Operand	Description
`tmemAddr`	LLVM pointer in address space 6
`val`	32-bit signless integer or vector of 32-bit signless integer values of length 2/4/8/16/32/64/128
`offset`	64-bit signless integer

`nvvm.tcgen05.wait` (NVVM::Tcgen05WaitOp) ¶

Tcgen05 wait operations

Syntax:

operation ::= `nvvm.tcgen05.wait` $kind attr-dict

The tcgen05.wait<load> causes the executing thread to block until all prior tcgen05.ld operations issued by the executing thread have completed. Similarly, the tcgen05.wait<store> causes the executing thread to block until all prior tcgen05.st operations issued by the executing thread have completed. For more information, see PTX ISA

Traits: NVVMRequiresSMa<100,101>

Attributes: ¶

Attribute	MLIR Type	Description
`kind`	::mlir::NVVM::Tcgen05WaitKindAttr	NVVM Tcgen05 wait kind

`nvvm.vote.sync` (NVVM::VoteSyncOp) ¶

Vote across thread group

Syntax:

operation ::= `nvvm.vote.sync` $kind $mask `,` $pred attr-dict `->` type($res)

The vote.sync op will cause executing thread to wait until all non-exited threads corresponding to membermask have executed vote.sync with the same qualifiers and same membermask value before resuming execution.

The vote operation kinds are:

any: True if source predicate is True for some thread in membermask.
all: True if source predicate is True for all non-exited threads in membermask.
uni: True if source predicate has the same value in all non-exited threads in membermask.
ballot: In the ballot form, the destination result is a 32 bit integer. In this form, the predicate from each thread in membermask are copied into the corresponding bit position of the result, where the bit position corresponds to the thread’s lane id.

For more information, see PTX ISA

Attributes: ¶

Attribute	MLIR Type	Description
`kind`	::mlir::NVVM::VoteSyncKindAttr	NVVM vote sync kind

Operands: ¶

Operand	Description
`mask`	32-bit signless integer
`pred`	1-bit signless integer

Results: ¶

Result	Description
`res`	32-bit signless integer or 1-bit signless integer

`nvvm.wgmma.commit.group.sync.aligned` (NVVM::WgmmaGroupSyncAlignedOp) ¶

Syntax:

operation ::= `nvvm.wgmma.commit.group.sync.aligned` attr-dict

Commits all prior uncommitted warpgroup level matrix multiplication operations.

For more information, see PTX ISA

Traits: NVVMRequiresSMa<90>

`nvvm.wgmma.fence.aligned` (NVVM::WgmmaFenceAlignedOp) ¶

Syntax:

operation ::= `nvvm.wgmma.fence.aligned` attr-dict

Enforce an ordering of register accesses between warpgroup level matrix multiplication and other operations.

For more information, see PTX ISA

Traits: NVVMRequiresSMa<90>

`nvvm.wgmma.mma_async` (NVVM::WgmmaMmaAsyncOp) ¶

Syntax:

operation ::= `nvvm.wgmma.mma_async` $descriptorA `,` $descriptorB `,` $inouts `,` $shape `,`
              `D` `[` $typeD `,` $scaleD (`,` $satfinite^)? `]` `,`
              `A` `[` $typeA `,` $scaleA `,` $layoutA `]` `,`
              `B` `[` $typeB `,` $scaleB `,` $layoutB `]`
              attr-dict `:`
              type($inouts) `->` type($results)

The warpgroup (128 threads) level matrix multiply and accumulate operation has either of the following forms, where matrix D is called accumulator: D = A * B + D D = A * B, where the input from accumulator D is disabled.

Supported shapes:

|--------------|--------------|------------|--------------|---------------|
|              |              |            |              |f16+=e4m3*e4m3 |
|              |              |            |              |f16+=e5m2*e5m2 |
|f32+=tf32*tf32|f16+=f16 *f16 | s32+=s8*s8 |s32 += b1 * b1|f16+=e5m2*e4m3 |
|              |f32+=f16 *f16 | s32+=u8*u8 |              |f16+=e4m3*e5m2 |
|              |f32+=bf16*bf16| s32+=u8*u8 |              |f16+=e4m3*e5m2 |
|              |f32+=bf16*bf16| s32+=s8*u8 |              |f32+=e4m3*e4m3 |
|              |              | s32+=u8*s8 |              |f32+=e5m2*e5m2 |
|              |              |            |              |f32+=e4m3*e5m2 |
|              |              |            |              |f32+=e4m3*e5m2 |
|--------------|--------------|------------|--------------|---------------|
|   .m64n8k8   |  .m64n8k16   | .m64n8k32  | .m64n8k256   | .m64n8k32     |
|   .m64n16k8  |  .m64n16k16  | .m64n16k32 | .m64n16k256  | .m64n16k32    |
|   .m64n24k8  |  .m64n24k16  | .m64n24k32 | .m64n24k256  | .m64n24k32    |
|   .m64n32k8  |  .m64n32k16  | .m64n32k32 | .m64n32k256  | .m64n32k32    |
|   .m64n40k8  |  .m64n40k16  | .m64n48k32 | .m64n48k256  | .m64n40k32    |
|   .m64n48k8  |  .m64n48k16  | .m64n64k32 | .m64n64k256  | .m64n48k32    |
|   .m64n56k8  |  .m64n56k16  | .m64n80k32 | .m64n80k256  | .m64n56k32    |
|   .m64n64k8  |  .m64n64k16  | .m64n96k32 | .m64n96k256  | .m64n64k32    |
|   .m64n72k8  |  .m64n72k16  | .m64n112k32| .m64n112k256 | .m64n72k32    |
|   .m64n80k8  |  .m64n80k16  | .m64n128k32| .m64n128k256 | .m64n80k32    |
|   .m64n88k8  |  .m64n88k16  | .m64n144k32| .m64n144k256 | .m64n88k32    |
|   .m64n96k8  |  .m64n96k16  | .m64n160k32| .m64n160k256 | .m64n96k32    |
|   .m64n104k8 |  .m64n104k16 | .m64n176k32| .m64n176k256 | .m64n104k32   |
|   .m64n112k8 |  .m64n112k16 | .m64n192k32| .m64n192k256 | .m64n112k32   |
|   .m64n120k8 |  .m64n120k16 | .m64n208k32| .m64n208k256 | .m64n120k32   |
|   .m64n128k8 |  .m64n128k16 | .m64n224k32| .m64n224k256 | .m64n128k32   |
|   .m64n136k8 |  .m64n136k16 | .m64n240k32| .m64n240k256 | .m64n136k32   |
|   .m64n144k8 |  .m64n144k16 | .m64n256k32| .m64n256k256 | .m64n144k32   |
|   .m64n152k8 |  .m64n152k16 |            |              | .m64n152k32   |
|   .m64n160k8 |  .m64n160k16 |            |              | .m64n160k32   |
|   .m64n168k8 |  .m64n168k16 |            |              | .m64n168k32   |
|   .m64n176k8 |  .m64n176k16 |            |              | .m64n176k32   |
|   .m64n184k8 |  .m64n184k16 |            |              | .m64n184k32   |
|   .m64n192k8 |  .m64n192k16 |            |              | .m64n192k32   |
|   .m64n200k8 |  .m64n200k16 |            |              | .m64n200k32   |
|   .m64n208k8 |  .m64n208k16 |            |              | .m64n208k32   |
|   .m64n216k8 |  .m64n216k16 |            |              | .m64n216k32   |
|   .m64n224k8 |  .m64n224k16 |            |              | .m64n224k32   |
|   .m64n232k8 |  .m64n232k16 |            |              | .m64n232k32   |
|   .m64n240k8 |  .m64n240k16 |            |              | .m64n240k32   |
|   .m64n248k8 |  .m64n248k16 |            |              | .m64n248k32   |
|   .m64n256k8 |  .m64n256k16 |            |              | .m64n256k32   |
|--------------|--------------|------------|--------------|---------------|

For more information, see PTX ISA

Interfaces: BasicPtxBuilderInterface

Attributes: ¶

Attribute	MLIR Type	Description
`shape`	::mlir::NVVM::MMAShapeAttr	Attribute for MMA operation shape.
`typeA`	::mlir::NVVM::WGMMATypesAttr	NVVM WGMMA types
`typeB`	::mlir::NVVM::WGMMATypesAttr	NVVM WGMMA types
`typeD`	::mlir::NVVM::WGMMATypesAttr	NVVM WGMMA types
`scaleD`	::mlir::NVVM::WGMMAScaleOutAttr	WGMMA input predicate
`scaleA`	::mlir::NVVM::WGMMAScaleInAttr	WGMMA overflow options
`scaleB`	::mlir::NVVM::WGMMAScaleInAttr	WGMMA overflow options
`layoutA`	::mlir::NVVM::MMALayoutAttr	NVVM MMA layout
`layoutB`	::mlir::NVVM::MMALayoutAttr	NVVM MMA layout
`satfinite`	::mlir::NVVM::MMAIntOverflowAttr	MMA overflow options

Operands: ¶

Operand	Description
`inouts`	LLVM structure type
`descriptorA`	64-bit signless integer
`descriptorB`	64-bit signless integer

Results: ¶

Result	Description
`results`	LLVM structure type

`nvvm.wgmma.wait.group.sync.aligned` (NVVM::WgmmaWaitGroupSyncOp) ¶

Syntax:

operation ::= `nvvm.wgmma.wait.group.sync.aligned` attr-dict $group

Signal the completion of a preceding warpgroup operation.

For more information, see PTX ISA

Traits: NVVMRequiresSMa<90>

Attributes: ¶

Attribute	MLIR Type	Description
`group`	::mlir::IntegerAttr	64-bit signless integer attribute

`nvvm.wmma.load` (NVVM::WMMALoadOp) ¶

Warp synchronous matrix load

Syntax:

operation ::= `nvvm.wmma.load` $ptr `,` $stride attr-dict `:` functional-type($ptr, $res)

Attributes: ¶

Attribute	MLIR Type	Description
`m`	::mlir::IntegerAttr	32-bit signless integer attribute
`n`	::mlir::IntegerAttr	32-bit signless integer attribute
`k`	::mlir::IntegerAttr	32-bit signless integer attribute
`layout`	::mlir::NVVM::MMALayoutAttr	NVVM MMA layout
`eltype`	::mlir::NVVM::MMATypesAttr	NVVM MMA types
`frag`	::mlir::NVVM::MMAFragAttr	NVVM MMA frag type

Operands: ¶

Operand	Description
`ptr`	LLVM pointer type
`stride`	32-bit signless integer

Results: ¶

Result	Description
`res`	LLVM structure type

`nvvm.wmma.mma` (NVVM::WMMAMmaOp) ¶

Warp synchronous matrix-multiply accumulate using tensor cores.

Syntax:

operation ::= `nvvm.wmma.mma` $args attr-dict `:` functional-type($args, $res)

Attributes: ¶

Attribute	MLIR Type	Description
`m`	::mlir::IntegerAttr	32-bit signless integer attribute
`n`	::mlir::IntegerAttr	32-bit signless integer attribute
`k`	::mlir::IntegerAttr	32-bit signless integer attribute
`layoutA`	::mlir::NVVM::MMALayoutAttr	NVVM MMA layout
`layoutB`	::mlir::NVVM::MMALayoutAttr	NVVM MMA layout
`eltypeA`	::mlir::NVVM::MMATypesAttr	NVVM MMA types
`eltypeB`	::mlir::NVVM::MMATypesAttr	NVVM MMA types

Operands: ¶

Operand	Description
`args`	variadic of LLVM dialect-compatible type

Results: ¶

Result	Description
`res`	LLVM structure type

`nvvm.wmma.store` (NVVM::WMMAStoreOp) ¶

Warp synchronous matrix store

Syntax:

operation ::= `nvvm.wmma.store` $ptr `,` $stride `,` $args attr-dict `:` qualified(type($ptr)) `,`
              type($args)

Attributes: ¶

Attribute	MLIR Type	Description
`m`	::mlir::IntegerAttr	32-bit signless integer attribute
`n`	::mlir::IntegerAttr	32-bit signless integer attribute
`k`	::mlir::IntegerAttr	32-bit signless integer attribute
`layout`	::mlir::NVVM::MMALayoutAttr	NVVM MMA layout
`eltype`	::mlir::NVVM::MMATypesAttr	NVVM MMA types

Operands: ¶

Operand	Description
`ptr`	LLVM pointer type
`args`	variadic of LLVM dialect-compatible type
`stride`	32-bit signless integer

Attributes ¶

CacheEvictionPriorityAttr ¶

NVVM Cache Eviction Priority

Syntax:

#nvvm.cache_eviction_priority<
  ::mlir::NVVM::CacheEvictionPriority   # value
>

Enum cases:

evict_normal (EvictNormal)
evict_first (EvictFirst)
evict_last (EvictLast)
evict_unchanged (EvictUnchanged)
no_allocate (NoAllocate)

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::CacheEvictionPriority`	an enum of type CacheEvictionPriority

ConvertFP6TypeAttr ¶

NVVM ConvertFP6Type kind

Syntax:

#nvvm.convert_fp6_type<
  ::mlir::NVVM::ConvertFP6Type   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::ConvertFP6Type`	an enum of type ConvertFP6Type

ConvertFP8TypeAttr ¶

NVVM ConvertFP8Type kind

Syntax:

#nvvm.convert_fp8_type<
  ::mlir::NVVM::ConvertFP8Type   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::ConvertFP8Type`	an enum of type ConvertFP8Type

DotAccumulateTypeAttr ¶

NVVM DotAccumulateType

Syntax:

#nvvm.dot_accumulate_type<
  ::mlir::NVVM::DotAccumulateType   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::DotAccumulateType`	an enum of type DotAccumulateType

FPRoundingModeAttr ¶

NVVM FPRoundingMode kind

Syntax:

#nvvm.fp_rnd_mode<
  ::mlir::NVVM::FPRoundingMode   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::FPRoundingMode`	an enum of type FPRoundingMode

LoadCacheModifierKindAttr ¶

NVVM load cache modifier kind

Syntax:

#nvvm.load_cache_modifier<
  ::mlir::NVVM::LoadCacheModifierKind   # value
>

Enum attribute of the different kinds of cache operators for load instructions.

For more information, see PTX ISA

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::LoadCacheModifierKind`	an enum of type LoadCacheModifierKind

MMAB1OpAttr ¶

MMA binary operations

Syntax:

#nvvm.mma_b1op<
  ::mlir::NVVM::MMAB1Op   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::MMAB1Op`	an enum of type MMAB1Op

MMAFragAttr ¶

NVVM MMA frag type

Syntax:

#nvvm.mma_frag<
  ::mlir::NVVM::MMAFrag   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::MMAFrag`	an enum of type MMAFrag

MMAIntOverflowAttr ¶

MMA overflow options

Syntax:

#nvvm.mma_int_overflow<
  ::mlir::NVVM::MMAIntOverflow   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::MMAIntOverflow`	an enum of type MMAIntOverflow

MMALayoutAttr ¶

NVVM MMA layout

Syntax:

#nvvm.mma_layout<
  ::mlir::NVVM::MMALayout   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::MMALayout`	an enum of type MMALayout

MMATypesAttr ¶

NVVM MMA types

Syntax:

#nvvm.mma_type<
  ::mlir::NVVM::MMATypes   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::MMATypes`	an enum of type MMATypes

MatchSyncKindAttr ¶

NVVM match sync kind

Syntax:

#nvvm.match_sync_kind<
  ::mlir::NVVM::MatchSyncKind   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::MatchSyncKind`	an enum of type MatchSyncKind

MemScopeKindAttr ¶

NVVM Memory Scope kind

Syntax:

#nvvm.mem_scope<
  ::mlir::NVVM::MemScopeKind   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::MemScopeKind`	an enum of type MemScopeKind

MMAShapeAttr ¶

Attribute for MMA operation shape.

Syntax:

#nvvm.shape<
  int,   # m
  int,   # n
  int   # k
>

Parameters: ¶

Parameter	C++ type	Description
m	`int`
n	`int`
k	`int`

NVVMTargetAttr ¶

Syntax:

#nvvm.target<
  int,   # O
  ::llvm::StringRef,   # triple
  ::llvm::StringRef,   # chip
  ::llvm::StringRef,   # features
  DictionaryAttr,   # flags
  ArrayAttr,   # link
  bool   # verifyTarget
>

GPU target attribute for controlling compilation of NVIDIA targets. All parameters decay into default values if not present.

Examples:

Target with default values.

  gpu.module @mymodule [#nvvm.target] attributes {...} {
    ...
  }

Target with sm_90 chip and fast math.

  gpu.module @mymodule [#nvvm.target<chip = "sm_90", flags = {fast}>] {
    ...
  }

Parameters: ¶

Parameter	C++ type	Description
O	`int`	Optimization level to apply.
triple	`::llvm::StringRef`	Target triple.
chip	`::llvm::StringRef`	Target chip.
features	`::llvm::StringRef`	Target chip features.
flags	`DictionaryAttr`	Target specific flags.
link	`ArrayAttr`	Files to link to the LLVM module.
verifyTarget	`bool`	Perform SM version check on Ops.

PrefetchCacheLevelAttr ¶

NVVM Prefetch Cache Level

Syntax:

#nvvm.prefetch_cache_level<
  ::mlir::NVVM::PrefetchCacheLevel   # value
>

Enum cases:

L1 (L1)
L2 (L2)

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::PrefetchCacheLevel`	an enum of type PrefetchCacheLevel

ProxyKindAttr ¶

Proxy kind

Syntax:

#nvvm.proxy_kind<
  ::mlir::NVVM::ProxyKind   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::ProxyKind`	an enum of type ProxyKind

ReduxKindAttr ¶

NVVM redux kind

Syntax:

#nvvm.redux_kind<
  ::mlir::NVVM::ReduxKind   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::ReduxKind`	an enum of type ReduxKind

SaturationModeAttr ¶

NVVM SaturationMode kind

Syntax:

#nvvm.sat_mode<
  ::mlir::NVVM::SaturationMode   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::SaturationMode`	an enum of type SaturationMode

SetMaxRegisterActionAttr ¶

NVVM set max register action

Syntax:

#nvvm.action<
  ::mlir::NVVM::SetMaxRegisterAction   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::SetMaxRegisterAction`	an enum of type SetMaxRegisterAction

SharedSpaceAttr ¶

Shared memory space

Syntax:

#nvvm.shared_space<
  ::mlir::NVVM::SharedSpace   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::SharedSpace`	an enum of type SharedSpace

ShflKindAttr ¶

NVVM shuffle kind

Syntax:

#nvvm.shfl_kind<
  ::mlir::NVVM::ShflKind   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::ShflKind`	an enum of type ShflKind

TMAReduxKindAttr ¶

NVVM TMA redux kind

Syntax:

#nvvm.tma_redux_kind<
  ::mlir::NVVM::TMAReduxKind   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::TMAReduxKind`	an enum of type TMAReduxKind

TMAStoreModeAttr ¶

NVVM TMA Store Mode

Syntax:

#nvvm.tma_store_mode<
  ::mlir::NVVM::TMAStoreMode   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::TMAStoreMode`	an enum of type TMAStoreMode

Tcgen05CpMulticastAttr ¶

Tcgen05 cp multicast

Syntax:

#nvvm.tcgen05_cp_multicast<
  ::mlir::NVVM::Tcgen05CpMulticast   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::Tcgen05CpMulticast`	an enum of type Tcgen05CpMulticast

Tcgen05CpShapeAttr ¶

Tcgen05 cp shapes

Syntax:

#nvvm.tcgen05_cp_shape<
  ::mlir::NVVM::Tcgen05CpShape   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::Tcgen05CpShape`	an enum of type Tcgen05CpShape

Tcgen05CpSrcFormatAttr ¶

Tcgen05 cp source format

Syntax:

#nvvm.tcgen05_cp_src_fmt<
  ::mlir::NVVM::Tcgen05CpSrcFormat   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::Tcgen05CpSrcFormat`	an enum of type Tcgen05CpSrcFormat

Tcgen05FenceKindAttr ¶

NVVM Tcgen05 fence kind

Syntax:

#nvvm.tcgen05_fence<
  ::mlir::NVVM::Tcgen05FenceKind   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::Tcgen05FenceKind`	an enum of type Tcgen05FenceKind

Tcgen05GroupKindAttr ¶

NVVM Tcgen05 group kind

Syntax:

#nvvm.tcgen05_group<
  ::mlir::NVVM::Tcgen05GroupKind   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::Tcgen05GroupKind`	an enum of type Tcgen05GroupKind

Tcgen05LdStShapeAttr ¶

Allowed 32-bit signless integer cases: 0, 1, 2, 3, 4

Syntax:

#nvvm.tcgen05_ldst_shape<
  ::mlir::NVVM::Tcgen05LdStShape   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::Tcgen05LdStShape`	an enum of type Tcgen05LdStShape

Tcgen05WaitKindAttr ¶

NVVM Tcgen05 wait kind

Syntax:

#nvvm.tcgen05_wait<
  ::mlir::NVVM::Tcgen05WaitKind   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::Tcgen05WaitKind`	an enum of type Tcgen05WaitKind

VoteSyncKindAttr ¶

NVVM vote sync kind

Syntax:

#nvvm.vote_sync_kind<
  ::mlir::NVVM::VoteSyncKind   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::VoteSyncKind`	an enum of type VoteSyncKind

WGMMAScaleInAttr ¶

WGMMA overflow options

Syntax:

#nvvm.wgmma_scale_in<
  ::mlir::NVVM::WGMMAScaleIn   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::WGMMAScaleIn`	an enum of type WGMMAScaleIn

WGMMAScaleOutAttr ¶

WGMMA input predicate

Syntax:

#nvvm.wgmma_scale_out<
  ::mlir::NVVM::WGMMAScaleOut   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::WGMMAScaleOut`	an enum of type WGMMAScaleOut

WGMMATypesAttr ¶

NVVM WGMMA types

Syntax:

#nvvm.wgmma_type<
  ::mlir::NVVM::WGMMATypes   # value
>

Parameters: ¶

Parameter	C++ type	Description
value	`::mlir::NVVM::WGMMATypes`	an enum of type WGMMATypes

Enums ¶

CacheEvictionPriority ¶

NVVM Cache Eviction Priority

Cases: ¶

Symbol	Value	String
EvictNormal	`0`	evict_normal
EvictFirst	`1`	evict_first
EvictLast	`2`	evict_last
EvictUnchanged	`3`	evict_unchanged
NoAllocate	`4`	no_allocate

ConvertFP6Type ¶

NVVM ConvertFP6Type kind

Cases: ¶

Symbol	Value	String
E2M3	`0`	e2m3
E3M2	`1`	e3m2

ConvertFP8Type ¶

NVVM ConvertFP8Type kind

Cases: ¶

Symbol	Value	String
E4M3	`0`	e4m3
E5M2	`1`	e5m2
UE8M0	`2`	ue8m0

DotAccumulateType ¶

NVVM DotAccumulateType

Cases: ¶

Symbol	Value	String
SIGNED	`1`	signed
UNSIGNED	`0`	unsigned

FPRoundingMode ¶

NVVM FPRoundingMode kind

Cases: ¶

Symbol	Value	String
NONE	`0`	none
RN	`1`	rn
RM	`2`	rm
RP	`3`	rp
RZ	`4`	rz
RNA	`5`	rna

LoadCacheModifierKind ¶

NVVM load cache modifier kind

Cases: ¶

Symbol	Value	String
CA	`0`	ca
CG	`1`	cg
CS	`2`	cs
LU	`3`	lu
CV	`4`	cv

MMAB1Op ¶

MMA binary operations

Cases: ¶

Symbol	Value	String
none	`0`	none
xor_popc	`1`	xor_popc
and_popc	`2`	and_popc

MMAFrag ¶

NVVM MMA frag type

Cases: ¶

Symbol	Value	String
a	`0`	a
b	`1`	b
c	`2`	c

MMAIntOverflow ¶

MMA overflow options

Cases: ¶

Symbol	Value	String
satfinite	`1`	satfinite
wrapped	`0`	wrapped

MMALayout ¶

NVVM MMA layout

Cases: ¶

Symbol	Value	String
row	`0`	row
col	`1`	col

MMATypes ¶

NVVM MMA types

Cases: ¶

Symbol	Value	String
f16	`0`	f16
f32	`1`	f32
tf32	`2`	tf32
bf16	`9`	bf16
s8	`4`	s8
u8	`3`	u8
s32	`5`	s32
s4	`8`	s4
u4	`7`	u4
b1	`6`	b1
f64	`10`	f64

MatchSyncKind ¶

NVVM match sync kind

Cases: ¶

Symbol	Value	String
any	`0`	any
all	`1`	all

MemScopeKind ¶

NVVM Memory Scope kind

Cases: ¶

Symbol	Value	String
CTA	`0`	cta
CLUSTER	`1`	cluster
GPU	`2`	gpu
SYS	`3`	sys

PrefetchCacheLevel ¶

NVVM Prefetch Cache Level

Cases: ¶

Symbol	Value	String
L1	`0`	L1
L2	`1`	L2

ProxyKind ¶

Proxy kind

Cases: ¶

Symbol	Value	String
alias	`0`	alias
async	`1`	async
async_global	`2`	async.global
async_shared	`3`	async.shared
TENSORMAP	`4`	tensormap
GENERIC	`5`	generic

ReduxKind ¶

NVVM redux kind

Cases: ¶

Symbol	Value	String
ADD	`1`	add
AND	`2`	and
MAX	`3`	max
MIN	`4`	min
OR	`5`	or
UMAX	`6`	umax
UMIN	`7`	umin
XOR	`8`	xor
FMIN	`9`	fmin
FMAX	`10`	fmax

SaturationMode ¶

NVVM SaturationMode kind

Cases: ¶

Symbol	Value	String
NONE	`0`	none
SATFINITE	`1`	satfinite

SetMaxRegisterAction ¶

NVVM set max register action

Cases: ¶

Symbol	Value	String
decrease	`1`	decrease
increase	`0`	increase

SharedSpace ¶

Shared memory space

Cases: ¶

Symbol	Value	String
shared_cta	`0`	cta
shared_cluster	`1`	cluster

ShflKind ¶

NVVM shuffle kind

Cases: ¶

Symbol	Value	String
bfly	`0`	bfly
up	`1`	up
down	`2`	down
idx	`3`	idx

TMAReduxKind ¶

NVVM TMA redux kind

Cases: ¶

Symbol	Value	String
ADD	`0`	add
MAX	`2`	max
MIN	`1`	min
INC	`3`	inc
DEC	`4`	dec
AND	`5`	and
OR	`6`	or
XOR	`7`	xor

TMAStoreMode ¶

NVVM TMA Store Mode

Cases: ¶

Symbol	Value	String
TILE	`0`	tile
IM2COL	`1`	im2col

Tcgen05CpMulticast ¶

Tcgen05 cp multicast

Cases: ¶

Symbol	Value	String
NONE	`0`	none
WARPX2_02_13	`1`	warpx2_02_13
WARPX2_01_23	`2`	warpx2_01_23
WARPX4	`3`	warpx4

Tcgen05CpShape ¶

Tcgen05 cp shapes

Cases: ¶

Symbol	Value	String
SHAPE_128x256b	`0`	shape_128x256b
SHAPE_4x256b	`1`	shape_4x256b
SHAPE_128x128b	`2`	shape_128x128b
SHAPE_64x128b	`3`	shape_64x128b
SHAPE_32x128b	`4`	shape_32x128b

Tcgen05CpSrcFormat ¶

Tcgen05 cp source format

Cases: ¶

Symbol	Value	String
B6x16_P32	`0`	b6x16_p32
B4x16_P64	`1`	b4x16_p64

Tcgen05FenceKind ¶

NVVM Tcgen05 fence kind

Cases: ¶

Symbol	Value	String
BEFORE_THREAD_SYNC	`0`	before
AFTER_THREAD_SYNC	`1`	after

Tcgen05GroupKind ¶

NVVM Tcgen05 group kind

Cases: ¶

Symbol	Value	String
CTA_1	`0`	cta_1
CTA_2	`1`	cta_2

Tcgen05LdStShape ¶

Allowed 32-bit signless integer cases: 0, 1, 2, 3, 4

Cases: ¶

Symbol	Value	String
SHAPE_16X64B	`0`	shape_16x64b
SHAPE_16X128B	`1`	shape_16x128b
SHAPE_16X256B	`2`	shape_16x256b
SHAPE_32X32B	`3`	shape_32x32b
SHAPE_16X32BX2	`4`	shape_16x32bx2

Tcgen05WaitKind ¶

NVVM Tcgen05 wait kind

Cases: ¶

Symbol	Value	String
LOAD	`0`	load
STORE	`1`	store

VoteSyncKind ¶

NVVM vote sync kind

Cases: ¶

Symbol	Value	String
any	`0`	any
all	`1`	all
ballot	`2`	ballot
uni	`3`	uni

WGMMAScaleIn ¶

WGMMA overflow options

Cases: ¶

Symbol	Value	String
one	`1`	one
neg	`-1`	neg

WGMMAScaleOut ¶

WGMMA input predicate

Cases: ¶

Symbol	Value	String
zero	`0`	zero
one	`1`	one

WGMMATypes ¶

NVVM WGMMA types

Cases: ¶

Symbol	Value	String
f16	`0`	f16
tf32	`1`	tf32
u8	`2`	u8
s8	`3`	s8
b1	`4`	b1
bf16	`5`	bf16
e4m3	`6`	e4m3
e5m2	`7`	e5m2
f32	`8`	f32
s32	`9`	s32

Symbol	Value	String
ADD	`0`	add
MAX	`2`	max
MIN	`1`	min
INC	`3`	inc
DEC	`4`	dec
AND	`5`	and
OR	`6`	or
XOR	`7`	xor

Symbol	Value	String
ADD	`0`	add
MAX	`2`	max
MIN	`1`	min
INC	`3`	inc
DEC	`4`	dec
AND	`5`	and
OR	`6`	or
XOR	`7`	xor

'nvvm' Dialect

Operations ¶

nvvm.bar.warp.sync (NVVM::SyncWarpOp) ¶

Operands: ¶

nvvm.barrier (NVVM::BarrierOp) ¶

Operands: ¶

nvvm.barrier.arrive (NVVM::BarrierArriveOp) ¶

Operands: ¶

nvvm.barrier0 (NVVM::Barrier0Op) ¶

nvvm.breakpoint (NVVM::Breakpoint) ¶

nvvm.cluster.arrive (NVVM::ClusterArriveOp) ¶

Attributes: ¶

nvvm.cluster.arrive.relaxed (NVVM::ClusterArriveRelaxedOp) ¶

Attributes: ¶

nvvm.cluster.wait (NVVM::ClusterWaitOp) ¶

Attributes: ¶

nvvm.convert.bf16x2.to.f8x2 (NVVM::ConvertBF16x2ToF8x2Op) ¶

Attributes: ¶

Operands: ¶

Results: ¶

nvvm.convert.f16x2.to.f8x2 (NVVM::ConvertF16x2ToF8x2Op) ¶

Attributes: ¶

Operands: ¶

Results: ¶

nvvm.convert.f32x2.to.f6x2 (NVVM::ConvertF32x2ToF6x2Op) ¶

Attributes: ¶

Operands: ¶

Results: ¶

nvvm.convert.f32x2.to.f8x2 (NVVM::ConvertF32x2ToF8x2Op) ¶

Attributes: ¶

Operands: ¶

Results: ¶

nvvm.convert.float.to.tf32 (NVVM::ConvertFloatToTF32Op) ¶

Attributes: ¶

Operands: ¶

Results: ¶

nvvm.cp.async.bulk.commit.group (NVVM::CpAsyncBulkCommitGroupOp) ¶

nvvm.cp.async.bulk.global.shared.cta (NVVM::CpAsyncBulkSharedCTAToGlobalOp) ¶

Operands: ¶

nvvm.cp.async.bulk.prefetch (NVVM::CpAsyncBulkPrefetchOp) ¶

Operands: ¶

nvvm.cp.async.bulk.shared.cluster.global (NVVM::CpAsyncBulkGlobalToSharedClusterOp) ¶

Operands: ¶

nvvm.cp.async.bulk.shared.cluster.shared.cta (NVVM::CpAsyncBulkSharedCTAToSharedClusterOp) ¶

Operands: ¶

nvvm.cp.async.bulk.tensor.global.shared.cta (NVVM::CpAsyncBulkTensorSharedCTAToGlobalOp) ¶

Operands: ¶

nvvm.cp.async.bulk.tensor.prefetch (NVVM::CpAsyncBulkTensorPrefetchOp) ¶

Operands: ¶

nvvm.cp.async.bulk.tensor.reduce (NVVM::CpAsyncBulkTensorReduceOp) ¶

Attributes: ¶

Operands: ¶

nvvm.cp.async.bulk.tensor.shared.cluster.global (NVVM::CpAsyncBulkTensorGlobalToSharedClusterOp) ¶

Operands: ¶

nvvm.cp.async.bulk.wait_group (NVVM::CpAsyncBulkWaitGroupOp) ¶

Attributes: ¶

nvvm.cp.async.commit.group (NVVM::CpAsyncCommitGroupOp) ¶

nvvm.cp.async.mbarrier.arrive (NVVM::CpAsyncMBarrierArriveOp) ¶

Attributes: ¶

Operands: ¶

nvvm.cp.async.mbarrier.arrive.shared (NVVM::CpAsyncMBarrierArriveSharedOp) ¶

Attributes: ¶

Operands: ¶

nvvm.cp.async.shared.global (NVVM::CpAsyncOp) ¶

Attributes: ¶

Operands: ¶

nvvm.cp.async.wait.group (NVVM::CpAsyncWaitGroupOp) ¶

Attributes: ¶

nvvm.dot.accumulate.2way (NVVM::DotAccumulate2WayOp) ¶

Attributes: ¶

Operands: ¶

Results: ¶

nvvm.dot.accumulate.4way (NVVM::DotAccumulate4WayOp) ¶

Attributes: ¶

Operands: ¶

Results: ¶

nvvm.elect.sync (NVVM::ElectSyncOp) ¶

Results: ¶

nvvm.exit (NVVM::Exit) ¶

nvvm.fence.mbarrier.init (NVVM::FenceMbarrierInitOp) ¶

`nvvm.bar.warp.sync` (NVVM::SyncWarpOp) ¶

`nvvm.barrier` (NVVM::BarrierOp) ¶

`nvvm.barrier.arrive` (NVVM::BarrierArriveOp) ¶

`nvvm.barrier0` (NVVM::Barrier0Op) ¶

`nvvm.breakpoint` (NVVM::Breakpoint) ¶

`nvvm.cluster.arrive` (NVVM::ClusterArriveOp) ¶

`nvvm.cluster.arrive.relaxed` (NVVM::ClusterArriveRelaxedOp) ¶

`nvvm.cluster.wait` (NVVM::ClusterWaitOp) ¶

`nvvm.convert.bf16x2.to.f8x2` (NVVM::ConvertBF16x2ToF8x2Op) ¶

`nvvm.convert.f16x2.to.f8x2` (NVVM::ConvertF16x2ToF8x2Op) ¶

`nvvm.convert.f32x2.to.f6x2` (NVVM::ConvertF32x2ToF6x2Op) ¶

`nvvm.convert.f32x2.to.f8x2` (NVVM::ConvertF32x2ToF8x2Op) ¶

`nvvm.convert.float.to.tf32` (NVVM::ConvertFloatToTF32Op) ¶

`nvvm.cp.async.bulk.commit.group` (NVVM::CpAsyncBulkCommitGroupOp) ¶

`nvvm.cp.async.bulk.global.shared.cta` (NVVM::CpAsyncBulkSharedCTAToGlobalOp) ¶

`nvvm.cp.async.bulk.prefetch` (NVVM::CpAsyncBulkPrefetchOp) ¶

`nvvm.cp.async.bulk.shared.cluster.global` (NVVM::CpAsyncBulkGlobalToSharedClusterOp) ¶

`nvvm.cp.async.bulk.shared.cluster.shared.cta` (NVVM::CpAsyncBulkSharedCTAToSharedClusterOp) ¶

`nvvm.cp.async.bulk.tensor.global.shared.cta` (NVVM::CpAsyncBulkTensorSharedCTAToGlobalOp) ¶

`nvvm.cp.async.bulk.tensor.prefetch` (NVVM::CpAsyncBulkTensorPrefetchOp) ¶

`nvvm.cp.async.bulk.tensor.reduce` (NVVM::CpAsyncBulkTensorReduceOp) ¶

`nvvm.cp.async.bulk.tensor.shared.cluster.global` (NVVM::CpAsyncBulkTensorGlobalToSharedClusterOp) ¶

`nvvm.cp.async.bulk.wait_group` (NVVM::CpAsyncBulkWaitGroupOp) ¶

`nvvm.cp.async.commit.group` (NVVM::CpAsyncCommitGroupOp) ¶

`nvvm.cp.async.mbarrier.arrive` (NVVM::CpAsyncMBarrierArriveOp) ¶

`nvvm.cp.async.mbarrier.arrive.shared` (NVVM::CpAsyncMBarrierArriveSharedOp) ¶

`nvvm.cp.async.shared.global` (NVVM::CpAsyncOp) ¶

`nvvm.cp.async.wait.group` (NVVM::CpAsyncWaitGroupOp) ¶

`nvvm.dot.accumulate.2way` (NVVM::DotAccumulate2WayOp) ¶

`nvvm.dot.accumulate.4way` (NVVM::DotAccumulate4WayOp) ¶

`nvvm.elect.sync` (NVVM::ElectSyncOp) ¶

`nvvm.exit` (NVVM::Exit) ¶

`nvvm.fence.mbarrier.init` (NVVM::FenceMbarrierInitOp) ¶

`nvvm.fence.proxy` (NVVM::FenceProxyOp) ¶

`nvvm.fence.proxy.acquire` (NVVM::FenceProxyAcquireOp) ¶

`nvvm.fence.proxy.release` (NVVM::FenceProxyReleaseOp) ¶

`nvvm.fence.sc.cluster` (NVVM::FenceScClusterOp) ¶

`nvvm.griddepcontrol.launch.dependents` (NVVM::GriddepcontrolLaunchDependentsOp) ¶

`nvvm.griddepcontrol.wait` (NVVM::GriddepcontrolWaitOp) ¶

`nvvm.inline_ptx` (NVVM::InlinePtxOp) ¶

`nvvm.ldmatrix` (NVVM::LdMatrixOp) ¶

`nvvm.mapa` (NVVM::MapaOp) ¶

`nvvm.match.sync` (NVVM::MatchSyncOp) ¶

`nvvm.mbarrier.arrive` (NVVM::MBarrierArriveOp) ¶

`nvvm.mbarrier.arrive.expect_tx` (NVVM::MBarrierArriveExpectTxOp) ¶

`nvvm.mbarrier.arrive.expect_tx.shared` (NVVM::MBarrierArriveExpectTxSharedOp) ¶

`nvvm.mbarrier.arrive.nocomplete` (NVVM::MBarrierArriveNocompleteOp) ¶

`nvvm.mbarrier.arrive.nocomplete.shared` (NVVM::MBarrierArriveNocompleteSharedOp) ¶

`nvvm.mbarrier.arrive.shared` (NVVM::MBarrierArriveSharedOp) ¶

`nvvm.mbarrier.init` (NVVM::MBarrierInitOp) ¶

`nvvm.mbarrier.init.shared` (NVVM::MBarrierInitSharedOp) ¶

`nvvm.mbarrier.inval` (NVVM::MBarrierInvalOp) ¶

`nvvm.mbarrier.inval.shared` (NVVM::MBarrierInvalSharedOp) ¶

`nvvm.mbarrier.test.wait` (NVVM::MBarrierTestWaitOp) ¶

`nvvm.mbarrier.test.wait.shared` (NVVM::MBarrierTestWaitSharedOp) ¶

`nvvm.mbarrier.try_wait.parity` (NVVM::MBarrierTryWaitParityOp) ¶

`nvvm.mbarrier.try_wait.parity.shared` (NVVM::MBarrierTryWaitParitySharedOp) ¶

`nvvm.mma.sync` (NVVM::MmaOp) ¶

`nvvm.prefetch` (NVVM::PrefetchOp) ¶

`nvvm.prefetch.tensormap` (NVVM::PrefetchTensorMapOp) ¶

`nvvm.rcp.approx.ftz.f` (NVVM::RcpApproxFtzF32Op) ¶

Symbol	Value	String
ADD	`0`	add
MAX	`2`	max
MIN	`1`	min
INC	`3`	inc
DEC	`4`	dec
AND	`5`	and
OR	`6`	or
XOR	`7`	xor