'x86' Dialect
Operations ¶
x86.amx.tile_load (x86::amx::TileLoadOp) ¶
Tile load operation
Syntax:
operation ::= `x86.amx.tile_load` $base `[` $indices `]` (`,` $stride^ )? attr-dict`:` type($base) `into` qualified(type($res))
Loads a tile from memory defined by a base and indices, with the
shape defined by the 2-dim vector type of the result.
The tile’s rows are populated by reading contiguous elements starting
at the base. For each tile row, the base is incremented by stride
number of elements.
The tile is loaded using the following indexing scheme:
    for row in range(tile_rows):
        mem_row = base[i0, i1, ..., iN + row * stride]
        for col in range(tile_cols):
            tile[row, col] = mem_row[col]
If the stride is not provided, then the base buffer must be at least
2-dimensional, and the stride is automatically inferred and corresponds
to the stride of the buffer’s second innermost dimension.
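The indexing scheme can be modeled with a short Python sketch. It assumes the memref is a flat Python list and that `base_offset` is the already linearized position of `base[i0, ..., iN]`; the function and parameter names are illustrative and not part of the dialect.

```python
def tile_load(buffer, base_offset, stride, rows, cols):
    """Read a rows x cols tile from a flat buffer (illustrative model only)."""
    tile = []
    for row in range(rows):
        # Each tile row starts `stride` elements after the previous one.
        start = base_offset + row * stride
        tile.append(buffer[start:start + cols])
    return tile


# A 4x8 row-major buffer flattened to 1-D. For a 2-D memref of this shape,
# the inferred stride would be 8 (stride of the second innermost dimension).
buffer = list(range(32))
tile = tile_load(buffer, base_offset=2, stride=8, rows=3, cols=4)
print(tile)
```

With an explicit stride on a 1-D memref, the same function applies unchanged; only the source of the `stride` value differs.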
The operation is eventually lowered into the “tileloadd” instruction with the corresponding tile configuration.
With the write memory effect, each x86.amx.tile_load operation serves as
a compilation hint to use a separate tile register.
Example:
// Tile load from a 2-D memref with implicit stride.
%0 = x86.amx.tile_load %arg0[%c0, %c0] : memref<?x?xi8> into !x86.amx.tile<16x64xi8>
// Tile load from a 1-D memref with explicit stride.
%0 = x86.amx.tile_load %arg0[%c0], %stride : memref<?xi8> into !x86.amx.tile<16x64xi8>
Traits: AttrSizedOperandSegments
Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface), OneToOneIntrinsicOpInterface, X86IntrinsicOpInterface
Effects: MemoryEffects::Effect{MemoryEffects::Write on ::mlir::SideEffects::DefaultResource}
Operands: ¶
| Operand | Description |
|---|---|
| base | memref of any type values |
| indices | variadic of index |
| stride | index |
Results: ¶
| Result | Description |
|---|---|
| res | tile of 32-bit float or 16-bit float or bfloat16 type or 32-bit signless integer or 8-bit signless integer values |
x86.amx.tile_mulf (x86::amx::TileMulFOp) ¶
Tile multiplication operation (floating-point)
Syntax:
operation ::= `x86.amx.tile_mulf` $lhs `,` $rhs `,` $acc attr-dict `:` qualified(type($lhs)) `,` qualified(type($rhs)) `,` qualified(type($acc))
Multiplies an “m x k” tile with a “k x n” tile and accumulates the results into an “m x n” destination tile. Supports “f32 <- bf16 x bf16” (with pairs of “bf16”).
The operation is eventually lowered into the “tdpbf16ps” instruction with the corresponding tile configuration.
Example:
%0 = x86.amx.tile_mulf %a, %b, %c
    : !x86.amx.tile<16x32xbf16>, !x86.amx.tile<16x32xbf16>, !x86.amx.tile<16x16xf32>
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OneToOneIntrinsicOpInterface, X86IntrinsicOpInterface
Effects: MemoryEffects::Effect{}
Operands: ¶
| Operand | Description |
|---|---|
| lhs | tile of 16-bit float or bfloat16 type values |
| rhs | tile of 16-bit float or bfloat16 type values |
| acc | tile of 32-bit float values |
Results: ¶
| Result | Description |
|---|---|
| res | tile of 32-bit float values |
x86.amx.tile_muli (x86::amx::TileMulIOp) ¶
Tile multiplication operation (integer)
Syntax:
operation ::= `x86.amx.tile_muli` $lhs (`zext` $isZextLhs^)? `,` $rhs (`zext` $isZextRhs^)? `,` $acc attr-dict `:` qualified(type($lhs)) `,` qualified(type($rhs)) `,` qualified(type($acc))
Multiplies an “m x k” tile with a “k x n” tile and accumulates the results into an “m x n” destination tile. Supports all “si32 <- s/ui8 x s/ui8” combinations: 4 bytes are packed into dwords in the columns of both source operand tiles, and zero or sign extension is selected with the attributes, defaulting to sign extension.
The operation is eventually lowered into one of the “tdpbssd”, “tdpbsud”, “tdpbusd”, or “tdpbuud” instructions with the corresponding tile configuration.
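The packed-byte accumulation can be sketched in Python as follows. This is a model under stated assumptions rather than the dialect's implementation: tiles are lists of rows holding raw byte values in 0..255, the dword packing follows the description above (4 consecutive bytes per dword), and `tile_muli`, `zext_lhs`, and `zext_rhs` are illustrative names mirroring the op and its attributes.

```python
def to_signed8(byte):
    """Interpret a raw byte (0..255) as a signed 8-bit integer."""
    return byte - 256 if byte >= 128 else byte


def wrap_i32(value):
    """Wrap a Python int to signed 32-bit two's complement."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


def tile_muli(lhs, rhs, acc, zext_lhs=False, zext_rhs=False):
    """lhs: m rows of 4*k bytes, rhs: k rows of 4*n bytes, acc: m x n i32."""
    ext_l = (lambda b: b) if zext_lhs else to_signed8
    ext_r = (lambda b: b) if zext_rhs else to_signed8
    m, k, n = len(acc), len(rhs), len(acc[0])
    res = [row[:] for row in acc]
    for i in range(m):
        for kk in range(k):
            for j in range(n):
                total = res[i][j]
                # Dot product of dword kk of lhs row i with dword j of rhs row kk.
                for p in range(4):
                    total += ext_l(lhs[i][4 * kk + p]) * ext_r(rhs[kk][4 * j + p])
                res[i][j] = wrap_i32(total)
    return res


lhs = [[1, 2, 3, 4]]                   # 1 row, one dword (k = 1)
rhs = [[1, 1, 1, 1, 0xFF, 0, 0, 0]]    # 1 row, two dwords (n = 2)
acc = [[10, 0]]
signed = tile_muli(lhs, rhs, acc)
unsigned_rhs = tile_muli(lhs, rhs, acc, zext_rhs=True)
```

The byte `0xFF` contributes `-1` under the default sign extension and `255` when the right-hand side is zero extended, which is the only difference between the four instruction variants in this model.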
Example:
%0 = x86.amx.tile_muli %a zext, %b zext, %c
    : !x86.amx.tile<16x64xi8>, !x86.amx.tile<16x64xi8>, !x86.amx.tile<16x16xi32>
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OneToOneIntrinsicOpInterface, X86IntrinsicOpInterface
Effects: MemoryEffects::Effect{}
Attributes: ¶
| Attribute | MLIR Type | Description |
|---|---|---|
| isZextLhs | ::mlir::UnitAttr | unit attribute |
| isZextRhs | ::mlir::UnitAttr | unit attribute |
Operands: ¶
| Operand | Description |
|---|---|
| lhs | tile of 8-bit signless integer values |
| rhs | tile of 8-bit signless integer values |
| acc | tile of 32-bit signless integer values |
Results: ¶
| Result | Description |
|---|---|
| res | tile of 32-bit signless integer values |
x86.amx.tile_store (x86::amx::TileStoreOp) ¶
Tile store operation
Syntax:
operation ::= `x86.amx.tile_store` $base `[` $indices `]` `,` $val (`,` $stride^ )?attr-dict `:` type($base) `,` qualified(type($val))
Stores a tile to memory defined by a base and indices, with the
shape defined by the 2-dim vector type of the value.
The tile’s rows are written contiguously to the buffer starting at
the base. For each tile row, the base is incremented by stride
number of elements.
The tile is stored using the following indexing scheme:
    for row in range(tile_rows):
        mem_row = base[i0, i1, ..., iN + row * stride]
        for col in range(tile_cols):
            mem_row[col] = tile[row, col]
If the stride is not provided, then the base buffer must be at least
2-dimensional, and the stride is automatically inferred and corresponds
to the stride of the buffer’s second innermost dimension.
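A minimal Python model of the store, assuming a flat list as the destination buffer and a precomputed linear `base_offset` (both illustrative, not part of the dialect), looks like this:

```python
def tile_store(buffer, base_offset, stride, tile):
    """Write each tile row contiguously, advancing by `stride` per row."""
    for row, values in enumerate(tile):
        start = base_offset + row * stride
        buffer[start:start + len(values)] = values


buffer = [0] * 12
tile_store(buffer, base_offset=1, stride=4, tile=[[1, 2], [3, 4], [5, 6]])
print(buffer)
```

Elements between the end of one tile row and the start of the next are left untouched, which is why a stride larger than the tile width is safe.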
The operation is eventually lowered into the “tilestored” instruction with the corresponding tile configuration.
Example:
// Tile store to a 2-D memref with implicit stride.
x86.amx.tile_store %arg1[%c0, %c0], %0 : memref<?x?xi8>, !x86.amx.tile<16x64xi8>
// Tile store to a 1-D memref with explicit stride.
x86.amx.tile_store %arg1[%c0], %0, %stride : memref<?xi8>, !x86.amx.tile<16x64xi8>
Traits: AttrSizedOperandSegments
Interfaces: OneToOneIntrinsicOpInterface, X86IntrinsicOpInterface
Operands: ¶
| Operand | Description |
|---|---|
| base | memref of any type values |
| indices | variadic of index |
| val | tile of 32-bit float or 16-bit float or bfloat16 type or 32-bit signless integer or 8-bit signless integer values |
| stride | index |
x86.amx.tile_zero (x86::amx::TileZeroOp) ¶
Tile zero operation
Syntax:
operation ::= `x86.amx.tile_zero` attr-dict `:` qualified(type($res))
Zeroes the destination tile, with the shape defined by the 2-dim vector type of the result.
The operation is eventually lowered into the “tilezero” instruction with the corresponding tile configuration.
With the write memory effect, each x86.amx.tile_zero operation serves as
a compilation hint to use a separate tile register.
Example:
%0 = x86.amx.tile_zero : !x86.amx.tile<16x16xbf16>
Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface), OneToOneIntrinsicOpInterface, X86IntrinsicOpInterface
Effects: MemoryEffects::Effect{MemoryEffects::Write on ::mlir::SideEffects::DefaultResource}
Results: ¶
| Result | Description |
|---|---|
| res | tile of 32-bit float or 16-bit float or bfloat16 type or 32-bit signless integer or 8-bit signless integer values |
x86.avx.bcst_to_f32.packed (x86::BcstToPackedF32Op) ¶
AVX: Broadcasts BF16/F16 into packed F32 Data.
Syntax:
operation ::= `x86.avx.bcst_to_f32.packed` $a attr-dict`:` type($a)`->` type($dst)
From the Intel Intrinsics Guide: ¶
Convert scalar BF16 or F16 (16-bit) floating-point element stored at memory locations
starting at location __A to a single-precision (32-bit) floating-point,
broadcast it to packed single-precision (32-bit) floating-point elements,
and store the results in dst.
Example:
%dst = x86.avx.bcst_to_f32.packed %a : memref<1xbf16> -> vector<8xf32>
%dst = x86.avx.bcst_to_f32.packed %a : memref<1xf16> -> vector<8xf32>
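The convert-then-broadcast behavior can be illustrated in Python. The sketch assumes the 16-bit source element is given as its raw bit pattern; the helper names are hypothetical. BF16 is widened by placing its bits in the upper half of an F32 word, and F16 is decoded with the `struct` module's half-precision format.

```python
import struct


def bf16_bits_to_f32(bits):
    """Widen a BF16 bit pattern to F32 by shifting it into the upper 16 bits."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


def f16_bits_to_f32(bits):
    """Decode an IEEE half-precision bit pattern."""
    return struct.unpack("<e", struct.pack("<H", bits & 0xFFFF))[0]


def bcst_to_f32_packed(bits, length=8, is_bf16=True):
    """Convert one 16-bit element and replicate it `length` times."""
    convert = bf16_bits_to_f32 if is_bf16 else f16_bits_to_f32
    return [convert(bits)] * length


from_bf16 = bcst_to_f32_packed(0x3FC0, length=8)                 # BF16 1.5
from_f16 = bcst_to_f32_packed(0x3E00, length=4, is_bf16=False)   # F16 1.5
```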
Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface), OneToOneIntrinsicOpInterface, X86IntrinsicOpInterface
Effects: MemoryEffects::Effect{MemoryEffects::Read on ::mlir::SideEffects::DefaultResource}
Operands: ¶
| Operand | Description |
|---|---|
| a | memref of bfloat16 type or 16-bit float values |
Results: ¶
| Result | Description |
|---|---|
| dst | vector of 32-bit float values of length 4/8 |
x86.avx.cvt.packed.even.indexed_to_f32 (x86::CvtPackedEvenIndexedToF32Op) ¶
AVX: Convert packed BF16/F16 even-indexed elements into packed F32 Data.
Syntax:
operation ::= `x86.avx.cvt.packed.even.indexed_to_f32` $a attr-dict`:` type($a)`->` type($dst)
From the Intel Intrinsics Guide: ¶
Convert packed BF16 or F16 (16-bit) floating-point even-indexed elements stored at
memory locations starting at location __A to packed single-precision
(32-bit) floating-point elements, and store the results in dst.
Example:
%dst = x86.avx.cvt.packed.even.indexed_to_f32 %a : memref<16xbf16> -> vector<8xf32>
%dst = x86.avx.cvt.packed.even.indexed_to_f32 %a : memref<16xf16> -> vector<8xf32>
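As a rough illustration, the element selection can be written as a strided slice in Python. The sketch assumes BF16 inputs given as raw bit patterns and uses a hypothetical `parity` parameter (0 for even-indexed, 1 for odd-indexed) so that one function also covers the odd-indexed variant of this op.

```python
import struct


def bf16_bits_to_f32(bits):
    """Widen a BF16 bit pattern to F32 by shifting it into the upper 16 bits."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


def cvt_packed_indexed_to_f32(src_bits, parity):
    """Convert every second element, starting at index `parity`."""
    return [bf16_bits_to_f32(bits) for bits in src_bits[parity::2]]


# 16 BF16 elements alternating 1.0 (0x3F80) and 2.0 (0x4000).
src = [0x3F80, 0x4000] * 8
even = cvt_packed_indexed_to_f32(src, parity=0)
odd = cvt_packed_indexed_to_f32(src, parity=1)
```

A 16-element source yields an 8-element result, matching the `memref<16xbf16> -> vector<8xf32>` shapes in the example.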
Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface), OneToOneIntrinsicOpInterface, X86IntrinsicOpInterface
Effects: MemoryEffects::Effect{MemoryEffects::Read on ::mlir::SideEffects::DefaultResource}
Operands: ¶
| Operand | Description |
|---|---|
| a | memref of bfloat16 type or 16-bit float values |
Results: ¶
| Result | Description |
|---|---|
| dst | vector of 32-bit float values of length 4/8 |
x86.avx.cvt.packed.odd.indexed_to_f32 (x86::CvtPackedOddIndexedToF32Op) ¶
AVX: Convert packed BF16/F16 odd-indexed elements into packed F32 Data.
Syntax:
operation ::= `x86.avx.cvt.packed.odd.indexed_to_f32` $a attr-dict`:` type($a)`->` type($dst)
From the Intel Intrinsics Guide: ¶
Convert packed BF16 or F16 (16-bit) floating-point odd-indexed elements stored at
memory locations starting at location __A to packed single-precision
(32-bit) floating-point elements, and store the results in dst.
Example:
%dst = x86.avx.cvt.packed.odd.indexed_to_f32 %a : memref<16xbf16> -> vector<8xf32>
%dst = x86.avx.cvt.packed.odd.indexed_to_f32 %a : memref<16xf16> -> vector<8xf32>
Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface), OneToOneIntrinsicOpInterface, X86IntrinsicOpInterface
Effects: MemoryEffects::Effect{MemoryEffects::Read on ::mlir::SideEffects::DefaultResource}
Operands: ¶
| Operand | Description |
|---|---|
| a | memref of bfloat16 type or 16-bit float values |
Results: ¶
| Result | Description |
|---|---|
| dst | vector of 32-bit float values of length 4/8 |
x86.avx.dot.i8 (x86::DotInt8Op) ¶
Dot Int8 op
Syntax:
operation ::= `x86.avx.dot.i8` $w `,` $a `,` $b attr-dict `:` type($a) `->` type($w)
The dot op is an AVX2-Int8 specific op that can lower to the proper
LLVMAVX2-INT8 operation llvm.vpdpbssd depending on the width of MLIR
vectors it is applied to.
From the Intel Intrinsics Guide: ¶
Multiply groups of 4 adjacent pairs of signed 8-bit integers in a with
corresponding signed 8-bit integers in b, producing 4 intermediate signed 16-bit
results. Sum these 4 results with the corresponding 32-bit integer in w, and
store the packed 32-bit results in dst.
Example:
%dst = x86.avx.dot.i8 %w, %a, %b : vector<32xi8> -> vector<8xi32>
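The grouping of 4 adjacent byte pairs per 32-bit lane can be sketched in Python. The function name is illustrative, inputs are assumed to be already sign-interpreted integers in the range -128..127, and the accumulation is wrapped to 32 bits as an assumption about overflow behavior.

```python
def wrap_i32(value):
    """Wrap a Python int to signed 32-bit two's complement."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


def dot_i8(w, a, b):
    """dst[j] = w[j] + sum of a[4j+i] * b[4j+i] for i in 0..3."""
    return [
        wrap_i32(w[j] + sum(a[4 * j + i] * b[4 * j + i] for i in range(4)))
        for j in range(len(w))
    ]


w = [1, 2, 3, 4]
a = [1] * 4 + [2] * 4 + [-1] * 4 + [127] * 4
b = [1, 2, 3, 4] * 4
dst = dot_i8(w, a, b)
```

The same function models the 512-bit form by passing 64-element `a` and `b` with a 16-element `w`.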
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OneToOneIntrinsicOpInterface, X86IntrinsicOpInterface
Effects: MemoryEffects::Effect{}
Operands: ¶
| Operand | Description |
|---|---|
| w | vector of 32-bit signless integer values of length 4/8 |
| a | vector of 8-bit signless integer values of length 16/32 |
| b | vector of 8-bit signless integer values of length 16/32 |
Results: ¶
| Result | Description |
|---|---|
| dst | vector of 32-bit signless integer values of length 4/8 |
x86.avx.intr.dot (x86::DotOp) ¶
Dot
Syntax:
operation ::= `x86.avx.intr.dot` $a `,` $b attr-dict `:` type($res)
Computes the 4-way dot products of the lower and higher parts of the source vectors and broadcasts the two results to the lower and higher elements of the destination vector, respectively. Adding one element of the lower part to one element of the higher part in the destination vector yields the full dot product of the two source vectors.
Example:
%0 = x86.avx.intr.dot %a, %b : vector<8xf32>
%1 = vector.extract %0[%i0] : f32 from vector<8xf32>
%2 = vector.extract %0[%i4] : f32 from vector<8xf32>
%d = arith.addf %1, %2 : f32
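The per-half broadcast described above can be modeled in Python. This sketch uses Python floats (double precision) rather than true 32-bit arithmetic, and `avx_dot` is an illustrative name.

```python
def avx_dot(a, b):
    """4-way dot product per half, broadcast to all 4 elements of that half."""
    res = []
    for lane in (0, 4):
        partial = sum(x * y for x, y in zip(a[lane:lane + 4], b[lane:lane + 4]))
        res.extend([partial] * 4)
    return res


a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [1.0] * 8
res = avx_dot(a, b)
# Adding one lower and one upper element gives the full 8-element dot product.
full = res[0] + res[4]
```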
Traits: AlwaysSpeculatableImplTrait, SameOperandsAndResultType
Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OneToOneIntrinsicOpInterface, X86IntrinsicOpInterface
Effects: MemoryEffects::Effect{}
Operands: ¶
| Operand | Description |
|---|---|
| a | vector of 32-bit float values of length 8 |
| b | vector of 32-bit float values of length 8 |
Results: ¶
| Result | Description |
|---|---|
| res | vector of 32-bit float values of length 8 |
x86.avx.rsqrt (x86::RsqrtOp) ¶
Rsqrt
Syntax:
operation ::= `x86.avx.rsqrt` $a attr-dict `:` type($a)
Traits: AlwaysSpeculatableImplTrait, SameOperandsAndResultType
Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OneToOneIntrinsicOpInterface, X86IntrinsicOpInterface
Effects: MemoryEffects::Effect{}
Operands: ¶
| Operand | Description |
|---|---|
| a | vector of 32-bit float values of length 8 |
Results: ¶
| Result | Description |
|---|---|
| b | vector of 32-bit float values of length 8 |
x86.avx10.dot.i8 (x86::AVX10DotInt8Op) ¶
AVX10 Dot Int8 op
Syntax:
operation ::= `x86.avx10.dot.i8` $w `,` $a `,` $b attr-dict `:` type($a) `->` type($w)
The dot op is an AVX10-Int8 specific op that can lower to the proper
LLVMAVX10-INT8 operation llvm.vpdpbssd.512.
Multiply groups of 4 adjacent pairs of signed 8-bit integers in a with
corresponding signed 8-bit integers in b, producing 4 intermediate signed 16-bit
results. Sum these 4 results with the corresponding 32-bit integer in w, and
store the packed 32-bit results in dst.
Example:
%dst = x86.avx10.dot.i8 %w, %a, %b : vector<64xi8> -> vector<16xi32>
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OneToOneIntrinsicOpInterface, X86IntrinsicOpInterface
Effects: MemoryEffects::Effect{}
Operands: ¶
| Operand | Description |
|---|---|
| w | vector of 32-bit signless integer values of length 16 |
| a | vector of 8-bit signless integer values of length 64 |
| b | vector of 8-bit signless integer values of length 64 |
Results: ¶
| Result | Description |
|---|---|
| dst | vector of 32-bit signless integer values of length 16 |
x86.avx512.cvt.packed.f32_to_bf16 (x86::CvtPackedF32ToBF16Op) ¶
Convert packed F32 to packed BF16 Data.
Syntax:
operation ::= `x86.avx512.cvt.packed.f32_to_bf16` $a attr-dict `:` type($a) `->` type($dst)
The cvt.packed.f32_to_bf16 op is an AVX512-BF16 specific op that can lower
to the proper LLVMAVX512BF16 operation llvm.cvtneps2bf16 depending on
the width of MLIR vectors it is applied to.
From the Intel Intrinsics Guide: ¶
Convert packed single-precision (32-bit) floating-point elements in a to
packed BF16 (16-bit) floating-point elements, and store the results in dst.
Example:
%dst = x86.avx512.cvt.packed.f32_to_bf16 %a : vector<8xf32> -> vector<8xbf16>
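A bit-level sketch of the narrowing conversion in Python is shown below. It assumes round-to-nearest-even, as the "ne" in the intrinsic name suggests, and quiets NaNs; hardware-specific handling of denormals is not modeled. The function name is hypothetical.

```python
import struct


def f32_to_bf16_bits(x):
    """Return the BF16 bit pattern for x using round-to-nearest-even."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    is_nan = (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0
    if is_nan:
        # Keep the sign and exponent, force a quiet-NaN mantissa bit.
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Add 0x7FFF plus the lowest kept bit so exact ties round to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


one = f32_to_bf16_bits(1.0)
tie_down = f32_to_bf16_bits(1.00390625)   # halfway case, kept bit is even
tie_up = f32_to_bf16_bits(1.01171875)     # halfway case, kept bit is odd
```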
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface), OneToOneIntrinsicOpInterface, X86IntrinsicOpInterface
Effects: MemoryEffects::Effect{}
Operands: ¶
| Operand | Description |
|---|---|
| a | vector of 32-bit float values of length 8/16 |
Results: ¶
| Result | Description |
|---|---|
| dst | vector of bfloat16 type values of length 8/16 |
x86.avx512.dot (x86::DotBF16Op) ¶
Dot BF16 op
Syntax:
operation ::= `x86.avx512.dot` $src `,` $a `,` $b attr-dict `:` type($a) `->` type($src)
The dot op is an AVX512-BF16 specific op that can lower to the proper
LLVMAVX512BF16 operation llvm.dpbf16ps depending on the width of MLIR
vectors it is applied to.
From the Intel Intrinsics Guide: ¶
Compute dot-product of BF16 (16-bit) floating-point pairs in a and b,
accumulating the intermediate single-precision (32-bit) floating-point
elements with elements in src, and store the results in dst.
Example:
%dst = x86.avx512.dot %src, %a, %b : vector<32xbf16> -> vector<16xf32>
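The pairing of BF16 elements can be sketched in Python. The model assumes the BF16 inputs have already been widened to ordinary floats and ignores intermediate rounding; `dot_bf16` is an illustrative name.

```python
def dot_bf16(src, a, b):
    """dst[j] = src[j] + a[2j] * b[2j] + a[2j+1] * b[2j+1]."""
    return [
        src[j] + a[2 * j] * b[2 * j] + a[2 * j + 1] * b[2 * j + 1]
        for j in range(len(src))
    ]


src = [1.0, 2.0, 3.0, 4.0]
a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
b = [0.5] * 8
dst = dot_bf16(src, a, b)
```

Each F32 result lane consumes two BF16 lanes from each input, which is why `a` and `b` have twice as many elements as `src`.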
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OneToOneIntrinsicOpInterface, X86IntrinsicOpInterface
Effects: MemoryEffects::Effect{}
Operands: ¶
| Operand | Description |
|---|---|
| src | vector of 32-bit float values of length 4/8/16 |
| a | vector of bfloat16 type values of length 8/16/32 |
| b | vector of bfloat16 type values of length 8/16/32 |
Results: ¶
| Result | Description |
|---|---|
| dst | vector of 32-bit float values of length 4/8/16 |
x86.avx512.mask.compress (x86::MaskCompressOp) ¶
Masked compress op
Syntax:
operation ::= `x86.avx512.mask.compress` $k `,` $a (`,` $src^)? attr-dict `:` type($dst) (`,` type($src)^)?
The mask.compress op is an AVX512 specific op that can lower to the
llvm.mask.compress instruction. Instead of src, a constant vector
attribute constant_src may be specified. If neither src nor
constant_src is specified, the remaining elements in the result vector are
set to zero.
From the Intel Intrinsics Guide: ¶
Contiguously store the active integer/floating-point elements in a (those
with their respective bit set in writemask k) to dst, and pass through the
remaining elements from src.
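The compress semantics can be modeled with a few lines of Python. The sketch represents the mask as a list of 0/1 values and uses an optional `src` argument to stand in for either the `src` operand or `constant_src`; the function name is illustrative.

```python
def mask_compress(k, a, src=None):
    """Pack active elements of `a` to the front; fill the rest from `src` or zero."""
    dst = list(src) if src is not None else [0] * len(a)
    active = [value for bit, value in zip(k, a) if bit]
    dst[:len(active)] = active
    return dst


k = [1, 0, 1, 1, 0, 0, 0, 1]
a = [10, 11, 12, 13, 14, 15, 16, 17]
with_src = mask_compress(k, a, src=[-1] * 8)
zero_filled = mask_compress(k, a)
```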
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OneToOneIntrinsicOpInterface, X86IntrinsicOpInterface
Effects: MemoryEffects::Effect{}
Attributes: ¶
| Attribute | MLIR Type | Description |
|---|---|---|
| constant_src | ::mlir::ElementsAttr | constant vector/tensor attribute |
Operands: ¶
| Operand | Description |
|---|---|
| k | vector of 1-bit signless integer values of length 16/8 |
| a | vector of 32-bit float or 32-bit signless integer or 64-bit float or 64-bit signless integer values of length 16/8 |
| src | vector of 32-bit float or 32-bit signless integer or 64-bit float or 64-bit signless integer values of length 16/8 |
Results: ¶
| Result | Description |
|---|---|
| dst | vector of 32-bit float or 32-bit signless integer or 64-bit float or 64-bit signless integer values of length 16/8 |
x86.avx512.mask.rndscale (x86::MaskRndScaleOp) ¶
Masked roundscale op
Syntax:
operation ::= `x86.avx512.mask.rndscale` $src `,` $k `,` $a `,` $imm `,` $rounding attr-dict `:` type($dst)
The mask.rndscale op is an AVX512 specific op that can lower to the proper
LLVMAVX512 operation: llvm.mask.rndscale.ps.512 or
llvm.mask.rndscale.pd.512 depending on the type of vectors it
is applied to.
From the Intel Intrinsics Guide: ¶
Round packed floating-point elements in a to the number of fraction bits
specified by imm, and store the results in dst using writemask k
(elements are copied from src when the corresponding mask bit is not set).
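A simplified Python model of the masked rounding is sketched below. It makes several assumptions: the number of fraction bits is passed directly as `fraction_bits` instead of being decoded from an immediate, rounding is to nearest even (Python's built-in `round`), the mask is a list of 0/1 values, and inputs are finite. None of these names come from the dialect.

```python
def mask_rndscale(src, mask, a, fraction_bits):
    """Round a[i] to `fraction_bits` binary fraction bits where mask[i] is set."""
    scale = 2.0 ** fraction_bits
    return [
        round(x * scale) / scale if bit else passthrough
        for passthrough, bit, x in zip(src, mask, a)
    ]


src = [9.0, 9.0, 9.0, 9.0]
mask = [1, 1, 0, 1]
a = [1.26, 2.5, -0.74, 3.0]
dst = mask_rndscale(src, mask, a, fraction_bits=1)
```

With one fraction bit, values snap to multiples of 0.5; the third lane is masked off and keeps its `src` value.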
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OneToOneIntrinsicOpInterface, X86IntrinsicOpInterface
Effects: MemoryEffects::Effect{}
Operands: ¶
| Operand | Description |
|---|---|
| src | vector of 32-bit float or 64-bit float values of length 16/8 |
| k | 32-bit signless integer |
| a | vector of 32-bit float or 64-bit float values of length 16/8 |
| imm | 16-bit signless integer or 8-bit signless integer |
| rounding | 32-bit signless integer |
Results: ¶
| Result | Description |
|---|---|
| dst | vector of 32-bit float or 64-bit float values of length 16/8 |
x86.avx512.mask.scalef (x86::MaskScaleFOp) ¶
ScaleF op
Syntax:
operation ::= `x86.avx512.mask.scalef` $src `,` $a `,` $b `,` $k `,` $rounding attr-dict `:` type($dst)
The mask.scalef op is an AVX512 specific op that can lower to the proper
LLVMAVX512 operation: llvm.mask.scalef.ps.512 or
llvm.mask.scalef.pd.512 depending on the type of MLIR vectors it is
applied to.
From the Intel Intrinsics Guide: ¶
Scale the packed floating-point elements in a using values from b, and
store the results in dst using writemask k (elements are copied from src
when the corresponding mask bit is not set).
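Scaling here means multiplying by a power of two. A hedged Python sketch, assuming the scale is `2 ** floor(b[i])`, finite inputs, and a mask given as a list of 0/1 values (the function name is illustrative):

```python
import math


def mask_scalef(src, a, b, mask):
    """dst[i] = a[i] * 2**floor(b[i]) where mask[i] is set, else src[i]."""
    return [
        math.ldexp(x, math.floor(y)) if bit else passthrough
        for passthrough, x, y, bit in zip(src, a, b, mask)
    ]


src = [0.0, 0.0, 0.0, 0.0]
a = [1.5, 3.0, 1.0, 2.0]
b = [2.0, -1.0, 10.0, 1.7]
mask = [1, 1, 0, 1]
dst = mask_scalef(src, a, b, mask)
```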
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OneToOneIntrinsicOpInterface, X86IntrinsicOpInterface
Effects: MemoryEffects::Effect{}
Operands: ¶
| Operand | Description |
|---|---|
| src | vector of 32-bit float or 64-bit float values of length 16/8 |
| a | vector of 32-bit float or 64-bit float values of length 16/8 |
| b | vector of 32-bit float or 64-bit float values of length 16/8 |
| k | 16-bit signless integer or 8-bit signless integer |
| rounding | 32-bit signless integer |
Results: ¶
| Result | Description |
|---|---|
| dst | vector of 32-bit float or 64-bit float values of length 16/8 |
x86.avx512.vp2intersect (x86::Vp2IntersectOp) ¶
Vp2Intersect op
Syntax:
operation ::= `x86.avx512.vp2intersect` $a `,` $b attr-dict `:` type($a)
The vp2intersect op is an AVX512 specific op that can lower to the proper
LLVMAVX512 operation: llvm.vp2intersect.d.512 or
llvm.vp2intersect.q.512 depending on the type of MLIR vectors it is
applied to.
From the Intel Intrinsics Guide: ¶
Compute intersection of packed integer vectors a and b, and store
indication of match in the corresponding bit of two mask registers
specified by k1 and k2. A match in corresponding elements of a and
b is indicated by a set bit in the corresponding bit of the mask
registers.
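The two result masks can be sketched in Python as membership tests. This assumes that bit i of k1 is set when a[i] equals any element of b, and bit j of k2 is set when b[j] equals any element of a; masks are modeled as lists of 0/1 values and the function name is illustrative.

```python
def vp2intersect(a, b):
    """Return (k1, k2) match masks for the elements of a and b."""
    k1 = [int(x in b) for x in a]
    k2 = [int(y in a) for y in b]
    return k1, k2


k1, k2 = vp2intersect([1, 2, 3, 4], [4, 5, 1, 6])
```

The two masks generally differ because matching elements may sit at different positions in `a` and `b`.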
Traits: AlwaysSpeculatableImplTrait
Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface), OneToOneIntrinsicOpInterface, X86IntrinsicOpInterface
Effects: MemoryEffects::Effect{}
Operands: ¶
| Operand | Description |
|---|---|
| a | vector of 32-bit signless integer or 64-bit signless integer values of length 16/8 |
| b | vector of 32-bit signless integer or 64-bit signless integer values of length 16/8 |
Results: ¶
| Result | Description |
|---|---|
| k1 | vector of 1-bit signless integer values of length 16/8 |
| k2 | vector of 1-bit signless integer values of length 16/8 |
Types ¶
AMXTileType ¶
AMX 2D tile to be used by AMX operations.
This type represents values held in AMX tile registers. All AMX operations work on AMX tiles, and these tiles cannot be used directly by other operations. The LLVM IR type for an AMX tile is a primitive type, but MLIR attaches a shape and an element type for IR verification and for lowering to the LLVMIR dialect.
Parameters: ¶
| Parameter | C++ type | Description |
|---|---|---|
| shape | ::llvm::ArrayRef<int64_t> | |
| elementType | ::mlir::Type | 32-bit float or 16-bit float or bfloat16 type or 32-bit signless integer or 8-bit signless integer |