'ArmSME' Dialect

Basic dialect to target Arm SME.

This dialect defines custom and LLVM IR intrinsic operations that are used to target Arm Scalable Matrix Extension. Through the available conversion and ArmSME passes you can, for example, lower a linalg.matmul operation to Arm SME FMOPA (floating-point outer product) operations. See one of the in-tree end-to-end integration tests for reference:

In order to run ArmSME integration tests, include these flags in the CMake invocation when configuring LLVM and MLIR:

  -DMLIR_INCLUDE_INTEGRATION_TESTS=On
  -DMLIR_RUN_ARM_SME_TESTS=On
  -DARM_EMULATOR_EXECUTABLE=<path-to-emulator>

These tests are run “post-commit” by the clang-aarch64-sve-vla LLVM BuildBot worker.


Operations ¶


`arm_sme.copy_tile` (arm_sme::CopyTileOp) ¶

Copies an SME tile value

Syntax:

operation ::= `arm_sme.copy_tile` $tile attr-dict `:` type($result)

Copies an SME “virtual tile” value to a new SSA value. This operation is primarily intended to be used to normalize the IR prior to tile allocation.

Example:

%copy = arm_sme.copy_tile %tile : vector<[4]x[4]xf32>

Traits: AlwaysSpeculatableImplTrait

Interfaces: ArmSMETileOpInterface, ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: ¶

Operand	Description
`tile`	a vector type that fits into a SME tile

Results: ¶

Result	Description
`result`	a vector type that fits into a SME tile

`arm_sme.extract_tile_slice` (arm_sme::ExtractTileSliceOp) ¶

Extract 1-D scalable vector from slice of 2-D tile

Syntax:

operation ::= `arm_sme.extract_tile_slice` $tile `[` $tile_slice_index `]` (`layout` `` $layout^)? attr-dict
              `:` type($result) `from` type($tile)

Extracts a 1-D scalable slice from a 2-D scalable tile at the given index. A tile slice is a 1-D vector of horizontally or vertically contiguous elements within a ZA tile.

An optional tile slice layout attribute specifies whether the tile slice is horizontal (default) or vertical.

Example 1: Extract vector<[16]xi8> from tile horizontally at the given index.

%slice = arm_sme.extract_tile_slice %tile[%tile_slice_index] : vector<[16]xi8> from vector<[16]x[16]xi8>

Example 2: Extract vector<[2]xf64> from tile vertically at the given index.

%slice = arm_sme.extract_tile_slice %tile[%tile_slice_index] layout<vertical> : vector<[2]xf64> from vector<[2]x[2]xf64>
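The difference between the two layouts can be modelled in a few lines of Python. This is only an illustrative sketch of the semantics described above, not the lowering: the tile is a plain list of rows, and the function name and `layout` strings are chosen here for illustration.

```python
def extract_tile_slice(tile, index, layout="horizontal"):
    """Return the 1-D slice of a 2-D tile (a list of rows) at `index`."""
    if layout == "horizontal":
        return list(tile[index])             # one row of the tile
    if layout == "vertical":
        return [row[index] for row in tile]  # one column of the tile
    raise ValueError(f"unknown layout: {layout}")


# A 2x2 tile, the size vector<[2]x[2]xf64> has when vscale=1.
tile = [[1.0, 2.0],
        [3.0, 4.0]]
print(extract_tile_slice(tile, 1))              # [3.0, 4.0]
print(extract_tile_slice(tile, 1, "vertical"))  # [2.0, 4.0]
```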

Traits: AlwaysSpeculatableImplTrait

Interfaces: ArmSMETileOpInterface, ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute	MLIR Type	Description
`layout`	::mlir::arm_sme::TileSliceLayoutAttr	Layout of a tile slice

Operands: ¶

Operand	Description
`tile`	a vector type that fits into a SME tile
`tile_slice_index`	index

Results: ¶

Result	Description
`result`	a vector type that matches the size of a SVE vector

`arm_sme.fmopa_2way` (arm_sme::FMopa2WayOp) ¶

Floating-point sum of 2 outer products and accumulate

Syntax:

operation ::= `arm_sme.fmopa_2way` $lhs `,` $rhs
              oilist(
              `acc` `` `(` $acc `)`
              | `masks` `` `(` $lhsMask `,` $rhsMask `)`
              ) attr-dict `:` type($lhs) `,` type($rhs) `into` type($result)

This operation represents a sum of 2 widened outer products. It takes two 1-D scalable vectors as input and produces a 2-D scalable vector (ZA tile) as output.

For example (fp16 to fp32):

%result = arm_sme.fmopa_2way %lhs, %rhs :
  vector<[8]xf16>, vector<[8]xf16> into vector<[4]x[4]xf32>

The lhs encodes a matrix of shape SVLSx2 and the rhs a matrix of 2xSVLS, where SVLS (spec [1], section B2.1) is the number of 32-bit elements in a vector of SVL bits. To illustrate, below is a breakdown of this operation for fp16 to fp32, SVL=128 (i.e., vscale=1):

                      LHS                          RHS
           [A0 A1 A2 A3 A4 A5 A6 A7]    [B0 B1 B2 B3 B4 B5 B6 B7]

----------------------------------------------------------------------------

                              implicit layout

                          [A0 A1]    |
                          [A2 A3]    |    [B0 B2 B4 B6]
                          [A4 A5]    |    [B1 B3 B5 B7]
                          [A6 A7]    |

----------------------------------------------------------------------------

                              2 outer products

                  Acol0 ⊗ Brow0      |           Acol1 ⊗ Brow1
                  -------------      |           -------------
                                     |
              [B0 B2 B4 B6]          |       [B1 B3 B5 B7]
                                     |
         [A0  [A0B0 A0B2 A0B4 A0B6]  |  [A1  [A1B1 A1B3 A1B5 A1B7]
          A2  [A2B0 A2B2 A2B4 A2B6]  |   A3  [A3B1 A3B3 A3B5 A3B7]
          A4  [A4B0 A4B2 A4B4 A4B6]  |   A5  [A5B1 A5B3 A5B5 A5B7]
          A6] [A6B0 A6B2 A6B4 A6B6]  |   A7] [A7B1 A7B3 A7B5 A7B7]
                                     |

----------------------------------------------------------------------------

                          sum of 2 outer products

                       Acol0 ⊗ Brow0 + Acol1 ⊗ Brow1

             [A0B0 + A1B1 A0B2 + A1B3 A0B4 + A1B5 A0B6 + A1B7]
             [A2B0 + A3B1 A2B2 + A3B3 A2B4 + A3B5 A2B6 + A3B7]
             [A4B0 + A5B1 A4B2 + A5B3 A4B4 + A5B5 A4B6 + A5B7]
             [A6B0 + A7B1 A6B2 + A7B3 A6B4 + A7B5 A6B6 + A7B7]

----------------------------------------------------------------------------

This operation enables the folding of 2 outer products chained via the accumulator into a single outer product.

For example:

%a0_ext = arith.extf %a0 : vector<[4]xf16> to vector<[4]xf32>
%b0_ext = arith.extf %b0 : vector<[4]xf16> to vector<[4]xf32>
%a1_ext = arith.extf %a1 : vector<[4]xf16> to vector<[4]xf32>
%b1_ext = arith.extf %b1 : vector<[4]xf16> to vector<[4]xf32>

%0 = arm_sme.outerproduct %a0_ext, %b0_ext : vector<[4]xf32>, vector<[4]xf32>
%1 = arm_sme.outerproduct %a1_ext, %b1_ext acc(%0) : vector<[4]xf32>, vector<[4]xf32>

The 2 outer products in the example above can be fused into a single outer product as follows:

%a_packed = vector.interleave %a0, %a1 : vector<[4]xf16> -> vector<[8]xf16>
%b_packed = vector.interleave %b0, %b1 : vector<[4]xf16> -> vector<[8]xf16>
%0 = arm_sme.fmopa_2way %a_packed, %b_packed : vector<[8]xf16>, vector<[8]xf16> into vector<[4]x[4]xf32>

This is implemented in the -arm-sme-outer-product-fusion pass.
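The equivalence between the chained and the fused form can be checked with a small Python model. This is a sketch under simplifying assumptions: vectors are plain lists at vscale=1, integers stand in for the floating-point values, the widening step is ignored, and the helper names (`interleave`, `outerproduct`, `mopa_2way`) are hypothetical stand-ins for the corresponding operations.

```python
def interleave(a, b):
    """Model of vector.interleave: [a0, b0, a1, b1, ...]."""
    out = []
    for x, y in zip(a, b):
        out += [x, y]
    return out


def outerproduct(lhs, rhs, acc=None):
    """Outer product, optionally added to an accumulator tile."""
    if acc is None:
        acc = [[0] * len(rhs) for _ in lhs]
    return [[acc[i][j] + lhs[i] * rhs[j] for j in range(len(rhs))]
            for i in range(len(lhs))]


def mopa_2way(lhs, rhs):
    """Sum of 2 outer products on packed inputs.

    lhs is read as an SVLSx2 matrix (row-major) and rhs as a 2xSVLS
    matrix (column-major), matching the implicit layout shown above.
    """
    n = len(lhs) // 2
    return [[lhs[2 * i] * rhs[2 * j] + lhs[2 * i + 1] * rhs[2 * j + 1]
             for j in range(n)] for i in range(n)]


a0, b0 = [1, 2, 3, 4], [5, 6, 7, 8]
a1, b1 = [9, 10, 11, 12], [13, 14, 15, 16]

chained = outerproduct(a1, b1, acc=outerproduct(a0, b0))
fused = mopa_2way(interleave(a0, a1), interleave(b0, b1))
print(chained == fused)  # True
```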

Example: FP16 to FP32

%result = arm_sme.fmopa_2way %lhs, %rhs : vector<[8]xf16>, vector<[8]xf16> into vector<[4]x[4]xf32>

Example: BF16 to FP32

%result = arm_sme.fmopa_2way %lhs, %rhs : vector<[8]xbf16>, vector<[8]xbf16> into vector<[4]x[4]xf32>

Spec	Features
FMOPA (widening, 2-way, FP16 to FP32)	+sme
BFMOPA (widening, 2-way, BF16 to FP32)	+sme

[1] https://developer.arm.com/documentation/ddi0616

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ArmSMETileOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: ¶

Operand	Description
`lhs`	rank-1 scalable vector of 16-bit float or bfloat16 values of length 8
`rhs`	vector of any type values
`lhsMask`	vector of any type values
`rhsMask`	vector of any type values
`acc`	vector of any type values

Results: ¶

Result	Description
`result`	vector<[4]x[4]xf32> of 32-bit float values

`arm_sme.fmops_2way` (arm_sme::FMops2WayOp) ¶

Floating-point sum of 2 outer products and subtract

Syntax:

operation ::= `arm_sme.fmops_2way` $lhs `,` $rhs
              oilist(
              `acc` `` `(` $acc `)`
              | `masks` `` `(` $lhsMask `,` $rhsMask `)`
              ) attr-dict `:` type($lhs) `,` type($rhs) `into` type($result)

Equivalent to fmopa_2way, except that the outer products are subtracted from the destination tile.

Example: FP16 to FP32

%result = arm_sme.fmops_2way %lhs, %rhs : vector<[8]xf16>, vector<[8]xf16> into vector<[4]x[4]xf32>

Example: BF16 to FP32

%result = arm_sme.fmops_2way %lhs, %rhs : vector<[8]xbf16>, vector<[8]xbf16> into vector<[4]x[4]xf32>

Refer to fmopa_2way for a detailed description of 2-way outer products.
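As a rough illustration of the "subtract" variant, the sketch below models the arithmetic in Python with an explicit accumulator. It assumes the same packed layout as fmopa_2way, uses integers instead of floating-point values, ignores widening, and uses a deliberately tiny 2x2 tile; `mops_2way` is a hypothetical helper name.

```python
def mops_2way(lhs, rhs, acc):
    """Subtract the sum of 2 outer products from the accumulator tile."""
    n = len(lhs) // 2
    return [[acc[i][j]
             - (lhs[2 * i] * rhs[2 * j] + lhs[2 * i + 1] * rhs[2 * j + 1])
             for j in range(n)] for i in range(n)]


acc = [[100, 100],
       [100, 100]]
print(mops_2way([1, 2, 3, 4], [5, 6, 7, 8], acc))
# [[83, 77], [61, 47]]
```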

Spec	Features
FMOPS (widening, 2-way, FP16 to FP32)	+sme
BFMOPS (widening, 2-way, BF16 to FP32)	+sme

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ArmSMETileOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: ¶

Operand	Description
`lhs`	rank-1 scalable vector of 16-bit float or bfloat16 values of length 8
`rhs`	vector of any type values
`lhsMask`	vector of any type values
`rhsMask`	vector of any type values
`acc`	vector of any type values

Results: ¶

Result	Description
`result`	vector<[4]x[4]xf32> of 32-bit float values

`arm_sme.get_tile` (arm_sme::GetTileOp) ¶

Creates an undefined value of SME virtual tile type

Syntax:

operation ::= `arm_sme.get_tile` attr-dict `:` type($tile)

Creates a new SME “virtual tile” value within a function. The contents of the tile returned from this operation are undefined.

Example 1:

// Create an 8-bit element "virtual tile" value:
%za0_b = arm_sme.get_tile : vector<[16]x[16]xi8>

Example 2:

// Create two 16-bit element "virtual tile" values:
%za0_h = arm_sme.get_tile : vector<[8]x[8]xi16>
%za1_h = arm_sme.get_tile : vector<[8]x[8]xi16>

Example 3:

// Create a 128-bit element "virtual tile" value:
%za0_q = arm_sme.get_tile : vector<[1]x[1]xi128>
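The tile types in these examples follow a simple pattern: each scalable dimension is 128 divided by the element width in bits. The Python sketch below spells this out, assuming (as the fmopa_2way description does) that SVL=128 corresponds to vscale=1. The helper names are hypothetical and the SVL check is only a minimal sanity check.

```python
def virtual_tile_type(element_type, element_bits):
    """Scalable vector type of a virtual tile for the given element."""
    n = 128 // element_bits
    return f"vector<[{n}]x[{n}]x{element_type}>"


def tile_shape(element_bits, svl_bits=128):
    """Concrete (rows, columns) of a tile at a given SVL in bits."""
    if svl_bits % 128 != 0:
        raise ValueError("SVL is assumed to be a multiple of 128 bits")
    n = svl_bits // element_bits
    return (n, n)


print(virtual_tile_type("i8", 8))      # vector<[16]x[16]xi8>
print(virtual_tile_type("i16", 16))    # vector<[8]x[8]xi16>
print(virtual_tile_type("i128", 128))  # vector<[1]x[1]xi128>
print(tile_shape(32, svl_bits=256))    # (8, 8)
```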

Traits: AlwaysSpeculatableImplTrait

Interfaces: ArmSMETileOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`tile`	a vector type that fits into a SME tile

`arm_sme.insert_tile_slice` (arm_sme::InsertTileSliceOp) ¶

Insert 1-D scalable vector into slice of 2-D tile

Syntax:

operation ::= `arm_sme.insert_tile_slice` $vector `,` $tile `[` $tile_slice_index `]` (`layout` `` $layout^)?
              attr-dict `:` type($vector) `into` type($result)

Inserts a 1-D scalable vector into a slice of a 2-D scalable vector tile at the given index. The type of the 1-D scalable vector to be inserted must match the type of the tile slice. A tile slice is a 1-D vector of horizontally or vertically contiguous elements within a ZA tile. The updated tile is returned as the result.

An optional tile slice layout attribute specifies whether the tile slice is horizontal (default) or vertical.

Example 1: Insert vector<[16]xi8> into tile horizontally at the given index.

%tile_update = arm_sme.insert_tile_slice %vector, %tile[%tile_slice_index] : vector<[16]xi8> into vector<[16]x[16]xi8>

Example 2: Insert vector<[2]xf64> into tile vertically at the given index.

%tile_update = arm_sme.insert_tile_slice %vector, %tile[%tile_slice_index] layout<vertical> : vector<[2]xf64> into vector<[2]x[2]xf64>
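A minimal Python model of the insertion, shown only to illustrate that the operation yields an updated tile value rather than modifying its input. Lists stand in for vectors at vscale=1, and the function name and `layout` strings are illustrative assumptions.

```python
def insert_tile_slice(vector, tile, index, layout="horizontal"):
    """Return a copy of `tile` with `vector` written into one slice."""
    updated = [list(row) for row in tile]  # the input tile is left untouched
    if layout == "horizontal":
        updated[index] = list(vector)      # replace one row
    elif layout == "vertical":
        for row, value in zip(updated, vector):
            row[index] = value             # replace one column
    else:
        raise ValueError(f"unknown layout: {layout}")
    return updated


tile = [[0, 0],
        [0, 0]]
print(insert_tile_slice([7, 8], tile, 1))              # [[0, 0], [7, 8]]
print(insert_tile_slice([7, 8], tile, 1, "vertical"))  # [[0, 7], [0, 8]]
```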

Traits: AlwaysSpeculatableImplTrait

Interfaces: ArmSMETileOpInterface, ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute	MLIR Type	Description
`layout`	::mlir::arm_sme::TileSliceLayoutAttr	Layout of a tile slice

Operands: ¶

Operand	Description
`vector`	a vector type that matches the size of a SVE vector
`tile`	a vector type that fits into a SME tile
`tile_slice_index`	index

Results: ¶

Result	Description
`result`	a vector type that fits into a SME tile

`arm_sme.load_tile_slice` (arm_sme::LoadTileSliceOp) ¶

Tile slice load and update operation

Syntax:

operation ::= `arm_sme.load_tile_slice` $base `[` $indices `]` `,` $mask `,` $tile `,` $tile_slice_index
              (`layout` `` $layout^)? attr-dict `:` type($base) `,` type($mask) `,`
              type($result)

Loads a 1D tile slice from memory into a 2D SME “virtual tile”. The tile slice is defined by the dimension of the 2D scalable vector type pointed to by the index. A tile slice index describes where in the input tile the tile slice is loaded to. An optional tile slice layout attribute specifies whether the tile slice being loaded at the given index is horizontal (default) or vertical. The updated tile is returned as the result.

The slice of memory read is defined by a base and indices and must be contiguous. The memref must be either rank 1 or rank 2, have dynamic dimensions since the operation is scalable, and the element type must be a scalar that matches the element type of the result.

The provided mask is used to specify which elements of the tile slice will be loaded.

Example 1: Load a vector<[16]xi8> tile slice from memory into tile horizontally (default) at given index.

%tile_update = arm_sme.load_tile_slice %base[%c0], %mask, %tile, %tile_slice_index : memref<?x?xi8>, vector<[16]xi1>, vector<[16]x[16]xi8>

Example 2: Load a vector<[4]xf32> tile slice from memory into tile vertically at given index.

%tile_update = arm_sme.load_tile_slice %base[%c0], %mask, %tile, %tile_slice_index layout<vertical> : memref<?x?xf32>, vector<[4]xi1>, vector<[4]x[4]xf32>

Example 3: Load a vector<[1]xi128> tile slice from memory into tile vertically at given index.

%tile_update = arm_sme.load_tile_slice %base[%c0], %mask, %tile, %tile_slice_index layout<vertical> : memref<?x?xi128>, vector<[1]xi1>, vector<[1]x[1]xi128>

Interfaces: ArmSMETileOpInterface, InferTypeOpInterface

Attributes: ¶

Attribute	MLIR Type	Description
`layout`	::mlir::arm_sme::TileSliceLayoutAttr	Layout of a tile slice

Operands: ¶

Operand	Description
`base`	memref of any type values
`mask`	a vector type that matches the size of a SVE predicate
`tile`	a vector type that fits into a SME tile
`indices`	variadic of index
`tile_slice_index`	index

Results: ¶

Result	Description
`result`	a vector type that fits into a SME tile

`arm_sme.outerproduct` (arm_sme::OuterProductOp) ¶

Outer product with optional fused add/sub

Syntax:

operation ::= `arm_sme.outerproduct` $lhs `,` $rhs
              oilist(
              `kind` `` $kind
              | `acc` `` `(` $acc `)`
              | `masks` `` `(` $lhsMask `,` $rhsMask `)`
              ) attr-dict `:` type($lhs) `,` type($rhs)

This operation represents an outer product that fits within an SME tile. All operands must be SVE vectors and the result must be an SME tile. Unlike vector.outerproduct, masking is applied to the operands (rather than the result), which mirrors the SME instructions.

Example 1: Unmasked outerproduct (without accumulator)

// Not specifying an accumulator implicitly zeros the destination tile.
%result = arm_sme.outerproduct %lhs, %rhs : vector<[4]xf32>, vector<[4]xf32>

Example 2: Unmasked outerproduct (with accumulator)

%result = arm_sme.outerproduct %lhs, %rhs acc(%accumulator)
            : vector<[4]xf32>, vector<[4]xf32>

Example 3: Masked outerproduct

%result = arm_sme.outerproduct %lhs, %rhs masks(%lhsMask, %rhsMask)
            : vector<[4]xf32>, vector<[4]xf32>

Example 4: Masked outerproduct (with accumulator)

%result = arm_sme.outerproduct %lhs, %rhs acc(%accumulator) masks(%lhsMask, %rhsMask)
            : vector<[4]xf32>, vector<[4]xf32>
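To illustrate what masking on the operands means, here is a hedged Python sketch of the default (add) behaviour. It assumes that a tile element is updated only when both its row mask bit (from the lhs mask) and its column mask bit (from the rhs mask) are set, and that other elements keep the accumulator value, which mirrors how the predicated SME instructions behave. The `kind` attribute is not modelled, and all names are illustrative.

```python
def outerproduct(lhs, rhs, acc=None, lhs_mask=None, rhs_mask=None):
    """Outer product with optional accumulator and per-operand masks."""
    rows, cols = len(lhs), len(rhs)
    if acc is None:
        # No accumulator: start from a zeroed destination tile.
        acc = [[0] * cols for _ in range(rows)]
    if lhs_mask is None:
        lhs_mask = [True] * rows
    if rhs_mask is None:
        rhs_mask = [True] * cols
    return [[acc[i][j] + lhs[i] * rhs[j]
             if lhs_mask[i] and rhs_mask[j] else acc[i][j]
             for j in range(cols)] for i in range(rows)]


acc = [[10, 10],
       [10, 10]]
# Row 1 is masked off through the lhs mask, so it keeps the accumulator.
print(outerproduct([1, 2], [3, 4], acc=acc,
                   lhs_mask=[True, False], rhs_mask=[True, True]))
# [[13, 14], [10, 10]]
```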

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ArmSMETileOpInterface, ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute	MLIR Type	Description
`kind`	::mlir::arm_sme::CombiningKindAttr	Kind of combining function

Operands: ¶

Operand	Description
`lhs`	a vector type that matches the size of a SVE vector
`rhs`	a vector type that matches the size of a SVE vector
`lhsMask`	a vector type that matches the size of a SVE predicate
`rhsMask`	a vector type that matches the size of a SVE predicate
`acc`	a vector type that fits into a SME tile

Results: ¶

Result	Description
`result`	a vector type that fits into a SME tile

`arm_sme.smopa_2way` (arm_sme::SMopa2WayOp) ¶

Signed integer sum of 2 outer products and accumulate

Syntax:

operation ::= `arm_sme.smopa_2way` $lhs `,` $rhs
              oilist(
              `acc` `` `(` $acc `)`
              | `masks` `` `(` $lhsMask `,` $rhsMask `)`
              ) attr-dict `:` type($lhs) `,` type($rhs) `into` type($result)

Example:

%result = arm_sme.smopa_2way %lhs, %rhs : vector<[8]xi16>, vector<[8]xi16> into vector<[4]x[4]xi32>

Refer to fmopa_2way for a detailed description of 2-way outer products.

Spec	Features
SMOPA (2-way)	+sme2

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ArmSMETileOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: ¶

Operand	Description
`lhs`	rank-1 scalable vector of 16-bit signless integer values of length 8
`rhs`	vector of any type values
`lhsMask`	vector of any type values
`rhsMask`	vector of any type values
`acc`	vector of any type values

Results: ¶

Result	Description
`result`	vector<[4]x[4]xi32> of 32-bit signless integer values

`arm_sme.smopa_4way` (arm_sme::SMopa4WayOp) ¶

Signed integer sum of 4 outer products and accumulate

Syntax:

operation ::= `arm_sme.smopa_4way` $lhs `,` $rhs
              oilist(
              `acc` `` `(` $acc `)`
              | `masks` `` `(` $lhsMask `,` $rhsMask `)`
              ) attr-dict `:` type($lhs) `,` type($rhs) `into` type($result)

This operation represents a sum of 4 widened outer products. It takes two 1-D scalable vectors as input and produces a 2-D scalable vector (ZA tile) as output.

For example (i8 to i32):

%result = arm_sme.smopa_4way %lhs, %rhs :
  vector<[16]xi8>, vector<[16]xi8> into vector<[4]x[4]xi32>

The lhs encodes a matrix of shape SVLSx4 and the rhs a matrix of 4xSVLS, where SVLS (spec [1], section B2.1) is the number of 32-bit elements in a vector of SVL bits. To illustrate, below is a breakdown of this operation for i8 to i32, SVL=128 (i.e., vscale=1):

                                    LHS
          [A0 A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 A11 A12 A13 A14 A15]

                                    RHS
          [B0 B1 B2 B3 B4 B5 B6 B7 B8 B9 B10 B11 B12 B13 B14 B15]

----------------------------------------------------------------------------

                              implicit layout

                [A0   A1  A2  A3]    |    [B0 B4  B8 B12]
                [A4   A5  A6  A7]    |    [B1 B5  B9 B13]
                [A8   A9 A10 A11]    |    [B2 B6 B10 B14]
                [A12 A13 A14 A15]    |    [B3 B7 B11 B15]

----------------------------------------------------------------------------

                              4 outer products

             Acol0 ⊗ Brow0           |            Acol1 ⊗ Brow1
             -------------           |            -------------
                                     |
         [B0 B4 B8 B12]              |        [B1 B5 B9 B13]
                                     |
   [A0   [ A0B0  A0B4  A0B8  A0B12]  |  [A1   [ A1B1  A1B5  A1B9  A1B13]
    A4   [ A4B0  A4B4  A4B8  A4B12]  |   A5   [ A5B1  A5B5  A5B9  A5B13]
    A8   [ A8B0  A8B4  A8B8  A8B12]  |   A9   [ A9B1  A9B5  A9B9  A9B13]
    A12] [A12B0 A12B4 A12B8 A12B12]  |   A13] [A13B1 A13B5 A13B9 A13B13]
                                     |
             Acol2 ⊗ Brow2           |            Acol3 ⊗ Brow3
             -------------           |            -------------
                                     |
         [B2 B6 B10 B14]             |        [B3 B7 B11 B15]
                                     |
   [A2   [ A2B2  A2B6  A2B10  A2B14] |  [A3   [ A3B3  A3B7  A3B11  A3B15]
    A6   [ A6B2  A6B6  A6B10  A6B14] |   A7   [ A7B3  A7B7  A7B11  A7B15]
    A10  [A10B2 A10B6 A10B10 A10B14] |   A11  [A11B3 A11B7 A11B11 A11B15]
    A14] [A14B2 A14B6 A14B10 A14B14] |   A15] [A15B3 A15B7 A15B11 A15B15]
                                     |

----------------------------------------------------------------------------

                          sum of 4 outer products

       Acol0 ⊗ Brow0 + Acol1 ⊗ Brow1 + Acol2 ⊗ Brow2 + Acol3 ⊗ Brow3

 [ A0B0 +  A1B1 +  A2B2 +  A3B3 ... ...  A0B12 +  A1B13 +  A2B14 +  A3B15]
 [ A4B0 +  A5B1 +  A6B2 +  A7B3 ... ...  A4B12 +  A5B13 +  A6B14 +  A7B15]
 [ A8B0 +  A9B1 + A10B2 + A11B3 ... ...  A8B12 +  A9B13 + A10B14 + A11B15]
 [A12B0 + A13B1 + A14B2 + A15B3 ... ... A12B12 + A13B13 + A14B14 + A15B15]

----------------------------------------------------------------------------

This operation enables the folding of 4 outer products chained via the accumulator into a single outer product.

For example:

%a0_ext = arith.extsi %a0 : vector<[4]xi8> to vector<[4]xi32>
%b0_ext = arith.extsi %b0 : vector<[4]xi8> to vector<[4]xi32>

%a1_ext = arith.extsi %a1 : vector<[4]xi8> to vector<[4]xi32>
%b1_ext = arith.extsi %b1 : vector<[4]xi8> to vector<[4]xi32>

%a2_ext = arith.extsi %a2 : vector<[4]xi8> to vector<[4]xi32>
%b2_ext = arith.extsi %b2 : vector<[4]xi8> to vector<[4]xi32>

%a3_ext = arith.extsi %a3 : vector<[4]xi8> to vector<[4]xi32>
%b3_ext = arith.extsi %b3 : vector<[4]xi8> to vector<[4]xi32>

%0 = arm_sme.outerproduct %a0_ext, %b0_ext : vector<[4]xi32>, vector<[4]xi32>
%1 = arm_sme.outerproduct %a1_ext, %b1_ext acc(%0) : vector<[4]xi32>, vector<[4]xi32>
%2 = arm_sme.outerproduct %a2_ext, %b2_ext acc(%1) : vector<[4]xi32>, vector<[4]xi32>
%3 = arm_sme.outerproduct %a3_ext, %b3_ext acc(%2) : vector<[4]xi32>, vector<[4]xi32>

The 4 outer products in the example above can be fused into a single outer product as follows:

%lhs0 = vector.interleave %a0, %a2 : vector<[4]xi8> -> vector<[8]xi8>
%lhs1 = vector.interleave %a1, %a3 : vector<[4]xi8> -> vector<[8]xi8>
%lhs = vector.interleave %lhs0, %lhs1 : vector<[8]xi8> -> vector<[16]xi8>

%rhs0 = vector.interleave %b0, %b2 : vector<[4]xi8> -> vector<[8]xi8>
%rhs1 = vector.interleave %b1, %b3 : vector<[4]xi8> -> vector<[8]xi8>
%rhs = vector.interleave %rhs0, %rhs1 : vector<[8]xi8> -> vector<[16]xi8>

%0 = arm_sme.smopa_4way %lhs, %rhs : vector<[16]xi8>, vector<[16]xi8> into vector<[4]x[4]xi32>

This is implemented in the -arm-sme-outer-product-fusion pass.
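The nested interleaves are easy to get wrong, so the following Python sketch checks that the packed operands produce the same tile as the four chained outer products. It is a model under simplifying assumptions: lists at vscale=1, plain Python integers so that sign extension and widening are implicit, and hypothetical helper names.

```python
def interleave(a, b):
    """Model of vector.interleave: [a0, b0, a1, b1, ...]."""
    out = []
    for x, y in zip(a, b):
        out += [x, y]
    return out


def outerproduct(lhs, rhs, acc=None):
    """Outer product, optionally added to an accumulator tile."""
    if acc is None:
        acc = [[0] * len(rhs) for _ in lhs]
    return [[acc[i][j] + lhs[i] * rhs[j] for j in range(len(rhs))]
            for i in range(len(lhs))]


def mopa_4way(lhs, rhs):
    """Sum of 4 outer products on packed inputs (SVLSx4 by 4xSVLS)."""
    n = len(lhs) // 4
    return [[sum(lhs[4 * i + k] * rhs[4 * j + k] for k in range(4))
             for j in range(n)] for i in range(n)]


a = [[1, 2, 3, 4], [5, 6, 7, 8], [-1, -2, -3, -4], [9, 8, 7, 6]]
b = [[2, 0, 1, 3], [4, 4, 4, 4], [1, -1, 1, -1], [0, 5, 0, 5]]

# Four outer products chained through the accumulator.
chained = None
for ak, bk in zip(a, b):
    chained = outerproduct(ak, bk, acc=chained)

# Packing with the same interleave order as in the example above.
lhs = interleave(interleave(a[0], a[2]), interleave(a[1], a[3]))
rhs = interleave(interleave(b[0], b[2]), interleave(b[1], b[3]))
fused = mopa_4way(lhs, rhs)

print(lhs[:4])           # [1, 5, -1, 9], i.e. a0[0], a1[0], a2[0], a3[0]
print(chained == fused)  # True
```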

Example: I8 to I32

%result = arm_sme.smopa_4way %lhs, %rhs : vector<[16]xi8>, vector<[16]xi8> into vector<[4]x[4]xi32>

Example: I16 to I64

%result = arm_sme.smopa_4way %lhs, %rhs : vector<[8]xi16>, vector<[8]xi16> into vector<[2]x[2]xi64>

Spec	Features
SMOPA (4-way)	+sme (32-bit), +sme-i16i64 (64-bit)

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ArmSMETileOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: ¶

Operand	Description
`lhs`	rank-1 scalable vector of 8-bit signless integer values of length 16 or rank-1 scalable vector of 16-bit signless integer values of length 8
`rhs`	vector of any type values
`lhsMask`	vector of any type values
`rhsMask`	vector of any type values
`acc`	vector of any type values

Results: ¶

Result	Description
`result`	vector<[4]x[4]xi32> of 32-bit signless integer values or vector<[2]x[2]xi64> of 64-bit signless integer values

`arm_sme.smops_2way` (arm_sme::SMops2WayOp) ¶

Signed integer sum of 2 outer products and subtract

Syntax:

operation ::= `arm_sme.smops_2way` $lhs `,` $rhs
              oilist(
              `acc` `` `(` $acc `)`
              | `masks` `` `(` $lhsMask `,` $rhsMask `)`
              ) attr-dict `:` type($lhs) `,` type($rhs) `into` type($result)

Example:

%result = arm_sme.smops_2way %lhs, %rhs : vector<[8]xi16>, vector<[8]xi16> into vector<[4]x[4]xi32>

Refer to fmopa_2way for a detailed description of 2-way outer products.

Spec	Features
SMOPS (2-way)	+sme2

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ArmSMETileOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: ¶

Operand	Description
`lhs`	rank-1 scalable vector of 16-bit signless integer values of length 8
`rhs`	vector of any type values
`lhsMask`	vector of any type values
`rhsMask`	vector of any type values
`acc`	vector of any type values

Results: ¶

Result	Description
`result`	vector<[4]x[4]xi32> of 32-bit signless integer values

`arm_sme.smops_4way` (arm_sme::SMops4WayOp) ¶

Signed integer sum of 4 outer products and subtract

Syntax:

operation ::= `arm_sme.smops_4way` $lhs `,` $rhs
              oilist(
              `acc` `` `(` $acc `)`
              | `masks` `` `(` $lhsMask `,` $rhsMask `)`
              ) attr-dict `:` type($lhs) `,` type($rhs) `into` type($result)

Equivalent to smopa_4way, except that the outer products are subtracted from the destination tile.

Example: I8 to I32

%result = arm_sme.smops_4way %lhs, %rhs : vector<[16]xi8>, vector<[16]xi8> into vector<[4]x[4]xi32>

Example: I16 to I64

%result = arm_sme.smops_4way %lhs, %rhs : vector<[8]xi16>, vector<[8]xi16> into vector<[2]x[2]xi64>

Refer to smopa_4way for a detailed description of 4-way outer products.

Spec	Features
SMOPS (4-way)	+sme (32-bit), +sme-i16i64 (64-bit)

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ArmSMETileOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: ¶

Operand	Description
`lhs`	rank-1 scalable vector of 8-bit signless integer values of length 16 or rank-1 scalable vector of 16-bit signless integer values of length 8
`rhs`	vector of any type values
`lhsMask`	vector of any type values
`rhsMask`	vector of any type values
`acc`	vector of any type values

Results: ¶

Result	Description
`result`	vector<[4]x[4]xi32> of 32-bit signless integer values or vector<[2]x[2]xi64> of 64-bit signless integer values

`arm_sme.store_tile_slice` (arm_sme::StoreTileSliceOp) ¶

Tile slice store operation

Syntax:

operation ::= `arm_sme.store_tile_slice` $tile `,` $tile_slice_index `,` $mask `,` $base `[` $indices `]` (`layout` `` $layout^)?
              attr-dict `:` type($base) `,` type($mask) `,` type($tile)

Stores a 1D tile slice from a 2D SME “virtual tile” into memory. The tile slice is defined by the dimension of the 2D scalable vector type pointed to by the index. A tile slice index describes where in the input tile the tile slice is stored from. An optional tile slice layout attribute specifies whether the tile slice being stored from the given index is horizontal (default) or vertical.

The slice of memory written is defined by a base and indices and must be contiguous. The memref must be either rank 1 or rank 2, have dynamic dimensions since the operation is scalable, and the element type must be a scalar that matches the element type of the input tile.

The provided mask is used to specify which elements of the tile slice will be stored.

Example 1: Store vector<[16]xi8> horizontal (default) tile slice from tile at given index to memory.

arm_sme.store_tile_slice %tile, %tile_slice_index, %mask, %base[%c0] : vector<[16]x[16]xi8>, vector<[16]xi1>, memref<?x?xi8>

Example 2: Store vector<[4]xf32> vertical tile slice from tile at given index to memory.

arm_sme.store_tile_slice %tile, %tile_slice_index, %mask, %base[%c0] layout<vertical> : vector<[4]x[4]xf32>, vector<[4]xi1>, memref<?x?xf32>

Example 3: Store a vector<[1]xi128> vertical tile slice from tile at given index to memory.

arm_sme.store_tile_slice %tile, %tile_slice_index, %mask, %base[%c0] layout<vertical> : vector<[1]x[1]xi128>, vector<[1]xi1>, memref<?x?xi128>

Interfaces: ArmSMETileOpInterface

Attributes: ¶

Attribute	MLIR Type	Description
`layout`	::mlir::arm_sme::TileSliceLayoutAttr	Layout of a tile slice

Operands: ¶

Operand	Description
`tile`	a vector type that fits into a SME tile
`tile_slice_index`	index
`mask`	a vector type that matches the size of a SVE predicate
`base`	memref of any type values
`indices`	variadic of index

`arm_sme.streaming_vl` (arm_sme::StreamingVLOp) ¶

Query the streaming vector length

Syntax:

operation ::= `arm_sme.streaming_vl` $type_size attr-dict

This operation returns the streaming vector length (SVL) for a given type size. Unlike vector.vscale, the value returned is invariant to the streaming mode.

Example:

// Streaming vector length in:
// - bytes (8-bit, SVL.B)
%svl_b = arm_sme.streaming_vl <byte>
// - half words (16-bit, SVL.H)
%svl_h = arm_sme.streaming_vl <half>
// - words (32-bit, SVL.W)
%svl_w = arm_sme.streaming_vl <word>
// - double words (64-bit, SVL.D)
%svl_d = arm_sme.streaming_vl <double>
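For a concrete sense of the values, the Python sketch below computes the four quantities from an SVL given in bits. The dictionary keys reuse the type size names from the example above. The function itself is only an illustrative model, since the real value comes from the hardware at run time.

```python
# Element width in bits for each type size used by arm_sme.streaming_vl.
TYPE_SIZE_BITS = {"byte": 8, "half": 16, "word": 32, "double": 64}


def streaming_vl(type_size, svl_bits):
    """Number of elements of the given size in one streaming vector."""
    return svl_bits // TYPE_SIZE_BITS[type_size]


for size in ("byte", "half", "word", "double"):
    print(size, streaming_vl(size, svl_bits=128))
# byte 16
# half 8
# word 4
# double 2
```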

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes: ¶

Attribute	MLIR Type	Description
`type_size`	::mlir::arm_sme::TypeSizeAttr	Size of a vector element type

Results: ¶

Result	Description
«unnamed»	index

`arm_sme.sumopa_4way` (arm_sme::SuMopa4WayOp) ¶

Signed by unsigned integer sum of 4 outer products and accumulate

Syntax:

operation ::= `arm_sme.sumopa_4way` $lhs `,` $rhs
              oilist(
              `acc` `` `(` $acc `)`
              | `masks` `` `(` $lhsMask `,` $rhsMask `)`
              ) attr-dict `:` type($lhs) `,` type($rhs) `into` type($result)

Example: I8 to I32

%result = arm_sme.sumopa_4way %lhs, %rhs : vector<[16]xi8>, vector<[16]xi8> into vector<[4]x[4]xi32>

Example: I16 to I64

%result = arm_sme.sumopa_4way %lhs, %rhs : vector<[8]xi16>, vector<[8]xi16> into vector<[2]x[2]xi64>

Refer to smopa_4way for a detailed description of 4-way outer products.
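Because MLIR integer types are signless, the mixed-sign variants differ from smopa_4way only in how the operand bits are interpreted. The Python sketch below illustrates this for 8-bit inputs, assuming that lhs is the signed operand and rhs the unsigned one (following the "signed by unsigned" order in the summary). Inputs are raw byte patterns in the range 0..255, and the helper names are hypothetical.

```python
def as_signed8(byte):
    """Reinterpret a raw 8-bit pattern (0..255) as a signed value."""
    return byte - 256 if byte & 0x80 else byte


def sumopa_4way(lhs, rhs):
    """Sum of 4 outer products with signed lhs and unsigned rhs bytes."""
    n = len(lhs) // 4
    return [[sum(as_signed8(lhs[4 * i + k]) * rhs[4 * j + k]
                 for k in range(4))
             for j in range(n)] for i in range(n)]


# 0xFF is -1 as a signed lhs element but 255 as an unsigned rhs element.
print(sumopa_4way([0xFF, 1, 0, 0], [0xFF, 2, 0, 0]))  # [[-253]]
```

With both operands treated as signed, the same bit patterns would give (-1)(-1) + 1*2 = 3 instead.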

Spec	Features
SUMOPA (4-way)	+sme (32-bit), +sme-i16i64 (64-bit)

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ArmSMETileOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: ¶

Operand	Description
`lhs`	rank-1 scalable vector of 8-bit signless integer values of length 16 or rank-1 scalable vector of 16-bit signless integer values of length 8
`rhs`	vector of any type values
`lhsMask`	vector of any type values
`rhsMask`	vector of any type values
`acc`	vector of any type values

Results: ¶

Result	Description
`result`	vector<[4]x[4]xi32> of 32-bit signless integer values or vector<[2]x[2]xi64> of 64-bit signless integer values

`arm_sme.sumops_4way` (arm_sme::SuMops4WayOp) ¶

Signed by unsigned integer sum of 4 outer products and subtract

Syntax:

operation ::= `arm_sme.sumops_4way` $lhs `,` $rhs
              oilist(
              `acc` `` `(` $acc `)`
              | `masks` `` `(` $lhsMask `,` $rhsMask `)`
              ) attr-dict `:` type($lhs) `,` type($rhs) `into` type($result)

Example: I8 to I32

%result = arm_sme.sumops_4way %lhs, %rhs : vector<[16]xi8>, vector<[16]xi8> into vector<[4]x[4]xi32>

Example: I16 to I64

%result = arm_sme.sumops_4way %lhs, %rhs : vector<[8]xi16>, vector<[8]xi16> into vector<[2]x[2]xi64>

Refer to smopa_4way for a detailed description of 4-way outer products.

Spec	Features
SUMOPS (4-way)	+sme (32-bit), +sme-i16i64 (64-bit)

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ArmSMETileOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: ¶

Operand	Description
`lhs`	rank-1 scalable vector of 8-bit signless integer values of length 16 or rank-1 scalable vector of 16-bit signless integer values of length 8
`rhs`	vector of any type values
`lhsMask`	vector of any type values
`rhsMask`	vector of any type values
`acc`	vector of any type values

Results: ¶

Result	Description
`result`	vector<[4]x[4]xi32> of 32-bit signless integer values or vector<[2]x[2]xi64> of 64-bit signless integer values

`arm_sme.tile_load` (arm_sme::TileLoadOp) ¶

Tile load operation

Syntax:

operation ::= `arm_sme.tile_load` $base `[` $indices `]` (`,` $padding `,` $mask^)? (`layout` `` $layout^)? attr-dict `:` type($base) `,` type($result)

Loads a 2D SME “virtual tile” from memory defined by a base and indices, with the shape defined by the 2D scalable vector type of the result tile. An optional tile slice layout attribute specifies whether the slices of the tile being loaded are horizontal (default) or vertical. The slice of memory must be contiguous. The memref must be either rank 1 or rank 2 with dynamic dimensions, since the operation is scalable, and the element type must be a scalar that matches the element type of the result.

An optional SSA value padding, of the same element type as the MemRef, may be provided to specify a fallback value for masked-out elements.

An optional SSA value mask may be specified to mask out elements read from the MemRef. The mask type is an i1 vector with a shape that matches how elements are read from the MemRef. Elements whose corresponding mask element is 0 are masked out and replaced with padding.

If either padding or mask is specified, both must be specified.

Example 1: Load an 8-bit element ZA tile with horizontal layout (default) from memory (ZA0.B).

%tile = arm_sme.tile_load %base[%c0, %c0] : memref<?x?xi8>, vector<[16]x[16]xi8>

Example 2: Load a FP 32-bit element ZA tile with vertical layout from memory.

%tile = arm_sme.tile_load %base[%c0, %c0] layout<vertical> : memref<?x?xf32>, vector<[4]x[4]xf32>

Example 3: Load a 128-bit element ZA tile with horizontal layout (default) from memory.

%tile = arm_sme.tile_load %base[%c0, %c0] layout<horizontal> : memref<?x?xi128>, vector<[1]x[1]xi128>

Example 4: Masked load of a FP 32-bit element ZA tile with horizontal layout (default) from memory.

%tile = arm_sme.tile_load %base[%c0, %c0], %pad, %mask : memref<?x?xf32>, vector<[4]x[4]xf32>
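The interaction of mask and padding described above can be sketched with a small Python model. It assumes `vscale = 1`, 32-bit elements (a 4x4 tile), and horizontal layout; `masked_tile_load` is a hypothetical helper for illustration only.

```python
def masked_tile_load(memory, row, col, mask, padding):
    """Element (i, j) is memory[row + i][col + j] where mask[i][j] is set, else padding."""
    return [
        [memory[row + i][col + j] if bit else padding
         for j, bit in enumerate(mask_row)]
        for i, mask_row in enumerate(mask)
    ]


# A 3x3 buffer loaded into a 4x4 tile: the mask covers only the valid 3x3 region.
memory = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
mask = [[i < 3 and j < 3 for j in range(4)] for i in range(4)]
tile = masked_tile_load(memory, 0, 0, mask, padding=0.0)
```

In this model masked-out positions never touch memory, so a tile larger than the buffer can be loaded without reading out of bounds.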

Traits: AttrSizedOperandSegments

Interfaces: ArmSMETileOpInterface

Attributes: ¶

Attribute	MLIR Type	Description
`layout`	::mlir::arm_sme::TileSliceLayoutAttr	Layout of a tile slice

Operands: ¶

Operand	Description
`base`	2D memref of any type values
`indices`	variadic of index
`padding`	any type
`mask`	vector of any type values

Results: ¶

Result	Description
`result`	a vector type that fits into a SME tile

`arm_sme.tile_store` (arm_sme::TileStoreOp) ¶

Tile store operation

Syntax:

operation ::= `arm_sme.tile_store` $valueToStore `,` $base `[` $indices `]` (`,` $mask^)? (`layout` `` $layout^)? attr-dict `:` type($base) `,` type($valueToStore)

Stores a 2D SME “virtual tile” to memory defined by a base and indices, with the shape defined by the 2D scalable vector type of the tile being stored. An optional tile slice layout attribute specifies whether the slices of the tile being stored are horizontal (default) or vertical. The slice of memory must be contiguous. The memref must be either rank 1 or rank 2 with dynamic dimensions, since the operation is scalable, and the element type must be a scalar that matches the element type of the tile being stored.

An optional mask may be provided, the shape of which corresponds to the tile, and selects which elements of the tile will be stored.

Example 1: Store an 8-bit element ZA tile with horizontal (default) layout to memory (ZA0.B).

arm_sme.tile_store %tile, %base[%c0, %c0] : vector<[16]x[16]xi8>, memref<?x?xi8>

Example 2: Store a FP 32-bit element ZA tile with vertical layout to memory.

arm_sme.tile_store %tile, %base[%c0, %c0] layout<vertical> : vector<[4]x[4]xf32>, memref<?x?xf32>

Example 3: Store a 128-bit element ZA tile with horizontal (default) layout to memory.

arm_sme.tile_store %tile, %base[%c0, %c0] layout<horizontal> : vector<[1]x[1]xi128>, memref<?x?xi128>

Example 4: Masked store of a FP 32-bit element ZA tile with vertical layout to memory.

arm_sme.tile_store %tile, %base[%c0, %c0], %mask layout<vertical> : vector<[4]x[4]xf32>, memref<?x?xf32>
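The effect of the optional mask on a store can be modelled in a few lines of Python. This sketch assumes `vscale = 1`, 32-bit elements, and horizontal layout, and it ignores the layout attribute; `masked_tile_store` is a hypothetical name.

```python
def masked_tile_store(tile, memory, row, col, mask):
    """Write tile[i][j] to memory[row + i][col + j] only where mask[i][j] is set."""
    for i, mask_row in enumerate(mask):
        for j, bit in enumerate(mask_row):
            if bit:
                memory[row + i][col + j] = tile[i][j]


# Store the top-left 2x3 corner of a 4x4 tile into a 2x3 buffer.
tile = [[float(4 * i + j) for j in range(4)] for i in range(4)]
buffer = [[-1.0] * 3 for _ in range(2)]
mask = [[i < 2 and j < 3 for j in range(4)] for i in range(4)]
masked_tile_store(tile, buffer, 0, 0, mask)
```

Elements outside the mask are left untouched in memory, which is what allows a full tile to be written back to a smaller buffer.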

Traits: AttrSizedOperandSegments

Interfaces: ArmSMETileOpInterface

Attributes: ¶

Attribute	MLIR Type	Description
`layout`	::mlir::arm_sme::TileSliceLayoutAttr	Layout of a tile slice

Operands: ¶

Operand	Description
`valueToStore`	a vector type that fits into a SME tile
`base`	2D memref of any type values
`indices`	variadic of index
`mask`	vector of any type values

`arm_sme.umopa_2way` (arm_sme::UMopa2WayOp) ¶

Unsigned integer sum of 2 outer products and accumulate

Syntax:

operation ::= `arm_sme.umopa_2way` $lhs `,` $rhs
              oilist(
              `acc` `` `(` $acc `)`
              | `masks` `` `(` $lhsMask `,` $rhsMask `)`
              ) attr-dict `:` type($lhs) `,` type($rhs) `into` type($result)

Example:

%result = arm_sme.umopa_2way %lhs, %rhs : vector<[8]xi16>, vector<[8]xi16> into vector<[4]x[4]xi32>

Refer to fmopa_2way for a detailed description of 2-way outer products.
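For readers who want the arithmetic spelled out, here is a hedged Python model of the unsigned 2-way case. It assumes `vscale = 1`, no masks, and that adjacent pairs of input elements are combined as in the 2-way layout described for fmopa_2way. The name `umopa_2way_model` is hypothetical.

```python
def umopa_2way_model(lhs, rhs, acc):
    """Reference model: acc[i][j] + sum_k unsigned(lhs[2i+k]) * unsigned(rhs[2j+k])."""
    rows, cols = len(lhs) // 2, len(rhs) // 2
    return [
        [
            (acc[i][j] + sum(
                (lhs[2 * i + k] & 0xFFFF) * (rhs[2 * j + k] & 0xFFFF)
                for k in range(2)
            )) & 0xFFFFFFFF  # wrap to 32 bits; shown as an unsigned bit pattern
            for j in range(cols)
        ]
        for i in range(rows)
    ]
```

Because the element types are signless, the "unsigned" interpretation comes from the operation itself rather than from the type.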

Spec	Features
UMOPA (2-way)	+sme2

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ArmSMETileOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: ¶

Operand	Description
`lhs`	rank-1 scalable vector of 16-bit signless integer values of length 8
`rhs`	vector of any type values
`lhsMask`	vector of any type values
`rhsMask`	vector of any type values
`acc`	vector of any type values

Results: ¶

Result	Description
`result`	vector<[4]x[4]xi32> of 32-bit signless integer values

`arm_sme.umopa_4way` (arm_sme::UMopa4WayOp) ¶

Unsigned integer sum of 4 outer products and accumulate

Syntax:

operation ::= `arm_sme.umopa_4way` $lhs `,` $rhs
              oilist(
              `acc` `` `(` $acc `)`
              | `masks` `` `(` $lhsMask `,` $rhsMask `)`
              ) attr-dict `:` type($lhs) `,` type($rhs) `into` type($result)

Example: I8 to I32

%result = arm_sme.umopa_4way %lhs, %rhs : vector<[16]xi8>, vector<[16]xi8> into vector<[4]x[4]xi32>

Example: I16 to I64

%result = arm_sme.umopa_4way %lhs, %rhs : vector<[8]xi16>, vector<[8]xi16> into vector<[2]x[2]xi64>

Refer to smopa_4way for a detailed description of 4-way outer products.
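The input and result types in the two examples above follow a simple pattern: the result element is `ways` times wider than the input element, and lane counts are per 128 bits of vector. A minimal Python sketch of that relationship, using a hypothetical helper name:

```python
def widened_outer_product_types(input_bits, ways):
    """Return (input vector type, result tile type) for an N-way integer outer product."""
    result_bits = input_bits * ways        # each result element widens by `ways`
    input_lanes = 128 // input_bits        # input elements per 128 bits of vector
    tile_lanes = 128 // result_bits        # tile rows/columns per 128 bits
    return (
        f"vector<[{input_lanes}]xi{input_bits}>",
        f"vector<[{tile_lanes}]x[{tile_lanes}]xi{result_bits}>",
    )
```

For example, `widened_outer_product_types(8, 4)` yields the I8 to I32 pair and `widened_outer_product_types(16, 4)` yields the I16 to I64 pair.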

Spec	Features
UMOPA (4-way)	+sme (32-bit), +sme-i16i64 (64-bit)

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ArmSMETileOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: ¶

Operand	Description
`lhs`	rank-1 scalable vector of 8-bit signless integer values of length 16, or rank-1 scalable vector of 16-bit signless integer values of length 8
`rhs`	vector of any type values
`lhsMask`	vector of any type values
`rhsMask`	vector of any type values
`acc`	vector of any type values

Results: ¶

Result	Description
`result`	vector<[4]x[4]xi32> of 32-bit signless integer values or vector<[2]x[2]xi64> of 64-bit signless integer values

`arm_sme.umops_2way` (arm_sme::UMops2WayOp) ¶

Unsigned integer sum of 2 outer products and subtract

Syntax:

operation ::= `arm_sme.umops_2way` $lhs `,` $rhs
              oilist(
              `acc` `` `(` $acc `)`
              | `masks` `` `(` $lhsMask `,` $rhsMask `)`
              ) attr-dict `:` type($lhs) `,` type($rhs) `into` type($result)

Example:

%result = arm_sme.umops_2way %lhs, %rhs : vector<[8]xi16>, vector<[8]xi16> into vector<[4]x[4]xi32>

Refer to fmopa_2way for a detailed description of 2-way outer products.

Spec	Features
UMOPS (2-way)	+sme2

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ArmSMETileOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: ¶

Operand	Description
`lhs`	rank-1 scalable vector of 16-bit signless integer values of length 8
`rhs`	vector of any type values
`lhsMask`	vector of any type values
`rhsMask`	vector of any type values
`acc`	vector of any type values

Results: ¶

Result	Description
`result`	vector<[4]x[4]xi32> of 32-bit signless integer values

`arm_sme.umops_4way` (arm_sme::UMops4WayOp) ¶

Unsigned integer sum of 4 outer products and subtract

Syntax:

operation ::= `arm_sme.umops_4way` $lhs `,` $rhs
              oilist(
              `acc` `` `(` $acc `)`
              | `masks` `` `(` $lhsMask `,` $rhsMask `)`
              ) attr-dict `:` type($lhs) `,` type($rhs) `into` type($result)

Example: I8 to I32

%result = arm_sme.umops_4way %lhs, %rhs : vector<[16]xi8>, vector<[16]xi8> into vector<[4]x[4]xi32>

Example: I16 to I64

%result = arm_sme.umops_4way %lhs, %rhs : vector<[8]xi16>, vector<[8]xi16> into vector<[2]x[2]xi64>

Refer to smopa_4way for a detailed description of 4-way outer products.

Spec	Features
UMOPS (4-way)	+sme (32-bit), +sme-i16i64 (64-bit)

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ArmSMETileOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: ¶

Operand	Description
`lhs`	rank-1 scalable vector of 8-bit signless integer values of length 16, or rank-1 scalable vector of 16-bit signless integer values of length 8
`rhs`	vector of any type values
`lhsMask`	vector of any type values
`rhsMask`	vector of any type values
`acc`	vector of any type values

Results: ¶

Result	Description
`result`	vector<[4]x[4]xi32> of 32-bit signless integer values or vector<[2]x[2]xi64> of 64-bit signless integer values

`arm_sme.usmopa_4way` (arm_sme::UsMopa4WayOp) ¶

Unsigned by signed integer sum of 4 outer products and accumulate

Syntax:

operation ::= `arm_sme.usmopa_4way` $lhs `,` $rhs
              oilist(
              `acc` `` `(` $acc `)`
              | `masks` `` `(` $lhsMask `,` $rhsMask `)`
              ) attr-dict `:` type($lhs) `,` type($rhs) `into` type($result)

Example: I8 to I32

%result = arm_sme.usmopa_4way %lhs, %rhs : vector<[16]xi8>, vector<[16]xi8> into vector<[4]x[4]xi32>

Example: I16 to I64

%result = arm_sme.usmopa_4way %lhs, %rhs : vector<[8]xi16>, vector<[8]xi16> into vector<[2]x[2]xi64>

Refer to smopa_4way for a detailed description of 4-way outer products.
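Since the operand types are signless, the only difference between the signed, unsigned, and mixed-sign variants is how each input bit pattern is interpreted before multiplication. The following Python sketch illustrates this for a single pair of 8-bit elements; `widened_product` is a hypothetical helper, and the mapping of "unsigned by signed" to an unsigned `lhs` and a signed `rhs` is read from the operation summary.

```python
def widened_product(lhs_byte, rhs_byte, lhs_signed, rhs_signed):
    """Multiply two 8-bit patterns under the given signedness interpretations."""
    def interpret(byte, signed):
        byte &= 0xFF
        return byte - 0x100 if signed and byte & 0x80 else byte
    return interpret(lhs_byte, lhs_signed) * interpret(rhs_byte, rhs_signed)


# The same bit patterns (0xFF, 0x80) under each interpretation.
products = {
    "signed x signed": widened_product(0xFF, 0x80, True, True),
    "unsigned x unsigned": widened_product(0xFF, 0x80, False, False),
    "unsigned x signed": widened_product(0xFF, 0x80, False, True),
    "signed x unsigned": widened_product(0xFF, 0x80, True, False),
}
```

All four interpretations give a different product for these inputs, so choosing the wrong variant silently changes the result.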

Spec	Features
USMOPA (4-way)	+sme (32-bit), +sme-i16i64 (64-bit)

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ArmSMETileOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: ¶

Operand	Description
`lhs`	rank-1 scalable vector of 8-bit signless integer values of length 16, or rank-1 scalable vector of 16-bit signless integer values of length 8
`rhs`	vector of any type values
`lhsMask`	vector of any type values
`rhsMask`	vector of any type values
`acc`	vector of any type values

Results: ¶

Result	Description
`result`	vector<[4]x[4]xi32> of 32-bit signless integer values or vector<[2]x[2]xi64> of 64-bit signless integer values

`arm_sme.usmops_4way` (arm_sme::UsMops4WayOp) ¶

Unsigned by signed integer sum of 4 outer products and subtract

Syntax:

operation ::= `arm_sme.usmops_4way` $lhs `,` $rhs
              oilist(
              `acc` `` `(` $acc `)`
              | `masks` `` `(` $lhsMask `,` $rhsMask `)`
              ) attr-dict `:` type($lhs) `,` type($rhs) `into` type($result)

Example: I8 to I32

%result = arm_sme.usmops_4way %lhs, %rhs : vector<[16]xi8>, vector<[16]xi8> into vector<[4]x[4]xi32>

Example: I16 to I64

%result = arm_sme.usmops_4way %lhs, %rhs : vector<[8]xi16>, vector<[8]xi16> into vector<[2]x[2]xi64>

Refer to smopa_4way for a detailed description of 4-way outer products.

Spec	Features
USMOPS (4-way)	+sme (32-bit), +sme-i16i64 (64-bit)

Traits: AlwaysSpeculatableImplTrait, AttrSizedOperandSegments

Interfaces: ArmSMETileOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: ¶

Operand	Description
`lhs`	rank-1 scalable vector of 8-bit signless integer values of length 16, or rank-1 scalable vector of 16-bit signless integer values of length 8
`rhs`	vector of any type values
`lhsMask`	vector of any type values
`rhsMask`	vector of any type values
`acc`	vector of any type values

Results: ¶

Result	Description
`result`	vector<[4]x[4]xi32> of 32-bit signless integer values or vector<[2]x[2]xi64> of 64-bit signless integer values

`arm_sme.zero` (arm_sme::ZeroOp) ¶

Creates a zero-initialized value of SME virtual tile type

Syntax:

operation ::= `arm_sme.zero` attr-dict `:` type($res)

Creates a new SME “virtual tile” value within a function. The contents of the tile returned from this operation are zero-initialized.

Example 1: Zero an 8-bit element ZA tile.

%0 = arm_sme.zero : vector<[16]x[16]xi8>

Example 2: Zero a 64-bit element ZA tile.

%0 = arm_sme.zero : vector<[2]x[2]xi64>
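The scalable dimensions in these types only become concrete once `vscale` is known. As a small illustration, the Python sketch below builds the zero tile for a given element width, assuming `vscale` equals the streaming vector length divided by 128 bits; `zero_tile` is a hypothetical helper.

```python
def zero_tile(element_bits, vscale):
    """Build a zero-filled square tile for `element_bits`-wide elements at a given vscale."""
    dim = (128 // element_bits) * vscale   # e.g. [16] lanes of i8 per unit of vscale
    return [[0] * dim for _ in range(dim)]
```

With `vscale = 1`, `zero_tile(8, 1)` is 16x16 and `zero_tile(64, 1)` is 2x2, matching the two examples above.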

Traits: AlwaysSpeculatableImplTrait

Interfaces: ArmSMETileOpInterface, ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results: ¶

Result	Description
`res`	a vector type that fits into a SME tile

Operations for LLVM IR Intrinsics ¶

source

`arm_sme.intr.cntsd` (arm_sme::aarch64_sme_cntsd) ¶

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type
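This operation has no description here; the underlying intrinsic counts the 64-bit doublewords in one streaming vector. A minimal Python sketch of that relationship, with hypothetical function names and the streaming vector length (SVL) passed in explicitly:

```python
def cntsd(svl_bits):
    """Number of 64-bit doublewords in one streaming vector of `svl_bits` bits."""
    if svl_bits not in (128, 256, 512, 1024, 2048):
        raise ValueError("SVL must be a power of two between 128 and 2048 bits")
    return svl_bits // 64


def streaming_vl_in_bytes(svl_bits):
    """Streaming vector length in bytes, derived from the doubleword count."""
    return cntsd(svl_bits) * 8
```

Counts for other element sizes follow by multiplying the doubleword count, for example by 8 for bytes or by 2 for 32-bit words.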

`arm_sme.intr.ld1b.horiz` (arm_sme::aarch64_sme_ld1b_horiz) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`predicate`	a vector type that matches the size of a SVE predicate
`load_address`	LLVM pointer type
`tile_slice_index`	32-bit signless integer
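The valid range of `tile_id` depends on the element size encoded in the intrinsic name, because the ZA array holds one 8-bit tile, two 16-bit tiles, four 32-bit tiles, eight 64-bit tiles, and sixteen 128-bit tiles. A short Python sketch of that rule, using hypothetical names:

```python
# Element width in bits for each load/store intrinsic suffix.
ELEMENT_BITS = {"b": 8, "h": 16, "w": 32, "d": 64, "q": 128}


def valid_tile_ids(suffix):
    """Tile ids addressable by an ld1<suffix>/st1<suffix> intrinsic."""
    return range(ELEMENT_BITS[suffix] // 8)
```

So `ld1b` can only address tile 0, while `ld1w` can address tiles 0 to 3.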

`arm_sme.intr.ld1b.vert` (arm_sme::aarch64_sme_ld1b_vert) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`predicate`	a vector type that matches the size of a SVE predicate
`load_address`	LLVM pointer type
`tile_slice_index`	32-bit signless integer

`arm_sme.intr.ld1d.horiz` (arm_sme::aarch64_sme_ld1d_horiz) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`predicate`	a vector type that matches the size of a SVE predicate
`load_address`	LLVM pointer type
`tile_slice_index`	32-bit signless integer

`arm_sme.intr.ld1d.vert` (arm_sme::aarch64_sme_ld1d_vert) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`predicate`	a vector type that matches the size of a SVE predicate
`load_address`	LLVM pointer type
`tile_slice_index`	32-bit signless integer

`arm_sme.intr.ld1h.horiz` (arm_sme::aarch64_sme_ld1h_horiz) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`predicate`	a vector type that matches the size of a SVE predicate
`load_address`	LLVM pointer type
`tile_slice_index`	32-bit signless integer

`arm_sme.intr.ld1h.vert` (arm_sme::aarch64_sme_ld1h_vert) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`predicate`	a vector type that matches the size of a SVE predicate
`load_address`	LLVM pointer type
`tile_slice_index`	32-bit signless integer

`arm_sme.intr.ld1q.horiz` (arm_sme::aarch64_sme_ld1q_horiz) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`predicate`	a vector type that matches the size of a SVE predicate
`load_address`	LLVM pointer type
`tile_slice_index`	32-bit signless integer

`arm_sme.intr.ld1q.vert` (arm_sme::aarch64_sme_ld1q_vert) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`predicate`	a vector type that matches the size of a SVE predicate
`load_address`	LLVM pointer type
`tile_slice_index`	32-bit signless integer

`arm_sme.intr.ld1w.horiz` (arm_sme::aarch64_sme_ld1w_horiz) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`predicate`	a vector type that matches the size of a SVE predicate
`load_address`	LLVM pointer type
`tile_slice_index`	32-bit signless integer

`arm_sme.intr.ld1w.vert` (arm_sme::aarch64_sme_ld1w_vert) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`predicate`	a vector type that matches the size of a SVE predicate
`load_address`	LLVM pointer type
`tile_slice_index`	32-bit signless integer

`arm_sme.intr.mopa` (arm_sme::aarch64_sme_mopa) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`lhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`rhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`lhs_vector`	a vector type that is a supported input for the SME MOP instructions
`rhs_vector`	a vector type that is a supported input for the SME MOP instructions

`arm_sme.intr.mopa.wide` (arm_sme::aarch64_sme_mopa_wide) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`lhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`rhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`lhs_vector`	a vector type that is a supported input for the SME MOP instructions
`rhs_vector`	a vector type that is a supported input for the SME MOP instructions

`arm_sme.intr.mops` (arm_sme::aarch64_sme_mops) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`lhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`rhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`lhs_vector`	a vector type that is a supported input for the SME MOP instructions
`rhs_vector`	a vector type that is a supported input for the SME MOP instructions

`arm_sme.intr.mops.wide` (arm_sme::aarch64_sme_mops_wide) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`lhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`rhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`lhs_vector`	a vector type that is a supported input for the SME MOP instructions
`rhs_vector`	a vector type that is a supported input for the SME MOP instructions

`arm_sme.intr.read.horiz` (arm_sme::aarch64_sme_read_horiz) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`vector`	a vector type that matches the size of a SVE vector
`predicate`	a vector type that matches the size of a SVE predicate
`tile_slice_index`	32-bit signless integer

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`arm_sme.intr.read.vert` (arm_sme::aarch64_sme_read_vert) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`vector`	a vector type that matches the size of a SVE vector
`predicate`	a vector type that matches the size of a SVE predicate
`tile_slice_index`	32-bit signless integer

Results: ¶

Result	Description
`res`	LLVM dialect-compatible type

`arm_sme.intr.smopa.wide` (arm_sme::aarch64_sme_smopa_wide) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`lhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`rhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`lhs_vector`	a vector type that is a supported input for the SME MOP instructions
`rhs_vector`	a vector type that is a supported input for the SME MOP instructions

`arm_sme.intr.smopa.za32` (arm_sme::aarch64_sme_smopa_za32) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`lhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`rhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`lhs_vector`	a vector type that is a supported input for the SME MOP instructions
`rhs_vector`	a vector type that is a supported input for the SME MOP instructions

`arm_sme.intr.smops.wide` (arm_sme::aarch64_sme_smops_wide) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`lhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`rhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`lhs_vector`	a vector type that is a supported input for the SME MOP instructions
`rhs_vector`	a vector type that is a supported input for the SME MOP instructions

`arm_sme.intr.smops.za32` (arm_sme::aarch64_sme_smops_za32) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`lhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`rhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`lhs_vector`	a vector type that is a supported input for the SME MOP instructions
`rhs_vector`	a vector type that is a supported input for the SME MOP instructions

`arm_sme.intr.st1b.horiz` (arm_sme::aarch64_sme_st1b_horiz) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`predicate`	a vector type that matches the size of a SVE predicate
`store_address`	LLVM pointer type
`tile_slice_index`	32-bit signless integer

`arm_sme.intr.st1b.vert` (arm_sme::aarch64_sme_st1b_vert) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`predicate`	a vector type that matches the size of a SVE predicate
`store_address`	LLVM pointer type
`tile_slice_index`	32-bit signless integer

`arm_sme.intr.st1d.horiz` (arm_sme::aarch64_sme_st1d_horiz) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`predicate`	a vector type that matches the size of a SVE predicate
`store_address`	LLVM pointer type
`tile_slice_index`	32-bit signless integer

`arm_sme.intr.st1d.vert` (arm_sme::aarch64_sme_st1d_vert) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`predicate`	a vector type that matches the size of a SVE predicate
`store_address`	LLVM pointer type
`tile_slice_index`	32-bit signless integer

`arm_sme.intr.st1h.horiz` (arm_sme::aarch64_sme_st1h_horiz) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`predicate`	a vector type that matches the size of a SVE predicate
`store_address`	LLVM pointer type
`tile_slice_index`	32-bit signless integer

`arm_sme.intr.st1h.vert` (arm_sme::aarch64_sme_st1h_vert) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`predicate`	a vector type that matches the size of a SVE predicate
`store_address`	LLVM pointer type
`tile_slice_index`	32-bit signless integer

`arm_sme.intr.st1q.horiz` (arm_sme::aarch64_sme_st1q_horiz) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`predicate`	a vector type that matches the size of a SVE predicate
`store_address`	LLVM pointer type
`tile_slice_index`	32-bit signless integer

`arm_sme.intr.st1q.vert` (arm_sme::aarch64_sme_st1q_vert) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`predicate`	a vector type that matches the size of a SVE predicate
`store_address`	LLVM pointer type
`tile_slice_index`	32-bit signless integer

`arm_sme.intr.st1w.horiz` (arm_sme::aarch64_sme_st1w_horiz) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`predicate`	a vector type that matches the size of a SVE predicate
`store_address`	LLVM pointer type
`tile_slice_index`	32-bit signless integer

`arm_sme.intr.st1w.vert` (arm_sme::aarch64_sme_st1w_vert) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`predicate`	a vector type that matches the size of a SVE predicate
`store_address`	LLVM pointer type
`tile_slice_index`	32-bit signless integer

`arm_sme.intr.str` (arm_sme::aarch64_sme_str) ¶

Operands: ¶

Operand	Description
`index`	32-bit signless integer
`store_address`	LLVM pointer type
`offset`	32-bit signless integer

`arm_sme.intr.sumopa.wide` (arm_sme::aarch64_sme_sumopa_wide) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`lhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`rhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`lhs_vector`	a vector type that is a supported input for the SME MOP instructions
`rhs_vector`	a vector type that is a supported input for the SME MOP instructions

`arm_sme.intr.sumops.wide` (arm_sme::aarch64_sme_sumops_wide) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`lhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`rhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`lhs_vector`	a vector type that is a supported input for the SME MOP instructions
`rhs_vector`	a vector type that is a supported input for the SME MOP instructions

`arm_sme.intr.umopa.wide` (arm_sme::aarch64_sme_umopa_wide) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`lhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`rhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`lhs_vector`	a vector type that is a supported input for the SME MOP instructions
`rhs_vector`	a vector type that is a supported input for the SME MOP instructions

`arm_sme.intr.umopa.za32` (arm_sme::aarch64_sme_umopa_za32) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`lhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`rhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`lhs_vector`	a vector type that is a supported input for the SME MOP instructions
`rhs_vector`	a vector type that is a supported input for the SME MOP instructions

`arm_sme.intr.umops.wide` (arm_sme::aarch64_sme_umops_wide) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`lhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`rhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`lhs_vector`	a vector type that is a supported input for the SME MOP instructions
`rhs_vector`	a vector type that is a supported input for the SME MOP instructions

`arm_sme.intr.umops.za32` (arm_sme::aarch64_sme_umops_za32) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`lhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`rhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`lhs_vector`	a vector type that is a supported input for the SME MOP instructions
`rhs_vector`	a vector type that is a supported input for the SME MOP instructions

`arm_sme.intr.usmopa.wide` (arm_sme::aarch64_sme_usmopa_wide) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`lhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`rhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`lhs_vector`	a vector type that is a supported input for the SME MOP instructions
`rhs_vector`	a vector type that is a supported input for the SME MOP instructions

`arm_sme.intr.usmops.wide` (arm_sme::aarch64_sme_usmops_wide) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`lhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`rhs_predicate`	a vector type that is a supported predicate for the SME MOP instructions
`lhs_vector`	a vector type that is a supported input for the SME MOP instructions
`rhs_vector`	a vector type that is a supported input for the SME MOP instructions

`arm_sme.intr.write.horiz` (arm_sme::aarch64_sme_write_horiz) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`tile_slice_index`	32-bit signless integer
`predicate`	a vector type that matches the size of a SVE predicate
`vector`	a vector type that matches the size of a SVE vector

`arm_sme.intr.write.vert` (arm_sme::aarch64_sme_write_vert) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_id`	::mlir::IntegerAttr	32-bit signless integer attribute

Operands: ¶

Operand	Description
`tile_slice_index`	32-bit signless integer
`predicate`	a vector type that matches the size of a SVE predicate
`vector`	a vector type that matches the size of a SVE vector

`arm_sme.intr.zero` (arm_sme::aarch64_sme_zero) ¶

Attributes: ¶

Attribute	MLIR Type	Description
`tile_mask`	::mlir::IntegerAttr	32-bit signless integer attribute
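The `tile_mask` attribute is not described here. In the Arm ZERO instruction, the mask has one bit per 64-bit tile (ZA0.D to ZA7.D), and wider-element tiles overlap several of them. The Python sketch below derives a mask for a tile of a given element width under that encoding; the names are hypothetical, and the dialect's own lowering may compute the mask differently.

```python
# Base mask for tile 0 of each element width: one bit per 64-bit tile ZA0.D..ZA7.D.
BASE_MASK = {8: 0b11111111, 16: 0b01010101, 32: 0b00010001, 64: 0b00000001}


def zero_tile_mask(element_bits, tile_id):
    """Mask selecting the 64-bit tiles that overlap tile `tile_id` of the given width."""
    num_tiles = element_bits // 8          # 1, 2, 4 or 8 tiles of this width
    if not 0 <= tile_id < num_tiles:
        raise ValueError("tile_id out of range for this element width")
    return BASE_MASK[element_bits] << tile_id
```

For instance, the 32-bit tile ZA3.S overlaps ZA3.D and ZA7.D, so its mask has bits 3 and 7 set.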
