MLIR

Multi-Level IR Compiler Framework

'arm_neon' Dialect

Operations 

arm_neon.2d.sdot (arm_neon::Sdot2dOp) 

Sdot op

Syntax:

operation ::= `arm_neon.2d.sdot` $a `,` $b `,` $c attr-dict `:` type($b) `,` type($c) `to` type($res)

The two input vectors b and c have a 2D shape, consisting of either 2 or 4 rows, each row having length 4. This operation computes the pair-wise dot-products of the rows of b and c and accumulates them with the corresponding entry of a:

res[i] := a[i] + dot_product(b[i, ...], c[i, ...])
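The formula above can be modelled in a few lines of plain Python. This is a minimal sketch rather than the dialect's implementation: the function name `sdot_2d` and the sample values are hypothetical, Python integers are unbounded so 32-bit wraparound is not modelled, and `b` and `c` are passed as lists of rows.

```python
def sdot_2d(a, b, c):
    """Reference model: res[i] = a[i] + dot_product(b[i, ...], c[i, ...])."""
    # b and c hold 2 or 4 rows of length 4; a holds one accumulator per row.
    assert len(a) == len(b) == len(c) and len(a) in (2, 4)
    assert all(len(row) == 4 for row in b + c)
    return [acc + sum(x * y for x, y in zip(row_b, row_c))
            for acc, row_b, row_c in zip(a, b, c)]


a = [10, 20]
b = [[1, 2, 3, 4], [-1, -2, -3, -4]]
c = [[1, 1, 1, 1], [2, 2, 2, 2]]
res = sdot_2d(a, b, c)
print(res)  # [20, 0]
```

Row 0 contributes 1 + 2 + 3 + 4 = 10 to the accumulator 10, and row 1 contributes -20 to the accumulator 20.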

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: 

| Operand | Description |
| :-----: | ----------- |
| `a` | vector of 32-bit signless integer values of length 4/2 |
| `b` | vector of 8-bit signless integer values of length 16/8 |
| `c` | vector of 8-bit signless integer values of length 16/8 |

Results: 

| Result | Description |
| :----: | ----------- |
| `res` | vector of 32-bit signless integer values of length 4/2 |

arm_neon.intr.bfmmla (arm_neon::BfmmlaOp) 

BFloat16 matrix multiply-accumulate to single-precision

Syntax:

operation ::= `arm_neon.intr.bfmmla` $acc `,` $src1 `,` $src2 attr-dict `:` type($src1) `to` type($res)

BFMMLA: BFloat16 matrix multiply-accumulate to single-precision.

The operation multiplies the 2x4 BFloat16 matrix in the first source vector with the 4x2 BFloat16 matrix in the second source vector, then accumulates this intermediate result with the 2x2 Float32 matrix in the accumulator vector, yielding the final 2x2 Float32 result.

Source: https://developer.arm.com/architectures/instruction-sets/intrinsics/vbfmmlaq_f32
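As a rough illustration of the 2x4 by 4x2 product, the sketch below models the operation on Python floats. It rests on assumptions not stated in this page: that `src1` holds the 2x4 matrix row by row, that `src2` holds each column of the 4x2 matrix in four consecutive lanes, and that `acc` and `res` are row-major 2x2 matrices. The name `bfmmla` is hypothetical here, and BFloat16 precision and the instruction's rounding behaviour are not modelled.

```python
def bfmmla(acc, src1, src2):
    """Reference model of a 2x4 * 4x2 multiply accumulated into a 2x2 matrix."""
    assert len(acc) == 4 and len(src1) == 8 and len(src2) == 8
    res = list(acc)
    for i in range(2):        # row of the 2x4 matrix in src1
        for j in range(2):    # column of the 4x2 matrix in src2 (assumed layout)
            res[2 * i + j] += sum(src1[4 * i + k] * src2[4 * j + k]
                                  for k in range(4))
    return res


acc = [1.0, 0.0, 0.0, 1.0]
src1 = [1.0, 2.0, 3.0, 4.0, 0.5, 0.5, 0.5, 0.5]
src2 = [1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
bf_res = bfmmla(acc, src1, src2)
print(bf_res)  # [2.0, 10.0, 0.5, 3.0]
```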

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: 

| Operand | Description |
| :-----: | ----------- |
| `acc` | a vector with length 4 of 32-bit float values |
| `src1` | a vector with length 8 of bfloat16 type values |
| `src2` | a vector with length 8 of bfloat16 type values |

Results: 

| Result | Description |
| :----: | ----------- |
| `res` | a vector with length 4 of 32-bit float values |

arm_neon.intr.sdot (arm_neon::SdotOp) 

Sdot op

Syntax:

operation ::= `arm_neon.intr.sdot` $a `,` $b `,` $c attr-dict `:` type($b) `,` type($c) `to` type($res)

Signed integer addition of dot product (vector). This instruction performs the following operation on signed integer vectors: res = dot(b, c) + a, where vector operands are partitioned into groups of four elements.

Source: https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics
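To make the "groups of four elements" partitioning concrete, here is a small Python model operating on flat lists. It is a sketch under assumptions: the name `sdot` and the sample data are illustrative, group `i` is taken to be lanes `4*i` to `4*i + 3` of `b` and `c`, and 32-bit overflow is not modelled.

```python
def sdot(a, b, c):
    """Reference model: a[i] plus the dot product of the i-th group of four lanes."""
    assert len(b) == len(c) == 4 * len(a)
    return [acc + sum(b[4 * i + k] * c[4 * i + k] for k in range(4))
            for i, acc in enumerate(a)]


a = [0, 100]
b = [1, 2, 3, 4, 5, 6, 7, 8]
c = [1, 1, 1, 1, -1, -1, -1, -1]
sdot_res = sdot(a, b, c)
print(sdot_res)  # [10, 74]
```

With eight 8-bit lanes the result has two 32-bit lanes; sixteen lanes would give four.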

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: 

| Operand | Description |
| :-----: | ----------- |
| `a` | vector of 32-bit signless integer values of length 4/2 |
| `b` | vector of 8-bit signless integer values of length 16/8 |
| `c` | vector of 8-bit signless integer values of length 16/8 |

Results: 

| Result | Description |
| :----: | ----------- |
| `res` | vector of 32-bit signless integer values of length 4/2 |

arm_neon.intr.smmla (arm_neon::SmmlaOp) 

Matrix-matrix multiply and accumulate op

Syntax:

operation ::= `arm_neon.intr.smmla` $acc `,` $src1 `,` $src2 attr-dict `:` type($src1) `to` type($res)

SMMLA: Signed integer matrix multiply-accumulate.

Signed 8-bit integer matrix multiply-accumulate. This instruction multiplies the 2x8 matrix of signed 8-bit integer values in the first source vector by the 8x2 matrix of signed 8-bit integer values in the second source vector. The resulting 2x2 32-bit integer matrix product is destructively added to the 32-bit integer matrix accumulator in the destination vector. This is equivalent to performing an 8-way dot product per destination element.

Source: https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:@navigationhierarchiessimdisa=[Neon]&q=smmla
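The statement that the operation amounts to an 8-way dot product per destination element can be sketched in Python as follows. The lane layout is an assumption: `src1` is read as two rows of eight values, `src2` as two columns of eight values stored consecutively, and `acc` as a row-major 2x2 matrix. The name `smmla` is used only for illustration, inputs are assumed to already be in the signed 8-bit range, and 32-bit overflow is ignored.

```python
def smmla(acc, src1, src2):
    """Reference model: each result lane is acc plus one 8-way signed dot product."""
    assert len(acc) == 4 and len(src1) == 16 and len(src2) == 16
    assert all(-128 <= v <= 127 for v in src1 + src2)
    res = list(acc)
    for i in range(2):        # row of the 2x8 matrix
        for j in range(2):    # column of the 8x2 matrix (assumed layout)
            res[2 * i + j] += sum(src1[8 * i + k] * src2[8 * j + k]
                                  for k in range(8))
    return res


acc = [1, 2, 3, 4]
src1 = [1] * 8 + [2] * 8
src2 = list(range(8)) + [-1] * 8
smmla_res = smmla(acc, src1, src2)
print(smmla_res)  # [29, -6, 59, -12]
```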

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: 

| Operand | Description |
| :-----: | ----------- |
| `acc` | a vector with length 4 of 32-bit signless integer values |
| `src1` | a vector with length 16 of 8-bit signless integer values |
| `src2` | a vector with length 16 of 8-bit signless integer values |

Results: 

| Result | Description |
| :----: | ----------- |
| `res` | a vector with length 4 of 32-bit signless integer values |

arm_neon.intr.smull (arm_neon::SMullOp) 

Signed multiply long (smull) op

Syntax:

operation ::= `arm_neon.intr.smull` $a `,` $b attr-dict `:` type($a) `to` type($res)

Signed Multiply Long (vector). This instruction multiplies corresponding signed integer values in the lower or upper half of the vectors of the two source SIMD&FP registers, places the results in a vector, and writes the vector to the destination SIMD&FP register.

Source: https://developer.arm.com/architectures/instruction-sets/simd-isas/neon/intrinsics
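The point of the "long" multiply is that each product is kept at twice the source element width, so it cannot overflow. The Python sketch below illustrates this for 8-bit inputs; the helper name `smull` and its `bits` parameter are hypothetical and exist only for this example.

```python
def smull(a, b, bits=8):
    """Reference model: elementwise signed product, widened to 2 * bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    assert len(a) == len(b)
    assert all(lo <= v <= hi for v in a + b)
    res = [x * y for x, y in zip(a, b)]
    # Every product of two signed `bits`-wide values fits in 2 * bits.
    wide_lo, wide_hi = -(1 << (2 * bits - 1)), (1 << (2 * bits - 1)) - 1
    assert all(wide_lo <= v <= wide_hi for v in res)
    return res


a = [-128, 127, 3, -4, 0, 1, 2, 5]
b = [-128, 127, -3, 4, 9, 1, 2, 5]
smull_res = smull(a, b)
print(smull_res)  # [16384, 16129, -9, -16, 0, 1, 4, 25]
```

The first lane, (-128) * (-128) = 16384, does not fit in 8 bits but fits in the 16-bit result element.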

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: 

| Operand | Description |
| :-----: | ----------- |
| `a` | vector of 8-bit signless integer or 16-bit signless integer or 32-bit signless integer values of length 8/4/2 |
| `b` | vector of 8-bit signless integer or 16-bit signless integer or 32-bit signless integer values of length 8/4/2 |

Results: 

| Result | Description |
| :----: | ----------- |
| `res` | vector of 16-bit signless integer or 32-bit signless integer or 64-bit signless integer values of length 8/4/2 |

arm_neon.intr.ummla (arm_neon::UmmlaOp) 

Unsigned matrix-matrix multiply and accumulate op

Syntax:

operation ::= `arm_neon.intr.ummla` $acc `,` $src1 `,` $src2 attr-dict `:` type($src1) `to` type($res)

UMMLA: Unsigned integer matrix multiply-accumulate.

Unsigned 8-bit integer matrix multiply-accumulate. This instruction multiplies the 2x8 matrix of unsigned 8-bit integer values in the first source vector by the 8x2 matrix of unsigned 8-bit integer values in the second source vector. The resulting 2x2 32-bit integer matrix product is destructively added to the 32-bit integer matrix accumulator in the destination vector. This is equivalent to performing an 8-way dot product per destination element.

Source: https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:@navigationhierarchiessimdisa=[Neon]&q=ummla
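Because the operand types are signless, it is the operation that decides how each byte is read. The Python sketch below models the unsigned reading by masking every input to the range 0 to 255 before multiplying. The lane layout (two rows of eight in `src1`, two columns of eight stored consecutively in `src2`, row-major 2x2 `acc`) is an assumption, as is the name `ummla`; 32-bit overflow is not modelled.

```python
def ummla(acc, src1, src2):
    """Reference model: 2x8 by 8x2 product with every byte read as unsigned."""
    assert len(acc) == 4 and len(src1) == 16 and len(src2) == 16
    u1 = [v & 0xFF for v in src1]   # reinterpret each byte as 0..255
    u2 = [v & 0xFF for v in src2]
    res = list(acc)
    for i in range(2):
        for j in range(2):
            res[2 * i + j] += sum(u1[8 * i + k] * u2[8 * j + k]
                                  for k in range(8))
    return res


acc = [0, 0, 0, 0]
src1 = [0xFF] * 8 + [1] * 8
src2 = [2] * 8 + [0xFF] * 8
ummla_res = ummla(acc, src1, src2)
print(ummla_res)  # [4080, 520200, 16, 2040]
```

The byte `0xFF` counts as 255 here, so lane 1 accumulates 8 * 255 * 255 = 520200.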

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: 

| Operand | Description |
| :-----: | ----------- |
| `acc` | a vector with length 4 of 32-bit signless integer values |
| `src1` | a vector with length 16 of 8-bit signless integer values |
| `src2` | a vector with length 16 of 8-bit signless integer values |

Results: 

| Result | Description |
| :----: | ----------- |
| `res` | a vector with length 4 of 32-bit signless integer values |

arm_neon.intr.usmmla (arm_neon::UsmmlaOp) 

Unsigned and signed matrix-matrix multiply and accumulate op

Syntax:

operation ::= `arm_neon.intr.usmmla` $acc `,` $src1 `,` $src2 attr-dict `:` type($src1) `to` type($res)

USMMLA: Unsigned and signed integer matrix multiply-accumulate.

Unsigned and signed 8-bit integer matrix multiply-accumulate. This instruction multiplies the 2x8 matrix of unsigned 8-bit integer values in the first source vector by the 8x2 matrix of signed 8-bit integer values in the second source vector. The resulting 2x2 32-bit integer matrix product is destructively added to the 32-bit integer matrix accumulator in the destination vector. This is equivalent to performing an 8-way dot product per destination element.

Source: https://developer.arm.com/architectures/instruction-sets/intrinsics/#f:@navigationhierarchiessimdisa=[Neon]&q=usmmla
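The mixed signedness means the same bit pattern is interpreted differently depending on which operand carries it. The following Python sketch illustrates this, reading `src1` bytes as unsigned and `src2` bytes as two's-complement signed values. The helper `as_signed8`, the function name `usmmla`, and the lane layout (rows of eight in `src1`, columns of eight stored consecutively in `src2`, row-major 2x2 `acc`) are assumptions made for the example.

```python
def as_signed8(v):
    """Interpret the low 8 bits of v as a two's-complement signed value."""
    v &= 0xFF
    return v - 256 if v >= 128 else v


def usmmla(acc, src1, src2):
    """Reference model: unsigned 2x8 matrix times signed 8x2 matrix, plus acc."""
    assert len(acc) == 4 and len(src1) == 16 and len(src2) == 16
    u = [v & 0xFF for v in src1]          # first source: unsigned bytes
    s = [as_signed8(v) for v in src2]     # second source: signed bytes
    res = list(acc)
    for i in range(2):
        for j in range(2):
            res[2 * i + j] += sum(u[8 * i + k] * s[8 * j + k]
                                  for k in range(8))
    return res


acc = [0, 0, 0, 0]
src1 = [0xFF] * 16
src2 = [0xFF] * 8 + [1] * 8
usmmla_res = usmmla(acc, src1, src2)
print(usmmla_res)  # [-2040, 2040, -2040, 2040]
```

The byte `0xFF` is 255 in `src1` but -1 in `src2`, which is why the first column of the result is negative.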

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: 

| Operand | Description |
| :-----: | ----------- |
| `acc` | a vector with length 4 of 32-bit signless integer values |
| `src1` | a vector with length 16 of 8-bit signless integer values |
| `src2` | a vector with length 16 of 8-bit signless integer values |

Results: 

| Result | Description |
| :----: | ----------- |
| `res` | a vector with length 4 of 32-bit signless integer values |