MLIR

Multi-Level IR Compiler Framework

Tensor Operator Set Architecture (TOSA) Dialect

Rationale 

The MLIR TOSA dialect implements the TOSA specification. This document describes the decision process for how TOSA expresses operators in high-level dialects.

TOSA was developed after parallel efforts to rationalize the top-down picture from multiple high-level frameworks, as well as a bottom-up view of different hardware target concerns (CPU, GPU and NPU), and reflects a set of choices that attempt to manage both sets of requirements.

TOSA and Tensor Level Expressiveness 

TOSA endeavors to provide an operator set that fulfils the following expressiveness goals at the tensor level of abstraction:

Complete 

This is driven by the top-down perspective: the need to express as much of multiple high-level frameworks as possible fully in TOSA. The initial operator set came from an operator frequency analysis performed on dozens of high-level networks in different frameworks, which selected the most frequently occurring operators and established a common set of tensor-level operators that could express them.

TOSA categorizes its operator set into classes and attempts to address major functional operations at the tensor level, including compute, reduction, elementwise transformations, comparison and control flow.

Minimal 

This takes the bottom-up approach: keep the TOSA operator set minimal in order to bound the design of hardware, operator kernels, code generation strategies and associated considerations that affect the executability of TOSA content.

In this regard TOSA seeks to avoid creating compound operators, instead leaving it to the compiler backend to fuse multiple TOSA ops if required. This choice also benefits the numerical precision goal, since it is easier to fuse the numerical functionality of successive operators than to split the numerical functionality of a compound operator.

Numerical Precision 

TOSA began as a means to address operator-level numerical precision for code generation and hardware development. It therefore incorporates precision detail into the operator set.

In this regard, TOSA operators are best understood as a combination of the visible quantization information embedded within an operation, together with the functional information about how that information is used, as described in the specification of the operation.

TOSA Operator Rationale 

The general basis of selection of the operator set that constitutes TOSA is described in the TOSA specification document under Section 1.3 Operator Selection. Explanation of the thinking behind some operators is listed here:

IDENTITYN 

tosa.IDENTITYN is used to form a list of Operator results during lowering of operations such as tf.Split from a sequence of tosa.SLICE ops. If there are alternate ways to express this lowering without the tosa.IDENTITYN op, the tosa.IDENTITYN op could be removed from TOSA.

Value lower_split_op(Value %value, size_t axis, size_t num_split) {
    Value %output[]

    // Each output covers an equal share of the split axis.
    size_t slice_size = %value.shape[axis] / num_split

    for (int i = 0; i < num_split; i++) {
        vector<size_t> begin_vals, size_vals

        for (int j = 0; j < %value.rank; j++) {
            if (j == axis) {
                begin_vals.push_back(slice_size * i)
                size_vals.push_back(slice_size)
            } else {
                begin_vals.push_back(0)
                size_vals.push_back(%value.shape[j])
            }
        }

        // Emit one SLICE per output, once begin_vals and size_vals are complete.
        %output[i] = tosa.SLICE(%value) {start=begin_vals, size=size_vals} (tensor<%value.type>) -> tensor<size_vals, %value.dtype>
    }

    %output_list = tosa.IDENTITYN(%output) (tensor<%output:*.type>) -> tensor<%output_list:*.type>
    return %output_list
}
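The index arithmetic in the pseudocode above can be checked in isolation. The following Python sketch computes only the start and size values that each tosa.SLICE would receive. The helper name split_slice_params is hypothetical, and the sketch assumes that the split axis divides evenly by num_split.

```python
def split_slice_params(shape, axis, num_split):
    """Return one (start, size) pair per output of an even split.

    Hypothetical helper: mirrors the begin_vals/size_vals computation of the
    lowering pseudocode without constructing any TOSA ops.
    """
    if shape[axis] % num_split != 0:
        raise ValueError("axis length must be divisible by num_split")
    slice_size = shape[axis] // num_split

    params = []
    for i in range(num_split):
        # Only the split axis gets a non-zero start and a reduced size.
        start = [slice_size * i if j == axis else 0 for j in range(len(shape))]
        size = [slice_size if j == axis else shape[j] for j in range(len(shape))]
        params.append((start, size))
    return params


# Splitting a 4x6 tensor into three pieces along axis 1.
for start, size in split_slice_params([4, 6], axis=1, num_split=3):
    print("start =", start, "size =", size)
```

Each pair corresponds to one SLICE result that the final IDENTITYN gathers into the output list.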

COND_IF and WHILE_LOOP 

Several neural networks express conditional control flow at the tensor level. A survey of multiple high level frameworks indicated that conditional if and a loop construct are common in all major frameworks, with some variation. Since TOSA endeavors to be complete in expressing tensor level functionality including control flow, it implements these constructs.

The COND_IF and WHILE_LOOP operators implement such structured control flow forms and should be lowerable to corresponding ops in the scf dialect. Since the dialect seeks to remain isomorphic with an external, serialized form, the decision was to keep these ops in the dialect (as opposed to deferring completely to scf), and this may be re-evaluated if this turns out to not yield the expected value.

Using TOSA In A Compiler 

The TOSA specification describes each operator in functional detail. It is expected that compilers that use TOSA will use its builders to construct the operators so that the quantization information for the operator is correctly generated.

The functional steps described in the pseudocode of the specification enable the construction of code generation for that operation, or decisions on the design of underlying hardware. The functional pseudocode also describes how the quantization parameters are utilized within the operation.

Quantization Parameters in Ops vs Tensors 

TOSA uses the quantization parameters embedded in the input and output tensors to construct the quantization attributes that sit within the operator. Once these attributes are constructed, the quantization information within the tensors is no longer necessary for code generation.

This enables the tensors to be subsequently interpreted simply as contiguous buffers containing raw data, with no ‘meta information’ in the form of the quantization_type. Precision-related manipulation of the input or output is instead described by the operator itself, which specifies, for example, when the zero point is applied or when the scale multiplication is done.

However, TOSA does not eliminate the existing MLIR QuantOps quantization type information within the tensors; this leaves the choice of how to handle quantization information to later backend code generation steps.

Maintaining the ability to overlap these different representations of quantization parameters (i.e. tensor-carried vs op-carried) is an important capability when considering progressive lowering between uses that expect one scheme vs the other.
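As one concrete reading of the op-carried scheme, the sketch below treats a tensor as a raw int8 buffer and lets the operator carry the zero points. It is a minimal illustration and not the dialect's implementation: the function name negate_int8, the parameter names input_zp and output_zp, and the saturating behaviour are all assumptions chosen for the example.

```python
def negate_int8(raw_values, input_zp, output_zp):
    """Negate raw int8 data using zero points carried by the operator.

    Assumption for illustration: the operator subtracts input_zp before the
    negation, adds output_zp afterwards, and saturates to the int8 range.
    The buffer itself carries no quantization metadata.
    """
    result = []
    for value in raw_values:
        negated = -(value - input_zp) + output_zp
        result.append(max(-128, min(127, negated)))
    return result


raw = [-128, -3, 0, 5, 127]
print(negate_int8(raw, input_zp=-3, output_zp=2))
```

The point of the design is visible in the signature: everything needed to interpret the buffer arrives with the operation, not with the tensor type.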

Operation definitions 

tosa.abs (mlir::tosa::AbsOp) 

Elementwise abs op

Elementwise absolute value operation

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.add (mlir::tosa::AddOp) 

Elementwise addition operator

Elementwise addition of input1 and input2. Axis of size 1 will be broadcast, as necessary. Rank of input tensors must match.

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values
input2 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values
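The broadcast rule shared by the binary elementwise operators (ranks must match, and an axis of size 1 is stretched to the other operand's size) can be sketched in plain Python. The helper names broadcast_shape and add_2d are hypothetical, and the second helper handles only rank-2 nested lists to keep the example short.

```python
def broadcast_shape(shape1, shape2):
    """Return the output shape of a broadcasting elementwise op."""
    if len(shape1) != len(shape2):
        raise ValueError("rank of input tensors must match")
    result = []
    for dim1, dim2 in zip(shape1, shape2):
        if dim1 == dim2 or dim2 == 1:
            result.append(dim1)
        elif dim1 == 1:
            result.append(dim2)
        else:
            raise ValueError(f"cannot broadcast {dim1} against {dim2}")
    return result


def add_2d(a, b):
    """Elementwise add of two rank-2 nested lists with broadcasting."""
    rows, cols = broadcast_shape([len(a), len(a[0])], [len(b), len(b[0])])
    # Taking the index modulo the axis length maps every index of a
    # size-1 axis to element 0, which is exactly the broadcast behaviour.
    return [
        [a[r % len(a)][c % len(a[0])] + b[r % len(b)][c % len(b[0])]
         for c in range(cols)]
        for r in range(rows)
    ]


print(add_2d([[1, 2, 3], [4, 5, 6]], [[10], [20]]))
```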

tosa.argmax (mlir::tosa::ArgMaxOp) 

Perform argmax on the input.

This returns the index with the largest value across the given axis of the input tensor.

Attributes: 

Attribute | MLIR Type | Description
axis | ::mlir::IntegerAttr | 64-bit signless integer attribute

Operands:

Operand | Description
input | 1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values
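A small Python sketch of argmax over a rank-2 nested list illustrates how the axis attribute selects the direction of the search. The function name argmax_2d is hypothetical, and resolving ties to the lowest index is an assumption of this sketch, not something this document states.

```python
def argmax_2d(values, axis):
    """Return the index of the largest element along the given axis.

    Ties resolve to the lowest index (an assumption of this sketch).
    """
    rows, cols = len(values), len(values[0])
    if axis == 0:
        # One result per column: search down the rows.
        return [max(range(rows), key=lambda r: values[r][c]) for c in range(cols)]
    if axis == 1:
        # One result per row: search across the columns.
        return [max(range(cols), key=lambda c: values[r][c]) for r in range(rows)]
    raise ValueError("axis must be 0 or 1 for a rank-2 input")


data = [[1, 9, 3],
        [7, 2, 8]]
print(argmax_2d(data, axis=0))
print(argmax_2d(data, axis=1))
```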

tosa.arithmetic_right_shift (mlir::tosa::ArithmeticRightShiftOp) 

Elementwise Arithmetic Right Shift

Elementwise arithmetic right shift of input1 by the amount specified in input2. Axis of size 1 will be broadcast, as necessary. Rank of input tensors must match.

Attributes: 

Attribute | MLIR Type | Description
round | ::mlir::BoolAttr | bool attribute

Operands:

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values
input2 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values
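The effect of the round attribute can be illustrated per element. The sketch below assumes that rounding adds 1 to the shifted result when the last bit shifted out is set; this document does not spell that rule out, so treat it as an assumption. The function name is hypothetical.

```python
def arithmetic_right_shift(value, shift, round_result):
    """Shift one integer right, preserving sign, with optional rounding."""
    # Python's >> on int is already an arithmetic (sign-preserving) shift.
    result = value >> shift
    # Assumed rounding rule: inspect the most significant discarded bit.
    if round_result and shift > 0 and (value >> (shift - 1)) & 1:
        result += 1
    return result


for v in (5, 6, -5, -8):
    print(v, arithmetic_right_shift(v, 1, False), arithmetic_right_shift(v, 1, True))
```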

tosa.avg_pool2d (mlir::tosa::AvgPool2dOp) 

Performs average pooling on the input.

This performs an average pooling over the given input tensor. A sliding window of the size given by the kernel attribute is passed over the input tensor, with the mean value being placed in the output tensor.

Attributes: 

Attribute | MLIR Type | Description
kernel | ::mlir::ArrayAttr | 64-bit integer array attribute with exactly 2 elements
stride | ::mlir::ArrayAttr | 64-bit integer array attribute with exactly 2 elements
pad | ::mlir::ArrayAttr | 64-bit integer array attribute with exactly 4 elements
quantization_info | mlir::tosa::UnaryOpQuantizationAttr | Attribute for UnaryOp quantization information.

Operands:

Operand | Description
input | 4D tensor of number values

Results:

Result | Description
output | 4D tensor of number values

tosa.bitwise_and (mlir::tosa::BitwiseAndOp) 

Bitwise AND operator

Elementwise bitwise AND of input1 and input2. Axis of size 1 will be broadcast as necessary. Rank of input tensors must match.

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values
input2 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.bitwise_not (mlir::tosa::BitwiseNotOp) 

Bitwise NOT operator

Elementwise bitwise NOT of input tensor.

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.bitwise_or (mlir::tosa::BitwiseOrOp) 

Bitwise OR operator

Elementwise bitwise OR of input1 and input2. Axis of size 1 will be broadcast as necessary. Rank of input tensors must match.

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values
input2 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.bitwise_xor (mlir::tosa::BitwiseXorOp) 

Bitwise XOR operator

Elementwise bitwise XOR of input1 and input2. Axis of size 1 will be broadcast as necessary. Rank of input tensors must match.

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values
input2 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.cast (mlir::tosa::CastOp) 

Cast operation

Performs a set of permissible cast operations:

Mode | Input | Output
signed 8 to bool | int8 | Boolean
signed 16 to bool | int16 | Boolean
signed 32 to bool | int32 | Boolean
bool to 8 | Boolean | int8
bool to 16 | Boolean | int16
bool to 32 | Boolean | int32
signed 8 to signed 16 | int8 | int16
signed 8 to signed 32 | int8 | int32
signed 16 to signed 8 | int16 | int8
signed 16 to signed 32 | int16 | int32
signed 32 to signed 8 | int32 | int8
signed 32 to signed 16 | int32 | int16
float to signed 8 | float | int8
float to signed 16 | float | int16
signed 8 to float | int8 | float
signed 16 to float | int16 | float

Operands: 

Operand | Description
input | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.ceil (mlir::tosa::CeilOp) 

Elementwise ceil op

Elementwise ceiling operation

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.clamp (mlir::tosa::ClampOp) 

Computes clamp(features, min, max).

Clamp to an arbitrary minimum and maximum value. Note that the maximum and minimum values are specified as signed quantized values; no scaling happens before or after this operation.

Attributes: 

Attribute | MLIR Type | Description
min_int | ::mlir::IntegerAttr | 64-bit signless integer attribute
max_int | ::mlir::IntegerAttr | 64-bit signless integer attribute
min_fp | ::mlir::FloatAttr | 32-bit float attribute
max_fp | ::mlir::FloatAttr | 32-bit float attribute

Operands:

Operand | Description
input | 1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 1D/2D/3D/4D tensor of number values

tosa.clz (mlir::tosa::ClzOp) 

Elementwise count leading zero op

Elementwise count leading zeros operation

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.concat (mlir::tosa::ConcatOp) 

Concatenates tensors along one dimension.

Concatenate two tensors along a given axis. No data conversion happens during a concat operation.

Attributes: 

Attribute | MLIR Type | Description
axis | ::mlir::IntegerAttr | 64-bit signless integer attribute

Operands:

Operand | Description
input1 | 1D/2D/3D/4D tensor of number values
input2 | 1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 1D/2D/3D/4D tensor of number values

tosa.const (mlir::tosa::ConstOp) 

Constant op.

A node containing constant data for use as the input to an operation. May hold data in any of the supported data formats.

Attributes: 

Attribute | MLIR Type | Description
value | ::mlir::ElementsAttr | constant vector/tensor attribute

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.conv2d (mlir::tosa::Conv2DOp) 

2D Convolution Operator

Performs a 2D convolution over the given tensor input, using the weight tensor.

Attributes: 

Attribute | MLIR Type | Description
pad | ::mlir::ArrayAttr | 64-bit integer array attribute with exactly 4 elements
stride | ::mlir::ArrayAttr | 64-bit integer array attribute with exactly 2 elements
dilation | ::mlir::ArrayAttr | 64-bit integer array attribute with exactly 2 elements
quantization_info | mlir::tosa::ConvOpQuantizationAttr | Attribute for Conv type op quantization information.

Operands:

Operand | Description
input | 4D tensor of number values
weight | 4D tensor of number values
bias | 1D tensor of number values

Results:

Result | Description
output | 4D tensor of number values

tosa.conv3d (mlir::tosa::Conv3DOp) 

3D Convolution operator

Performs a 3D convolution over the given input tensor.

Attributes: 

Attribute | MLIR Type | Description
pad | ::mlir::ArrayAttr | 64-bit integer array attribute with exactly 6 elements
stride | ::mlir::ArrayAttr | 64-bit integer array attribute with exactly 3 elements
dilation | ::mlir::ArrayAttr | 64-bit integer array attribute with exactly 3 elements
quantization_info | mlir::tosa::ConvOpQuantizationAttr | Attribute for Conv type op quantization information.

Operands:

Operand | Description
input | 5D tensor of number values
weight | 5D tensor of number values
bias | 1D tensor of number values

Results:

Result | Description
output | 5D tensor of number values

tosa.custom (mlir::tosa::CustomOp) 

Custom operator wrapper for TOSA

Hardware implementing TOSA may choose to add additional custom operators that are not expressed in the existing TOSA operations. These operators are not expected to be portable across TOSA implementations. The input and output signatures must be expressed in the corresponding TOSA node.

Attributes: 

Attribute | MLIR Type | Description
identifier | ::mlir::StringAttr | string attribute

Operands:

Operand | Description
inputs | tensor of number values

Results:

Result | Description
outputs | tensor of number values

tosa.depthwise_conv2d (mlir::tosa::DepthwiseConv2DOp) 

Depthwise 2D Convolution operator

Performs 2D convolutions separately over each channel of the given tensor input, using the weight tensor.

Attributes: 

Attribute | MLIR Type | Description
pad | ::mlir::ArrayAttr | 64-bit integer array attribute with exactly 4 elements
stride | ::mlir::ArrayAttr | 64-bit integer array attribute with exactly 2 elements
dilation | ::mlir::ArrayAttr | 64-bit integer array attribute with exactly 2 elements
quantization_info | mlir::tosa::ConvOpQuantizationAttr | Attribute for Conv type op quantization information.

Operands:

Operand | Description
input | 4D tensor of number values
weight | 4D tensor of number values
bias | 1D tensor of number values

Results:

Result | Description
output | 4D tensor of number values

tosa.equal (mlir::tosa::EqualOp) 

Returns the truth value of (x == y) element-wise.

Elementwise comparison operation

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values
input2 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | tensor of 1-bit signless integer values

tosa.exp (mlir::tosa::ExpOp) 

Elementwise exp op

Elementwise e to the x operation

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.floor (mlir::tosa::FloorOp) 

Elementwise floor op

Elementwise floor operation

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.fully_connected (mlir::tosa::FullyConnectedOp) 

Fully Connected operator

Performs a fully connected network.

Attributes: 

Attribute | MLIR Type | Description
quantization_info | mlir::tosa::ConvOpQuantizationAttr | Attribute for Conv type op quantization information.

Operands:

Operand | Description
input | 2D tensor of number values
weight | 2D tensor of number values
bias | 1D tensor of number values

Results:

Result | Description
output | 2D tensor of number values

tosa.gather (mlir::tosa::GatherOp) 

Gather operation

Generate a tensor for which each element in the output is a subtensor of the values tensor along the given axis, based on the value of indices.

Attributes: 

Attribute | MLIR Type | Description
axis | ::mlir::IntegerAttr | 32-bit signless integer attribute

Operands:

Operand | Description
indices | tensor of 32-bit signless integer or 64-bit signless integer values
values | 1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 1D/2D/3D/4D tensor of number values

tosa.greater_equal (mlir::tosa::GreaterEqualOp) 

Returns the truth value of (x >= y) element-wise.

Elementwise comparison operation

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values
input2 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | tensor of 1-bit signless integer values

tosa.greater (mlir::tosa::GreaterOp) 

Returns the truth value of (x > y) element-wise.

Elementwise greater than comparison operation

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values
input2 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | tensor of 1-bit signless integer values

tosa.identityn (mlir::tosa::IdentityNOp) 

IdentityN operator

Returns a list of tensors with the same shape, type, and contents as the input list of tensors.

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D/5D/6D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D/5D/6D tensor of number values

tosa.identity (mlir::tosa::IdentityOp) 

Identity operator

Returns a tensor with the same shape, size, type and content as the input.

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D/5D/6D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D/5D/6D tensor of number values

tosa.cond_if (mlir::tosa::IfOp) 

Conditional if operator

Evaluates a Boolean condition and then takes one of two distinct execution paths. This implements the semantic If-then-else structure.

Operands: 

Operand | Description
cond | tensor of 1-bit signless integer values
inputs | tensor of number values

Results:

Result | Description
output | tensor of number values

tosa.log (mlir::tosa::LogOp) 

Elementwise log op

Elementwise natural logarithm operation

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.logical_and (mlir::tosa::LogicalAndOp) 

Returns the truth value of x AND y element-wise.

Elementwise logical AND of input1 and input2. Axis of size 1 will be broadcast, as necessary. Rank of input tensors must match.

Operands: 

Operand | Description
input1 | tensor of 1-bit signless integer values
input2 | tensor of 1-bit signless integer values

Results:

Result | Description
z | tensor of 1-bit signless integer values

tosa.logical_left_shift (mlir::tosa::LogicalLeftShiftOp) 

Elementwise Logical Left Shift

Elementwise logical left shift of input1 by the amount specified in input2. Axis of size 1 will be broadcast, as necessary. Rank of input tensors must match.

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values
input2 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.logical_not (mlir::tosa::LogicalNotOp) 

Returns the truth value of NOT x element-wise.

Elementwise logical NOT of input.

Operands: 

Operand | Description
input1 | tensor of 1-bit signless integer values

Results:

Result | Description
output | tensor of 1-bit signless integer values

tosa.logical_or (mlir::tosa::LogicalOrOp) 

Returns the truth value of x OR y element-wise.

Elementwise logical OR of input1 and input2. Axis of size 1 will be broadcast as necessary. Rank of input tensors must match.

Operands: 

Operand | Description
input1 | tensor of 1-bit signless integer values
input2 | tensor of 1-bit signless integer values

Results:

Result | Description
z | tensor of 1-bit signless integer values

tosa.logical_right_shift (mlir::tosa::LogicalRightShiftOp) 

Elementwise Logical Right Shift

Elementwise logical right shift of input1 by the amount specified in input2. Axis of size 1 will be broadcast, as necessary. Rank of input tensors must match.

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values
input2 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.logical_xor (mlir::tosa::LogicalXorOp) 

Returns the truth value of x XOR y element-wise.

Elementwise logical XOR of input1 and input2. Axis of size 1 will be broadcast as necessary. Rank of input tensors must match.

Operands: 

Operand | Description
input1 | tensor of 1-bit signless integer values
input2 | tensor of 1-bit signless integer values

Results:

Result | Description
z | tensor of 1-bit signless integer values

tosa.matmul (mlir::tosa::MatMulOp) 

Matrix multiplication operator

Performs a two dimensional matrix multiplication. This allows both inputs to be activations, rather than reserving weights as an attribute in the FULLY_CONNECTED operator.

Attributes: 

Attribute | MLIR Type | Description
quantization_info | mlir::tosa::MatMulOpQuantizationAttr | Attribute for MatMulOp quantization information.

Operands:

Operand | Description
a | 2D tensor of number values
b | 2D tensor of number values

Results:

Result | Description
c | 2D tensor of number values

tosa.max_pool2d (mlir::tosa::MaxPool2dOp) 

Performs max pooling on the input.

This performs a max pooling over the given input tensor. A sliding window of the size given by the kernel attribute is passed over the input tensor, with the maximum value being placed in the output tensor.

Attributes: 

Attribute | MLIR Type | Description
kernel | ::mlir::ArrayAttr | 64-bit integer array attribute with exactly 2 elements
stride | ::mlir::ArrayAttr | 64-bit integer array attribute with exactly 2 elements
pad | ::mlir::ArrayAttr | 64-bit integer array attribute with exactly 4 elements

Operands:

Operand | Description
input | 4D tensor of number values

Results:

Result | Description
output | 4D tensor of number values
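The sliding-window behaviour can be sketched on a single 2D plane. The following Python example is a simplification under stated assumptions: it ignores the batch and channel dimensions of the 4D operand, applies no padding, and uses the hypothetical name max_pool_2d. The kernel and stride arguments stand in for the attributes of the same names.

```python
def max_pool_2d(image, kernel, stride):
    """Max-pool one 2D plane (nested lists) without padding."""
    kernel_h, kernel_w = kernel
    stride_h, stride_w = stride
    out_h = (len(image) - kernel_h) // stride_h + 1
    out_w = (len(image[0]) - kernel_w) // stride_w + 1
    return [
        [
            # Maximum over the window whose top-left corner is at
            # (oy * stride_h, ox * stride_w).
            max(image[oy * stride_h + ky][ox * stride_w + kx]
                for ky in range(kernel_h) for kx in range(kernel_w))
            for ox in range(out_w)
        ]
        for oy in range(out_h)
    ]


plane = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
print(max_pool_2d(plane, kernel=(2, 2), stride=(2, 2)))
```

Replacing max with a mean over the same window gives the corresponding average-pooling sketch.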

tosa.maximum (mlir::tosa::MaximumOp) 

Elementwise Maximum

Elementwise max of input1 and input2. Axis of size 1 will be broadcast, as necessary. Rank of input tensors must match.

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values
input2 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.minimum (mlir::tosa::MinimumOp) 

Elementwise Minimum

Elementwise minimum of input1 and input2. Axis of size 1 will be broadcast, as necessary. Rank of input tensors must match.

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values
input2 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.mul (mlir::tosa::MulOp) 

Multiplication operator

Elementwise multiplication (Hadamard product) of input1 and input2. Axis of size 1 will be broadcast, as necessary. Rank of input tensors must match.

Attributes: 

Attribute | MLIR Type | Description
shift | ::mlir::IntegerAttr | 32-bit signless integer attribute

Operands:

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values
input2 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.negate (mlir::tosa::NegateOp) 

Elementwise negate op

Elementwise negation operation

Attributes: 

Attribute | MLIR Type | Description
quantization_info | mlir::tosa::UnaryOpQuantizationAttr | Attribute for UnaryOp quantization information.

Operands:

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.pad (mlir::tosa::PadOp) 

Pads a tensor with zeros.

Zero-pads a tensor along borders of each dimension.

Attributes: 

Attribute | MLIR Type | Description
quantization_info | mlir::tosa::PadOpQuantizationAttr | Attribute for PadOp quantization information.

Operands:

Operand | Description
input1 | 1D/2D/3D/4D tensor of number values
padding | tensor of 32-bit signless integer or 64-bit signless integer values

Results:

Result | Description
output | 1D/2D/3D/4D tensor of number values

tosa.placeholder (mlir::tosa::PlaceholderOp) 

Placeholder op

A node where data will be inserted into the network at runtime. Generally used for inputs to the network.

Results: 

Result | Description
output | 1D/2D/3D/4D tensor of number values

tosa.pow (mlir::tosa::PowOp) 

Computes the power of one value to another.

Elementwise input1 raised to the power of input2. Axis of size 1 will be broadcast, as necessary. Rank of input tensors must match.

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values
input2 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
z | 0D/1D/2D/3D/4D tensor of number values

tosa.reciprocal (mlir::tosa::ReciprocalOp) 

Elementwise reciprocal op

Elementwise reciprocal operation. For integer operation, a TABLE should be used with the appropriate ranges.

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.reduce_all (mlir::tosa::ReduceAllOp) 

Reduce All operator

Reduce a tensor along the given axis with a logical AND operation

Attributes: 

Attribute | MLIR Type | Description
axis | ::mlir::IntegerAttr | 64-bit signless integer attribute

Operands:

Operand | Description
input | 1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 1D/2D/3D/4D tensor of number values

tosa.reduce_any (mlir::tosa::ReduceAnyOp) 

Reduce Any operator

Reduce a tensor along the given axis with a logical OR operation

Attributes: 

Attribute | MLIR Type | Description
axis | ::mlir::IntegerAttr | 64-bit signless integer attribute

Operands:

Operand | Description
input | 1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 1D/2D/3D/4D tensor of number values

tosa.reduce_max (mlir::tosa::ReduceMaxOp) 

Reduce Max operator

Reduce a tensor along the given axis with a maximum operation

Attributes: 

Attribute | MLIR Type | Description
axis | ::mlir::IntegerAttr | 64-bit signless integer attribute

Operands:

Operand | Description
input | 1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 1D/2D/3D/4D tensor of number values

tosa.reduce_min (mlir::tosa::ReduceMinOp) 

Reduce Min operator

Reduce a tensor along the given axis with a minimum operation

Attributes: 

Attribute | MLIR Type | Description
axis | ::mlir::IntegerAttr | 64-bit signless integer attribute

Operands:

Operand | Description
input | 1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 1D/2D/3D/4D tensor of number values

tosa.reduce_prod (mlir::tosa::ReduceProdOp) 

Reduce Prod operator

Reduce a tensor along the given axis by computing the product of the axis.

Attributes: 

Attribute | MLIR Type | Description
axis | ::mlir::IntegerAttr | 64-bit signless integer attribute

Operands:

Operand | Description
input | 1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 1D/2D/3D/4D tensor of number values

tosa.reduce_sum (mlir::tosa::ReduceSumOp) 

Reduce Sum operator

Reduce a tensor along the given axis by computing the sum of the axis.

Attributes: 

Attribute | MLIR Type | Description
axis | ::mlir::IntegerAttr | 64-bit signless integer attribute

Operands:

Operand | Description
input | 1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 1D/2D/3D/4D tensor of number values

tosa.reluN (mlir::tosa::ReluNOp) 

Computes rectified linear with an upper bound: min(max(features, 0), N).

ReLU with a scalar maximum value.

Attributes: 

Attribute | MLIR Type | Description
max_int | ::mlir::IntegerAttr | 64-bit signless integer attribute
max_fp | ::mlir::FloatAttr | 32-bit float attribute

Operands:

Operand | Description
input | 1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 1D/2D/3D/4D tensor of number values
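Reading "ReLU with a scalar maximum value" as a clamp to the range [0, N], the per-element behaviour can be sketched as follows. The function name relu_n is hypothetical. Passing the bound from max_int for integer data and from max_fp for floating-point data is an assumption about how the two attributes are meant to be used.

```python
def relu_n(values, max_value):
    """Clamp each element to the range [0, max_value]."""
    return [min(max(v, 0), max_value) for v in values]


# Integer data would take its bound from max_int, float data from max_fp.
print(relu_n([-4, 0, 3, 6, 9], max_value=6))
print(relu_n([-0.5, 2.25, 7.0], max_value=6.0))
```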

tosa.rescale (mlir::tosa::RescaleOp) 

TOSA rescale operator

Rescale quantized values into a new domain. Supported rescalings are:

Mode | Input | Output
signed 8 to 8 | aint8 | aint8
signed 8 to 16 | aint8 | int16
signed 8 to 32 | aint8 | int32
signed 16 to 8 | int16 | aint8
signed 16 to 16 | int16 | int16
signed 16 to 32 | int16 | int32
signed 32 to 8 | int32 | aint8
signed 32 to 16 | int32 | int16
signed 32 to 32 | int32 | int32
signed 48 to 8 | int48 | aint8
signed 48 to 16 | int48 | int16
signed 48 to 32 | int48 | int32
unsigned 8 to signed 8 | uint8 | aint8
signed 8 to unsigned 8 | aint8 | uint8

Attributes: 

Attribute | MLIR Type | Description
input_zp | ::mlir::IntegerAttr | 32-bit signless integer attribute
output_zp | ::mlir::IntegerAttr | 32-bit signless integer attribute
multiplier | ::mlir::ArrayAttr | 32-bit integer array attribute
shift | ::mlir::ArrayAttr | 32-bit integer array attribute
scale32 | ::mlir::BoolAttr | bool attribute
double_round | ::mlir::BoolAttr | bool attribute
per_channel | ::mlir::BoolAttr | bool attribute

Operands:

Operand | Description
input | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values
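To show how the input_zp, output_zp, multiplier and shift attributes might combine, here is a per-element Python sketch. It is not the specification's algorithm: it assumes the effective scale is multiplier / 2**shift with round-half-up before the shift, it ignores scale32, double_round and per_channel, and it saturates to a range supplied by the caller. The function and parameter names out_min and out_max are hypothetical.

```python
def rescale_value(value, input_zp, output_zp, multiplier, shift,
                  out_min=-128, out_max=127):
    """Rescale one integer value with a fixed-point multiplier.

    Assumed semantics (not taken from this document): remove the input
    zero point, multiply, round by adding half of the discarded range,
    shift right, add the output zero point, then saturate.
    """
    if shift < 1:
        raise ValueError("shift must be at least 1 in this sketch")
    centered = value - input_zp
    rounding = 1 << (shift - 1)
    scaled = (centered * multiplier + rounding) >> shift
    return max(out_min, min(out_max, scaled + output_zp))


# A scale of 0.5 expressed as multiplier = 2**30 with shift = 31.
half = dict(multiplier=1 << 30, shift=31)
print(rescale_value(100, input_zp=0, output_zp=0, **half))
print(rescale_value(10, input_zp=-10, output_zp=5, **half))
```

Python integers do not overflow, so the intermediate product needs no explicit widening here; a fixed-width implementation would need a wider accumulator.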

tosa.reshape (mlir::tosa::ReshapeOp) 

Reshape operator

Returns a tensor with the same type/values as the input, with a new shape specified by the shape argument. Reshape may operate on tensors of any rank. No data conversion happens during a reshape operation.

Attributes: 

Attribute | MLIR Type | Description
new_shape | ::mlir::ArrayAttr | 64-bit integer array attribute

Operands:

Operand | Description
input1 | 0D/1D/2D/3D/4D/5D/6D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D/5D/6D tensor of number values

tosa.resize (mlir::tosa::ResizeOp) 

Resize operation, supports various resize/upsample modes

Resizes a tensor. Resize is only allowed in the H and W dimensions. In expected use, stride_y is approximately (IH << shift) / OH and stride_x is approximately (IW << shift) / OW. OH and OW are also supplied as inputs since there may be off-by-one errors if calculating OH and OW from the strides.

Attributes: 

Attribute | MLIR Type | Description
output_size | ::mlir::ArrayAttr | 64-bit integer array attribute with exactly 2 elements
stride | ::mlir::ArrayAttr | 64-bit integer array attribute with exactly 2 elements
offset | ::mlir::ArrayAttr | 64-bit integer array attribute with exactly 2 elements
shift | ::mlir::IntegerAttr | 32-bit signless integer attribute
mode | ::mlir::StringAttr | Supported resize/upsampling strategies

Operands:

Operand | Description
input | 4D tensor of number values

Results:

Result | Description
output | 4D tensor of number values
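The fixed-point stride relation given in the description can be made concrete with a short Python sketch. The stride computation follows the stated approximation directly. The second helper, which maps an output index to an integer input coordinate and a fractional part, is an assumption about how stride, offset and shift are combined. Both function names are hypothetical.

```python
def resize_strides(in_h, in_w, out_h, out_w, shift):
    """Fixed-point strides: approximately (IH << shift) / OH and (IW << shift) / OW."""
    return (in_h << shift) // out_h, (in_w << shift) // out_w


def source_coordinate(out_index, stride, offset, shift):
    """Map an output index to (integer input index, fractional part).

    Assumed mapping: position = out_index * stride + offset, held in
    fixed point with `shift` fractional bits.
    """
    position = out_index * stride + offset
    return position >> shift, position & ((1 << shift) - 1)


# Upscaling 4x4 to 8x8 with 10 fractional bits gives a stride of one half.
stride_y, stride_x = resize_strides(4, 4, 8, 8, shift=10)
print(stride_y, stride_x)
print(source_coordinate(3, stride_y, offset=0, shift=10))
```

Because the division truncates, recomputing OH from the stride can be off by one, which is why the output size is supplied explicitly.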

tosa.reverse (mlir::tosa::ReverseOp) 

Reverse operator

Returns a tensor with the same type/values as the input, with the data reversed along the given axis. No data conversion happens during a reverse operation.

Attributes: 

Attribute | MLIR Type | Description
axis | ::mlir::IntegerAttr | 64-bit signless integer attribute

Operands:

Operand | Description
input | 1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 1D/2D/3D/4D tensor of number values

tosa.rsqrt (mlir::tosa::RsqrtOp) 

Elementwise 1/sqrt op

Elementwise reciprocal square root operation. For integer operation, a TABLE should be used with the appropriate ranges.

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values

Results:

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.select (mlir::tosa::SelectOp) 

Elementwise select operator

Elementwise select of the output based on a condition.

Operands: 

Operand | Description
input1 | tensor of 1-bit signless integer values
input2 | 0D/1D/2D/3D/4D tensor of number values
input3 | 0D/1D/2D/3D/4D tensor of number values

Results: 

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.sigmoid (mlir::tosa::SigmoidOp) 

Computes elementwise sigmoid of input.

Sigmoid function: output = 1 / (1 + exp(-input)). For quantized integer data types, the TABLE operator should be used instead with the following definition. The sigmoid table has 513 entries, each of 16-bit precision, covering the input range -16.0 to +16.0 in steps of 1/16.
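
As a rough sketch of how such a table could be populated, the following generates 513 samples of the sigmoid function over -16.0 to +16.0 in steps of 1/16. The function name `sigmoid_table` and the output scale of 32767 are assumptions for illustration only; the text above fixes the entry count, range, and step, but not the output scaling or rounding mode.

```python
import math


def sigmoid_table(entries: int = 513, step: float = 1.0 / 16.0, scale: int = 32767):
    """Sample sigmoid at evenly spaced points centred on zero.

    `scale` is an assumed mapping of the [0, 1] result onto 16-bit integers.
    """
    half = (entries - 1) // 2  # 256 steps on each side of zero
    table = []
    for i in range(entries):
        x = (i - half) * step  # runs from -16.0 to +16.0
        y = 1.0 / (1.0 + math.exp(-x))
        table.append(int(round(y * scale)))
    return table


table = sigmoid_table()
print(len(table), table[0], table[-1])
```

The entry count follows from the range and step: 32 units at 16 steps per unit gives 512 intervals, which need 513 sample points.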

Operands: 

Operand | Description
input | 1D/2D/3D/4D tensor of number values

Results: 

Result | Description
output | 1D/2D/3D/4D tensor of number values

tosa.slice (mlir::tosa::SliceOp) 

Slice operator

Extracts a slice of the input tensor, beginning at the start coordinates and extending for size elements along each dimension. No data conversion happens during a slice operation.
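
The start/size semantics can be sketched on nested Python lists. The helper `slice_tensor` is hypothetical and only models the indexing behaviour described above, not the dialect's implementation.

```python
def slice_tensor(tensor, start, size):
    """Take size[d] elements starting at start[d] along each dimension d."""
    if not start:  # every dimension has been consumed; tensor is an element
        return tensor
    begin, count = start[0], size[0]
    return [
        slice_tensor(sub, start[1:], size[1:])
        for sub in tensor[begin:begin + count]
    ]


data = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12]]
result = slice_tensor(data, start=[1, 1], size=[2, 2])
print(result)  # [[6, 7], [10, 11]]
```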

Attributes: 

Attribute | MLIR Type | Description
start | ::mlir::ArrayAttr | 64-bit integer array attribute
size | ::mlir::ArrayAttr | 64-bit integer array attribute

Operands: 

Operand | Description
input | 1D/2D/3D/4D/5D/6D tensor of number values

Results: 

Result | Description
output | 1D/2D/3D/4D/5D/6D tensor of number values

tosa.sub (mlir::tosa::SubOp) 

Elementwise subtraction operator

Elementwise subtraction of input2 from input1. Axes of size 1 are broadcast as necessary. The ranks of the input tensors must match.
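
A minimal sketch of the broadcasting rule on nested Python lists follows. The helper `broadcast_sub` is hypothetical; it assumes both inputs have the same rank and that each axis either matches in size or has size 1.

```python
def broadcast_sub(a, b):
    """Elementwise a - b for equal-rank nested lists; size-1 axes are broadcast."""
    if isinstance(a, list) != isinstance(b, list):
        raise ValueError("ranks of the inputs must match")
    if not isinstance(a, list):  # reached the scalar elements
        return a - b
    n = max(len(a), len(b))
    if len(a) not in (1, n) or len(b) not in (1, n):
        raise ValueError("axis sizes must match or be 1")
    return [
        broadcast_sub(a[i if len(a) > 1 else 0], b[i if len(b) > 1 else 0])
        for i in range(n)
    ]


x = [[10, 20, 30], [40, 50, 60]]  # shape 2x3
y = [[1, 2, 3]]                   # shape 1x3, broadcast along axis 0
diff = broadcast_sub(x, y)
print(diff)  # [[9, 18, 27], [39, 48, 57]]
```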

Operands: 

Operand | Description
input1 | 0D/1D/2D/3D/4D tensor of number values
input2 | 0D/1D/2D/3D/4D tensor of number values

Results: 

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.table (mlir::tosa::TableOp) 

Table lookup op

Interpolated table lookup operation. Input values are scaled to create a fixed-point 9.7 value. The high 9 bits are used to index into the table. The fractional bits are used to interpolate based on the looked up value and the index+1 value in the table. The TABLE operator then returns a 16.7 interpolated value. Note that there must be 513 values to handle the full range of inputs.

The TABLE operator is expected to be used as follows:

  • A RESCALE node is expected before the TABLE operator to scale the input to a full int16_t range for the table lookup
  • If an int16_t result is required then follow the TABLE operator with a RESCALE with a right shift of 7
  • If an int8_t result is required then follow the TABLE operator with a RESCALE with a right shift of 15
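
The lookup described above can be sketched as follows. This is an illustrative model, not the dialect's implementation: the function name `table_lookup` is hypothetical, and it assumes a signed 16-bit input that is offset by 32768 so that the high 9 bits form a non-negative index.

```python
def table_lookup(table, value):
    """Interpolated lookup of a 9.7 fixed-point value in a 513-entry table."""
    if len(table) != 513:
        raise ValueError("table must have 513 entries")
    clipped = max(-32768, min(32767, value))  # clamp to the int16 range
    index = (clipped + 32768) >> 7            # high 9 bits: 0..511
    fraction = clipped & 0x7F                 # low 7 bits: 0..127
    base, nxt = table[index], table[index + 1]
    # Result keeps 7 fractional bits, matching the 16.7 format.
    return (base << 7) + (nxt - base) * fraction


ramp = list(range(513))        # toy table: entry i holds the value i
mid = table_lookup(ramp, 64)   # halfway between entries 256 and 257
print(mid, mid >> 7)
```

Reading `table[index + 1]` at the largest index (511) is why the table needs 513 entries rather than 512. The final `mid >> 7` mirrors the right shift of 7 that a following RESCALE would apply to obtain an int16_t result.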

Operands: 

Operand | Description
input | 0D/1D/2D/3D/4D tensor of number values
table | 1D tensor of number values

Results: 

Result | Description
output | 0D/1D/2D/3D/4D tensor of number values

tosa.tanh (mlir::tosa::TanhOp) 

Computes elementwise hyperbolic tangent of input

Parameterized hyperbolic tangent. For quantized integer data types, the TABLE operator should be used instead with the following definition. The tanh_table has 513 entries each of 16-bit precision and covering the input range -8.0 to +8.0 in steps of 1/32.

Operands: 

Operand | Description
input | 1D/2D/3D/4D tensor of number values

Results: 

Result | Description
output | 1D/2D/3D/4D tensor of number values

tosa.tile (mlir::tosa::TileOp) 

Tile operator

Replicates input1 multiples times along each dimension.
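
The replication behaviour can be sketched on nested Python lists. The helper `tile` below is hypothetical and assumes one entry in `multiples` per dimension of the input.

```python
def tile(tensor, multiples):
    """Repeat the tensor multiples[d] times along each dimension d."""
    if not multiples:  # every dimension has been consumed; tensor is an element
        return tensor
    # Repeat the whole run of sub-tensors, tiling each one recursively.
    return [
        tile(sub, multiples[1:])
        for _ in range(multiples[0])
        for sub in tensor
    ]


tiled = tile([[1, 2], [3, 4]], [2, 3])
for row in tiled:
    print(row)
```

For a 2x2 input and multiples of [2, 3], the result has shape 4x6: each row is repeated three times along its width, and the block of rows is repeated twice.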

Attributes: 

Attribute | MLIR Type | Description
multiples | ::mlir::ArrayAttr | 64-bit integer array attribute

Operands: 

Operand | Description
input1 | 1D/2D/3D/4D tensor of number values

Results: 

Result | Description
output | 1D/2D/3D/4D tensor of number values

tosa.transpose_conv2d (mlir::tosa::TransposeConv2DOp) 

Transpose 2D Convolution operator.

Performs a 2D transposed convolution over the given tensor input, using the weights tensor.

Attributes: 

Attribute | MLIR Type | Description
out_pad | ::mlir::ArrayAttr | 64-bit integer array attribute with exactly 2 elements
stride | ::mlir::ArrayAttr | 64-bit integer array attribute with exactly 2 elements
dilation | ::mlir::ArrayAttr | 64-bit integer array attribute with exactly 2 elements
out_shape | ::mlir::ArrayAttr | 64-bit integer array attribute with at least 4 elements
quantization_info | mlir::tosa::ConvOpQuantizationAttr | Attribute for Conv type op quantization information.

Operands: 

Operand | Description
input | 4D tensor of number values
filter | 4D tensor of number values
bias | 1D tensor of number values

Results: 

Result | Description
output | 4D tensor of number values

tosa.transpose (mlir::tosa::TransposeOp) 

Transpose operator

Permutes the dimensions of input1 based on perms.
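
A small sketch of a dimension permutation on a flat, row-major buffer is shown below. The helper `transpose` is hypothetical, and it assumes the convention that output dimension i corresponds to input dimension perms[i]; the text above does not spell out the direction of the mapping.

```python
from itertools import product


def transpose(data, shape, perms):
    """Permute a row-major flat tensor so output dim i is input dim perms[i]."""
    out_shape = [shape[p] for p in perms]
    # Row-major strides of the input tensor.
    strides = [1] * len(shape)
    for d in range(len(shape) - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]
    out = []
    for out_index in product(*(range(n) for n in out_shape)):
        # Output coordinate i addresses input dimension perms[i].
        offset = sum(out_index[i] * strides[perms[i]] for i in range(len(perms)))
        out.append(data[offset])
    return out, out_shape


data = [1, 2, 3, 4, 5, 6]  # shape [2, 3]: [[1, 2, 3], [4, 5, 6]]
out, out_shape = transpose(data, [2, 3], [1, 0])
print(out_shape, out)  # [3, 2] [1, 4, 2, 5, 3, 6]
```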

Operands: 

Operand | Description
input1 | 1D/2D/3D/4D/5D/6D tensor of number values
perms | tensor of 32-bit signless integer or 64-bit signless integer values

Results: 

Result | Description
output | 1D/2D/3D/4D/5D/6D tensor of number values

tosa.while_loop (mlir::tosa::WhileOp) 

output = input; While (Cond(output)) {output = Body(output)}

Generates and evaluates a Boolean condition and either executes a loop body or exits to another control point. This action is performed repeatedly, updating and re-evaluating the Boolean condition on every iteration. This implements the semantics of a foreach or while iterative loop structure.
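
The loop semantics in the summary line can be modelled in a few lines. The function `while_loop` and the example state are hypothetical; `cond` and `body` stand in for the condition and body of the operation.

```python
def while_loop(inputs, cond, body):
    """output = inputs; while cond(output): output = body(output)."""
    output = inputs
    while cond(output):
        output = body(output)
    return output


# Example state: [counter, running sum of the counter values seen so far].
result = while_loop(
    [0, 0],
    cond=lambda state: state[0] < 5,
    body=lambda state: [state[0] + 1, state[1] + state[0]],
)
print(result)  # [5, 10]
```

The condition is evaluated before each iteration, so the body never runs if the condition is false for the initial inputs.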

Operands: 

Operand | Description
inputs | tensor of number values

Results: 

Result | Description
output | tensor of number values

tosa.yield (mlir::tosa::YieldOp) 

Yield operator

Return operation within the conditional and body of structured control flow. The operation takes variadic operands but produces no results of its own.

Operands: 

Operand | Description
inputs | tensor of number values