MLIR

Multi-Level IR Compiler Framework

'quant' Dialect

Operations 

source

quant.dcast (quant::DequantizeCastOp) 

Convert back from a quantized to quantizable (expressed) type operation

A DequantizeCast op dcast represents the inverse of a qcast, converting back from a quantized to quantizable (expressed) type.

Like qcasts, a dcast is allowed to have both its operand and result as non quantized types. This facilitates transformations and marks edges where the computation must be carried out in the expressed type.

Especially early in transformation, it is common to have dcasts on all operands to ops that must operate with the expressed type (typically math ops prior to lowering to target-specific, quantized kernels).

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: 

OperandDescription
argprimitive/tensor/vector of real valued primitive (float or quantized type)

Results: 

ResultDescription
resprimitive/tensor/vector of real valued primitive (float or quantized type)

quant.qcast (quant::QuantizeCastOp) 

Convert a quantizable type to a quantized type

A QuantizeCast qcast represents a potential type shift from a quantizable type to a quantized type.

At runtime, a qcast will apply the transformation expressed by its operand and result type. For flexibility during transformation, it is also possible to have a qcast that performs no transformation (both its operand and result type are quantizable).

A qcast will typically originate from either: a) An expressed or implied constraint in the source dialect which signals that a certain level of quantization is possible or required. b) An inference made by a quantization algorithm indicating that a quantized representation may be acceptable.

Especially early in transformation, it is common to have pairs of qcast and dcast at points where a transition to a quantized type is required. In addition, it is also common to have an identity qcast (where the operand and result type are not quantized) at all points where it is legal to use a quantized representation (but is not known to be acceptable).

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: 

OperandDescription
argprimitive/tensor/vector of real valued primitive (float or quantized type)

Results: 

ResultDescription
resprimitive/tensor/vector of real valued primitive (float or quantized type)

quant.scast (quant::StorageCastOp) 

Cast from or to a type based on the storage type and the corresponding quantized type

A StorageCast scast represents a cast from or to a type based on the storage type and a type based on a corresponding quantized type.

This op exists to ensure type coherency for between parts of the computation which are operating directly on an underlying storage type and those which operate on quantized values.

Examples from storage to quantized type:

i8 -> !quant<"uniform[i8:f32]{1.0}">
tensor<4xi8> -> tensor<4x!quant<"uniform[i8:f32]{1.0}">>
vector<4xi8> -> vector<4x!quant<"uniform[i8:f32]{1.0}">>

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands: 

OperandDescription
argreal valued or storage primitive or container type

Results: 

ResultDescription
resreal valued or storage primitive or container type

Type constraints 

UniformQuantizedType