MLIR

Multi-Level IR Compiler Framework

'quant' Dialect

Type definition 

UniformQuantizedType 

Operation definition 

quant.const_fake_quant (quant::ConstFakeQuant) 

Simulates the effect of uniform quantization with const range.

Given constant `min`, `max`, `num_bits`, and `narrow_range` attributes, applies the same uniform quantization simulation as the TensorFlow `fake_quant_with_min_max_args` op. See the `fakeQuantAttrsToType()` utility method and the `quant-convert-simulated-quantization` pass for further details.
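The simulation described above can be sketched in plain Python. This is an illustrative sketch only, assuming the range-nudging scheme usually described for TensorFlow's fake-quant ops; the function name `fake_quant` and its defaults are hypothetical, and the `is_signed` attribute is ignored because it is assumed not to change the simulated float values.

```python
import math

def fake_quant(x, min_val, max_val, num_bits=8, narrow_range=False):
    """Quantize-then-dequantize a float against a fixed [min_val, max_val] range."""
    quant_min = 1 if narrow_range else 0
    quant_max = (1 << num_bits) - 1
    scale = (max_val - min_val) / (quant_max - quant_min)

    # Nudge the range so that 0.0 maps exactly onto an integer level.
    zero_point_from_min = quant_min - min_val / scale
    if zero_point_from_min < quant_min:
        nudged_zero_point = quant_min
    elif zero_point_from_min > quant_max:
        nudged_zero_point = quant_max
    else:
        nudged_zero_point = math.floor(zero_point_from_min + 0.5)
    nudged_min = (quant_min - nudged_zero_point) * scale
    nudged_max = (quant_max - nudged_zero_point) * scale

    # Clamp, snap to the nearest level, and map back to float.
    clamped = min(max(x, nudged_min), nudged_max)
    level = math.floor((clamped - nudged_min) / scale + 0.5)
    return level * scale + nudged_min
```

With `min=0.0`, `max=255.0` and 8 bits, the step size is 1.0, so inputs are rounded to the nearest whole number and clamped to the range.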

Attributes: 

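| Attribute | MLIR Type | Description |
| :-------: | :-------: | ----------- |
| `min` | FloatAttr | 32-bit float attribute |
| `max` | FloatAttr | 32-bit float attribute |
| `num_bits` | IntegerAttr | 64-bit signless integer attribute |
| `narrow_range` | BoolAttr | bool attribute |
| `is_signed` | BoolAttr | bool attribute |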

Operands: 

| Operand | Description |
| :-----: | ----------- |
| `inputs` | tensor of 32-bit float values |

Results: 

| Result | Description |
| :----: | ----------- |
| `outputs` | tensor of 32-bit float values |

quant.const_fake_quant_per_axis (quant::ConstFakeQuantPerAxis) 

Simulates the effect of per-axis uniform quantization with const range.

Given constant `min`, `max`, `num_bits`, and `narrow_range` attributes, applies the same per-axis uniform quantization simulation as the TensorFlow `fake_quant_with_min_max_vars_per_channel` op. See the `fakeQuantAttrsToType()` utility method and the `quant-convert-simulated-quantization` pass for further details.
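The per-axis variant differs only in that each slice along `axis` gets its own range. A minimal sketch follows, restricted to 2-D nested lists for brevity; the helper names are hypothetical and the nudging scheme is the one commonly described for TensorFlow's fake-quant ops, assumed here rather than taken from this page.

```python
import math

def _fake_quant(x, lo, hi, num_bits, narrow_range):
    """Scalar quantize-then-dequantize against the range [lo, hi]."""
    qmin = 1 if narrow_range else 0
    qmax = (1 << num_bits) - 1
    scale = (hi - lo) / (qmax - qmin)
    # Round the zero point to an integer level and keep it inside [qmin, qmax].
    zero_point = min(max(math.floor(qmin - lo / scale + 0.5), qmin), qmax)
    nudged_lo = (qmin - zero_point) * scale
    nudged_hi = (qmax - zero_point) * scale
    clamped = min(max(x, nudged_lo), nudged_hi)
    return math.floor((clamped - nudged_lo) / scale + 0.5) * scale + nudged_lo

def fake_quant_per_axis(rows, mins, maxs, axis, num_bits=8, narrow_range=False):
    """Apply fake quantization to a 2-D list, one (min, max) pair per slice of `axis`."""
    out = []
    for i, row in enumerate(rows):
        out_row = []
        for j, value in enumerate(row):
            k = i if axis == 0 else j  # index along the quantized axis
            out_row.append(_fake_quant(value, mins[k], maxs[k], num_bits, narrow_range))
        out.append(out_row)
    return out
```

With `axis=1`, column `k` is quantized against `mins[k]` and `maxs[k]`; with `axis=0`, the same lists are indexed by row instead.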

Attributes: 

| Attribute | MLIR Type | Description |
| :-------: | :-------: | ----------- |
| `min` | ArrayAttr | 32-bit float array attribute |
| `max` | ArrayAttr | 32-bit float array attribute |
| `axis` | IntegerAttr | 64-bit signless integer attribute |
| `num_bits` | IntegerAttr | 64-bit signless integer attribute |
| `narrow_range` | BoolAttr | bool attribute |
| `is_signed` | BoolAttr | bool attribute |

Operands: 

| Operand | Description |
| :-----: | ----------- |
| `inputs` | tensor of 32-bit float values |

Results: 

| Result | Description |
| :----: | ----------- |
| `outputs` | tensor of 32-bit float values |

quant.coupled_ref (quant::CoupledRefOp) 

Indicates that one point of the computation is coupled to another.

Ordinarily, relationships between ops for the purposes of determining compatible quantized types are explicit, based on the use-def chain. However, in some situations, a use may be separated from its def by arbitrary external connections. In such a case, during analysis, all `coupled_ref` nodes in a module which share a `coupledKey` will be considered to be directly connected, as if via an identity op, for the purpose of type inference.
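The grouping rule described above can be illustrated with a small sketch. It does not reflect how any MLIR analysis is actually implemented: ops are modeled as hypothetical `(op_id, coupled_key)` tuples, types as opaque strings, and both function names are invented for illustration.

```python
from collections import defaultdict

def group_coupled_refs(ops):
    """Group hypothetical (op_id, coupled_key) records by their key."""
    groups = defaultdict(list)
    for op_id, coupled_key in ops:
        groups[coupled_key].append(op_id)
    return dict(groups)

def propagate_types(groups, known_types):
    """Give every member of a group the type already known for one of its members."""
    resolved = dict(known_types)
    for members in groups.values():
        found = [known_types[m] for m in members if m in known_types]
        if found:
            for m in members:
                # Members sharing a key behave as if joined by an identity op.
                resolved.setdefault(m, found[0])
    return resolved
```

For example, if nodes `a` and `c` share a key and only `a` has a known type, `c` receives the same type, while a node under a different key stays unresolved.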

Attributes: 

| Attribute | MLIR Type | Description |
| :-------: | :-------: | ----------- |
| `coupledKey` | StringAttr | string attribute |

Operands: 

| Operand | Description |
| :-----: | ----------- |
| `arg` | primitive/tensor/vector of real valued primitive (float or quantized type) |

Results: 

| Result | Description |
| :----: | ----------- |
| «unnamed» | primitive/tensor/vector of real valued primitive (float or quantized type) |

quant.dcast (quant::DequantizeCastOp) 

Operands: 

| Operand | Description |
| :-----: | ----------- |
| `arg` | primitive/tensor/vector of real valued primitive (float or quantized type) |

Results: 

| Result | Description |
| :----: | ----------- |
| «unnamed» | primitive/tensor/vector of real valued primitive (float or quantized type) |

quant.qcast (quant::QuantizeCastOp) 

Operands: 

| Operand | Description |
| :-----: | ----------- |
| `arg` | primitive/tensor/vector of real valued primitive (float or quantized type) |

Results: 

| Result | Description |
| :----: | ----------- |
| «unnamed» | primitive/tensor/vector of real valued primitive (float or quantized type) |

quant.region (quant::QuantizeRegionOp) 

The `region` operation wraps high-precision ops as a logical low-precision
quantized kernel.

Attributes: 

| Attribute | MLIR Type | Description |
| :-------: | :-------: | ----------- |
| `input_specs` | ArrayAttr | type array attribute |
| `output_specs` | ArrayAttr | type array attribute |
| `logical_kernel` | StringAttr | string attribute |

Operands: 

| Operand | Description |
| :-----: | ----------- |
| `inputs` | any type |

Results: 

| Result | Description |
| :----: | ----------- |
| `outputs` | any type |

quant.return (quant::ReturnOp) 

The `return` operation terminates a quantize region and returns values.

Operands: 

| Operand | Description |
| :-----: | ----------- |
| `results` | tensor of any type values |

quant.stats (quant::StatisticsOp) 

Identity op which associates statistics with the value.

Associates statistics about the runtime ranges of values observed for evaluations of this node.

Statistics about the entire type are reported in the `layerStats` attribute, and those for each axis in the (optional) `axisStats` attribute. The interpretation of each is determined by the last dimension of its shape. Currently, only dim=2 is supported, which is interpreted as [min, max].

`layerStats` must be a rank 1 tensor: [2]. `axisStats` must be a rank 2 tensor: [N, 2], where N is the slice size split by the axis dimension. For example:

<?x?x3x2>, axis=3 => N=2
<?x?x3x2>, axis=2 => N=6
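Both examples are consistent with reading N as the product of the dimensions from `axis` through the last dimension. The sketch below encodes that reading; it is an interpretation inferred from the two examples, not a statement of what the op verifier checks, and the function name is hypothetical. Dynamic dimensions (`?`) are modeled as `None`.

```python
from math import prod

def axis_stats_rows(shape, axis):
    """N for axisStats, read as the product of dimensions from `axis` onward."""
    dims = shape[axis:]
    if any(d is None for d in dims):
        # A dynamic dimension in the slice leaves N undetermined in this sketch.
        raise ValueError("dynamic dimension at or after the axis")
    return prod(dims)
```

Under this reading, `axisStats` for a given `axis` would be expected to have shape `[axis_stats_rows(shape, axis), 2]`.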

Attributes: 

| Attribute | MLIR Type | Description |
| :-------: | :-------: | ----------- |
| `layerStats` | ElementsAttr | constant vector/tensor attribute |
| `axisStats` | ElementsAttr | constant vector/tensor attribute |
| `axis` | IntegerAttr | 64-bit signless integer attribute |

Operands: 

| Operand | Description |
| :-----: | ----------- |
| `arg` | primitive/tensor/vector of real valued primitive (float or quantized type) |

Results: 

| Result | Description |
| :----: | ----------- |
| «unnamed» | primitive/tensor/vector of real valued primitive (float or quantized type) |

quant.stats_ref (quant::StatisticsRefOp) 

Indicates that statistics are resolved by reference.

This op acts as an identity that, when encountered at runtime, should result in statistics being collected about the value of its operand/result. Such statistics will be stored with the provided key, allowing this node to later be converted to a `stats` op if statistics with that key have been encountered.
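The collect-then-resolve flow can be sketched as a keyed store of running ranges. The `StatsCollector` class is hypothetical and is not part of MLIR; it only illustrates how a key could accumulate a [min, max] pair that later fills a `layerStats` attribute.

```python
class StatsCollector:
    """Keeps a running (min, max) per stats key."""

    def __init__(self):
        self._ranges = {}

    def observe(self, stats_key, values):
        """Fold a batch of observed values into the range stored for the key."""
        lo, hi = min(values), max(values)
        if stats_key in self._ranges:
            old_lo, old_hi = self._ranges[stats_key]
            lo, hi = min(lo, old_lo), max(hi, old_hi)
        self._ranges[stats_key] = (lo, hi)

    def resolve(self, stats_key):
        """Return [min, max] for the key, or None if it was never observed."""
        rng = self._ranges.get(stats_key)
        return list(rng) if rng is not None else None
```

A node whose key resolves to `None` would simply stay a `stats_ref` in this model, since no statistics were encountered for it.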

Attributes: 

| Attribute | MLIR Type | Description |
| :-------: | :-------: | ----------- |
| `statsKey` | StringAttr | string attribute |

Operands: 

| Operand | Description |
| :-----: | ----------- |
| `arg` | primitive/tensor/vector of real valued primitive (float or quantized type) |

Results: 

| Result | Description |
| :----: | ----------- |
| «unnamed» | primitive/tensor/vector of real valued primitive (float or quantized type) |

quant.scast (quant::StorageCastOp) 

Operands: 

| Operand | Description |
| :-----: | ----------- |
| `arg` | quant_RealOrStorageValueType |

Results: 

| Result | Description |
| :----: | ----------- |
| «unnamed» | quant_RealOrStorageValueType |