MLIR

Multi-Level IR Compiler Framework

-lower-quant-ops 

Lower quant.dcast and quant.qcast ops

Lower quantization (quant.qcast) and dequantization (quant.dcast) ops into other core dialects.

The lowering generates storage type casts, in the form of quant.scast ops, that bridge the original quantized types of operands and results and the corresponding storage types used in the generated arithmetic.
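As a rough illustration of the arithmetic such a lowering has to express, the following Python sketch models uniform (affine) quantization on scalar values. It is not the pass's actual op sequence: the function names, the default signed 8-bit storage range, and the rounding mode (Python's built-in round, which rounds halves to even) are assumptions made for this example.

```python
def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Model of quant.qcast: map a real value to a stored integer.

    qmin/qmax default to a signed 8-bit storage range (an assumption).
    """
    # Scale into the integer domain, then shift by the zero point.
    q = round(x / scale) + zero_point
    # Saturate to the bounds of the storage type.
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Model of quant.dcast: map a stored integer back to a real value."""
    return (q - zero_point) * scale


stored = quantize(1.0, scale=0.5, zero_point=10)
restored = dequantize(stored, scale=0.5, zero_point=10)
```

In the lowered IR, the integers handled here correspond to values of the storage type, and quant.scast marks the points where a quantized-typed value is reinterpreted as its storage integer or the reverse.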

-strip-func-quant-types 

Strip quantized types from function headers

Identify function arguments that use a quantized type and replace each with a new argument of the corresponding storage (signless integer) type. For each converted argument, a quant.scast op is inserted at the start of the function’s entry block to convert the new integer argument back into the original quantized value.
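The rewrite described above can be pictured with a small Python toy model that works on type strings rather than real IR. Everything in it is hypothetical: the helper name strip_quant_args, the regular expression, and the printed form of the cast are illustrative approximations, and only scalar !quant.uniform types are handled (no tensors or other quantized types).

```python
import re

# Matches the storage type that opens a quant.uniform type, e.g. "i8" or "u8".
_STORAGE = re.compile(r"!quant\.uniform<([iu])(\d+)")


def strip_quant_args(arg_types):
    """Toy model: return (new_arg_types, scast_lines) for a list of type strings."""
    new_types, casts = [], []
    for index, ty in enumerate(arg_types):
        match = _STORAGE.match(ty)
        if match is None:
            # Not a quantized type: the argument is left unchanged.
            new_types.append(ty)
            continue
        # The storage type is signless, so "u8" and "i8" both become "i8".
        storage = f"i{match.group(2)}"
        new_types.append(storage)
        # One cast per converted argument, placed at the head of the entry block.
        casts.append(f"%{len(casts)} = quant.scast %arg{index} : {storage} to {ty}")
    return new_types, casts


new_types, casts = strip_quant_args(["!quant.uniform<i8:f32, 0.5:10>", "f32"])
```

The function body can keep using the original quantized value, now produced by the cast, so only the function header changes.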