MLIR
15.0.0git
#include "mlir/Dialect/Tosa/Utils/QuantUtils.h"
Macros
#define GET_UQTYPE(input_type) ((input_type).getElementType().dyn_cast<quant::UniformQuantizedType>())
#define GET_QTYPE(input_type) ((input_type).getElementType().dyn_cast<quant::QuantizedType>())
Functions
static void computeMultiplierAndShiftTosaScale16(double scale, int32_t &multiplier, int32_t &shift)
From a scale value, generates multiplier and shift values for 16-bit scaling, where the mantissa is in [-1.0, -0.5] or [0.5, 1.0] such that multiplier = mantissa*2^shift.
static void computeMultiplierAndShiftTosaScale32(double scale, int32_t &multiplier, int32_t &shift)
From a scale value, generates multiplier and shift values for 32-bit scaling, where the mantissa is in [-1.0, -0.5] or [0.5, 1.0] such that multiplier = mantissa*2^shift.
#define GET_QTYPE(input_type) ((input_type).getElementType().dyn_cast<quant::QuantizedType>())
Definition at line 108 of file QuantUtils.cpp.
Referenced by mlir::tosa::buildConvOpResultTypeInfo().
#define GET_UQTYPE(input_type) ((input_type).getElementType().dyn_cast<quant::UniformQuantizedType>())
Definition at line 106 of file QuantUtils.cpp.
Referenced by mlir::tosa::buildConvOpQuantizationAttr(), mlir::tosa::buildMatMulOpQuantizationAttr(), mlir::tosa::buildPadOpQuantizationAttr(), and mlir::tosa::buildUnaryOpQuantizationAttr().
static void computeMultiplierAndShiftTosaScale16(double scale, int32_t &multiplier, int32_t &shift)
From a scale value, generates multiplier and shift values where mantissa is in [-1.0,-0.5] or [0.5, 1.0] such that multiplier = mantissa*2^shift for 16-bit scaling.
Definition at line 22 of file QuantUtils.cpp.
References frexp() and max().
Referenced by mlir::tosa::computeMultiplierAndShift().
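The description above can be illustrated with a minimal standalone sketch. It is not the MLIR implementation: `MultiplierShift16` and `decomposeScale16` are hypothetical names, the sketch assumes a positive scale, and it assumes the multiplier carries 15 fractional bits. The only detail taken from this page is that the decomposition relies on `frexp()`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical result type for this sketch (not part of MLIR).
struct MultiplierShift16 {
  int32_t multiplier; // mantissa quantized to 15 fractional bits
  int32_t shift;      // right-shift amount that undoes the scaling
};

// Hypothetical helper: splits a positive scale so that
//   scale ~= multiplier * 2^(-shift).
inline MultiplierShift16 decomposeScale16(double scale) {
  assert(scale > 0.0 && "this sketch only handles positive scales");

  // std::frexp yields a mantissa in [0.5, 1) for positive finite input,
  // with scale == mantissa * 2^exponent.
  int exponent = 0;
  const double mantissa = std::frexp(scale, &exponent);

  // Quantize the mantissa to 15 fractional bits.
  int64_t q = std::llround(mantissa * static_cast<double>(int64_t(1) << 15));

  // Rounding may lift the quantized mantissa to exactly 1.0 (1 << 15);
  // halve it and bump the exponent to renormalize.
  if (q == (int64_t(1) << 15)) {
    q /= 2;
    ++exponent;
  }

  // scale ~= q * 2^(exponent - 15), i.e. a right shift by (15 - exponent).
  return {static_cast<int32_t>(q), 15 - exponent};
}
```

Under these assumptions, `decomposeScale16(0.5)` gives a multiplier of 16384 and a shift of 15, since 16384 * 2^-15 = 0.5.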
static void computeMultiplierAndShiftTosaScale32(double scale, int32_t &multiplier, int32_t &shift)
From a scale value, generates multiplier and shift values where mantissa is in [-1.0,-0.5] or [0.5, 1.0] such that multiplier = mantissa*2^shift for 32-bit scaling.
Definition at line 58 of file QuantUtils.cpp.
References frexp() and max().
Referenced by mlir::tosa::computeMultiplierAndShift().
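To show how a multiplier and shift pair of this kind is typically consumed, the following sketch decomposes a scale with 31 fractional bits and then applies it to an integer using only integer arithmetic. All names (`MultiplierShift32`, `decomposeScale32`, `applyScale`) are hypothetical, the rounding mode is an assumption, and the code is not taken from QuantUtils.cpp; it only assumes the `frexp()`-based approach that this page references.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical result type for this sketch (not part of MLIR).
struct MultiplierShift32 {
  int32_t multiplier; // mantissa quantized to 31 fractional bits
  int32_t shift;      // right-shift amount that undoes the scaling
};

// Hypothetical helper: splits a positive scale so that
//   scale ~= multiplier * 2^(-shift).
inline MultiplierShift32 decomposeScale32(double scale) {
  assert(scale > 0.0 && "this sketch only handles positive scales");

  // scale == mantissa * 2^exponent, with mantissa in [0.5, 1).
  int exponent = 0;
  const double mantissa = std::frexp(scale, &exponent);

  // Quantize the mantissa to 31 fractional bits.
  int64_t q = std::llround(mantissa * static_cast<double>(int64_t(1) << 31));

  // Renormalize if rounding reached exactly 1.0, so q fits in int32_t.
  if (q == (int64_t(1) << 31)) {
    q /= 2;
    ++exponent;
  }
  return {static_cast<int32_t>(q), 31 - exponent};
}

// Hypothetical helper: computes round(value * scale) in integer arithmetic
// as (value * multiplier + 2^(shift-1)) >> shift.
// Assumes a non-negative value and 1 <= shift <= 62.
inline int64_t applyScale(int32_t value, MultiplierShift32 ms) {
  assert(value >= 0 && "negative values are out of scope for this sketch");
  assert(ms.shift >= 1 && ms.shift <= 62);
  const int64_t product = static_cast<int64_t>(value) * ms.multiplier;
  const int64_t rounding = int64_t(1) << (ms.shift - 1);
  return (product + rounding) >> ms.shift;
}
```

With these definitions, `applyScale(100, decomposeScale32(0.75))` evaluates to 75 without any floating-point work at the point of use, which is the motivation for precomputing the multiplier and shift.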