Represents sub-channel (also known as blockwise) quantization. More...

#include "mlir/Dialect/Quant/IR/QuantTypes.h"

Public Types
using	Base = StorageUserBase< ConcreteT, BaseT, StorageT, UniquerT, Traits... >
	Utility declarations for the concrete attribute class. More...

Public Types inherited from mlir::detail::StorageUserBase< ConcreteT, BaseT, StorageT, UniquerT, Traits >
using	Base = StorageUserBase< ConcreteT, BaseT, StorageT, UniquerT, Traits... >
	Utility declarations for the concrete attribute class. More...

using	ImplType = StorageT

using	HasTraitFn = bool(*)(TypeID)

Public Member Functions
DenseElementsAttr	getScales () const
	Gets the quantization scales. More...

DenseElementsAttr	getZeroPoints () const
	Gets the quantization zero-points. More...

ArrayRef< int32_t >	getQuantizedDimensions () const
	Gets the quantized dimensions. More...

ArrayRef< int64_t >	getBlockSizes () const
	Gets the block sizes for the quantized dimensions. More...

const SmallVector< std::pair< int32_t, int64_t > >	getBlockSizeInfo () const
	Gets the block size information. More...

Public Member Functions inherited from mlir::detail::StorageUserBase< ConcreteT, BaseT, StorageT, UniquerT, Traits >
ImplType *	getImpl () const
	Utility for easy access to the storage instance. More...

Static Public Member Functions
static UniformQuantizedSubChannelType	get (unsigned flags, Type storageType, Type expressedType, DenseElementsAttr scales, DenseElementsAttr zeroPoints, ArrayRef< int32_t > quantizedDimensions, ArrayRef< int64_t > blockSizes, int64_t storageTypeMin, int64_t storageTypeMax)
	Gets an instance of the type with all parameters specified but not checked. More...

static UniformQuantizedSubChannelType	getChecked (function_ref< InFlightDiagnostic()> emitError, unsigned flags, Type storageType, Type expressedType, DenseElementsAttr scales, DenseElementsAttr zeroPoints, ArrayRef< int32_t > quantizedDimensions, ArrayRef< int64_t > blockSizes, int64_t storageTypeMin, int64_t storageTypeMax)
	Gets an instance of the type with all specified parameters checked. More...

static LogicalResult	verifyInvariants (function_ref< InFlightDiagnostic()> emitError, unsigned flags, Type storageType, Type expressedType, DenseElementsAttr scales, DenseElementsAttr zeroPoints, ArrayRef< int32_t > quantizedDimensions, ArrayRef< int64_t > blockSizes, int64_t storageTypeMin, int64_t storageTypeMax)
	Verifies construction invariants and issues errors/warnings. More...

template<typename... Args>
static ConcreteT	getChecked (const Location &loc, Args &&...args)
	Get or create a new ConcreteT instance within the ctx, defined at the given, potentially unknown, location. More...

template<typename... Args>
static ConcreteT	getChecked (function_ref< InFlightDiagnostic()> emitErrorFn, MLIRContext *ctx, Args... args)
	Get or create a new ConcreteT instance within the ctx. More...

Static Public Member Functions inherited from mlir::detail::StorageUserBase< ConcreteT, BaseT, StorageT, UniquerT, Traits >
static TypeID	getTypeID ()
	Return a unique identifier for the concrete type. More...

template<typename T >
static bool	classof (T val)
	Provide an implementation of 'classof' that compares the type id of the provided value with that of the concrete type. More...

static detail::InterfaceMap	getInterfaceMap ()
	Returns an interface map for the interfaces registered to this storage user. More...

static HasTraitFn	getHasTraitFn ()
	Returns the function that returns true if the given Trait ID matches the IDs of any of the traits defined by the storage user. More...

static auto	getWalkImmediateSubElementsFn ()
	Returns a function that walks immediate sub elements of a given instance of the storage user. More...

static auto	getReplaceImmediateSubElementsFn ()
	Returns a function that replaces immediate sub elements of a given instance of the storage user. More...

template<typename... IfaceModels>
static void	attachInterface (MLIRContext &context)
	Attach the given models as implementations of the corresponding interfaces for the concrete storage user class. More...

template<typename... Args>
static ConcreteT	get (MLIRContext *ctx, Args &&...args)
	Get or create a new ConcreteT instance within the ctx. More...

template<typename... Args>
static ConcreteT	getChecked (const Location &loc, Args &&...args)
	Get or create a new ConcreteT instance within the ctx, defined at the given, potentially unknown, location. More...

template<typename... Args>
static ConcreteT	getChecked (function_ref< InFlightDiagnostic()> emitErrorFn, MLIRContext *ctx, Args... args)
	Get or create a new ConcreteT instance within the ctx. More...

static ConcreteT	getFromOpaquePointer (const void *ptr)
	Get an instance of the concrete type from a void pointer. More...

Static Public Attributes
static constexpr StringLiteral	name = "quant.uniform_sub_channel"

Additional Inherited Members
Protected Member Functions inherited from mlir::detail::StorageUserBase< ConcreteT, BaseT, StorageT, UniquerT, Traits >
template<typename... Args>
LogicalResult	mutate (Args &&...args)
	Mutate the current storage instance. More...

Static Protected Member Functions inherited from mlir::detail::StorageUserBase< ConcreteT, BaseT, StorageT, UniquerT, Traits >
template<typename... Args>
static LogicalResult	verifyInvariants (Args... args)
	Default implementation that just returns success. More...

Detailed Description

Represents sub-channel (also known as blockwise) quantization.

Syntax synopsis:

  UniformQuantizedSubChannelType ::= '!quant.uniform' '<'
      storageType ('<' storageMin ':' storageMax '>')? ':'
      expressedType ':' BlockSizeInfo ',' ScaleZeroTensor '>'
  BlockSizeInfo: '{' '}' | '{' AxisBlock (',' AxisBlock)* '}'
  AxisBlock ::= AxisSpec ':' BlockSizeSpec
  ScaleZeroTensor ::= ScaleZeroDenseExp | ScaleZeroList
  ScaleZeroDenseExp ::= '{' ScaleZeroTensor (',' ScaleZeroTensor)* '}'
  ScaleZeroList ::= ScaleZero (',' ScaleZero)*
  ScaleZero ::= Scale (':' ZeroPoint)?

  StorageType: 'i'|'u' NumBits
  ExpressedType: 'f16', 'f32', 'bf16', 'f64'
  AxisSpec: An integer value
  BlockSizeSpec: An integer value
  Scale: An attribute (usually floating-point value)
  ZeroPoint: An attribute (usually integer value)

Definition at line 405 of file QuantTypes.h.

Member Typedef Documentation

◆ Base

using mlir::detail::StorageUserBase< ConcreteT, BaseT, StorageT, UniquerT, Traits >::Base = StorageUserBase<ConcreteT, BaseT, StorageT, UniquerT, Traits...>

Utility declarations for the concrete attribute class.

Definition at line 100 of file StorageUniquerSupport.h.

Member Function Documentation

◆ get()

UniformQuantizedSubChannelType UniformQuantizedSubChannelType::get	(	unsigned	flags,
		Type	storageType,
		Type	expressedType,
		DenseElementsAttr	scales,
		DenseElementsAttr	zeroPoints,
		ArrayRef< int32_t >	quantizedDimensions,
		ArrayRef< int64_t >	blockSizes,
		int64_t	storageTypeMin,
		int64_t	storageTypeMax
	)

static

Gets an instance of the type with all parameters specified but not checked.

Definition at line 413 of file QuantTypes.cpp.

References mlir::get(), and mlir::Type::getContext().

Referenced by mlirUniformQuantizedSubChannelTypeGet().

◆ getBlockSizeInfo()

const SmallVector< std::pair< int32_t, int64_t > > UniformQuantizedSubChannelType::getBlockSizeInfo ( ) const

Gets the block size information.

This returns a list of pairs, where each pair represents a quantized dimension and its corresponding block size.

For example, for the type: tensor<8x4x!quant.uniform<i8:f32:{1:2, 0:8}, {{2.0, 3.0}}>>

This method returns: [(1, 2), (0, 8)]

This list indicates that axis 1 has a block size of 2, and axis 0 has a block size of 8.
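The pairing described above can be sketched in standalone C++. This is an illustration only, not the MLIR implementation: `zipBlockSizeInfo` is a hypothetical helper, and `std::vector` stands in for `ArrayRef` and `SmallVector`. It assumes the two input lists are parallel and that the pairs keep the order in which the axes were specified.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper (not part of MLIR): pairs each quantized dimension with
// its block size, preserving the order in which they were specified.
std::vector<std::pair<int32_t, int64_t>>
zipBlockSizeInfo(const std::vector<int32_t> &quantizedDimensions,
                 const std::vector<int64_t> &blockSizes) {
  // The two lists are parallel, so they must have the same length.
  assert(quantizedDimensions.size() == blockSizes.size());
  std::vector<std::pair<int32_t, int64_t>> result;
  result.reserve(quantizedDimensions.size());
  for (std::size_t i = 0; i < quantizedDimensions.size(); ++i)
    result.emplace_back(quantizedDimensions[i], blockSizes[i]);
  return result;
}
```

Called with the dimensions [1, 0] and block sizes [2, 8] from the example type, the helper yields the pairs (1, 2) and (0, 8).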

Definition at line 518 of file QuantTypes.cpp.

Referenced by printUniformQuantizedSubChannelType().

◆ getBlockSizes()

ArrayRef< int64_t > UniformQuantizedSubChannelType::getBlockSizes ( ) const

Gets the block sizes for the quantized dimensions.

The i-th element in the returned list corresponds to the block size for the i-th dimension in the list returned by getQuantizedDimensions().

See getQuantizedDimensions() for more details and examples.

Definition at line 513 of file QuantTypes.cpp.

◆ getChecked() [1/3]

template<typename... Args>

static ConcreteT mlir::detail::StorageUserBase< ConcreteT, BaseT, StorageT, UniquerT, Traits >::getChecked ( const Location & loc, Args &&... args )

inline static

Get or create a new ConcreteT instance within the ctx, defined at the given, potentially unknown, location.

If the arguments provided are invalid, errors are emitted using the provided location and a null object is returned.

Definition at line 189 of file StorageUniquerSupport.h.

◆ getChecked() [2/3]

UniformQuantizedSubChannelType UniformQuantizedSubChannelType::getChecked	(	function_ref< InFlightDiagnostic()>	emitError,
		unsigned	flags,
		Type	storageType,
		Type	expressedType,
		DenseElementsAttr	scales,
		DenseElementsAttr	zeroPoints,
		ArrayRef< int32_t >	quantizedDimensions,
		ArrayRef< int64_t >	blockSizes,
		int64_t	storageTypeMin,
		int64_t	storageTypeMax
	)

static

Gets an instance of the type with all specified parameters checked.

Returns a null type (convertible to nullptr) on failure.

Definition at line 423 of file QuantTypes.cpp.

References mlir::emitError(), and mlir::Type::getContext().

◆ getChecked() [3/3]

template<typename... Args>

static ConcreteT mlir::detail::StorageUserBase< ConcreteT, BaseT, StorageT, UniquerT, Traits >::getChecked ( function_ref< InFlightDiagnostic()> emitErrorFn, MLIRContext * ctx, Args... args )

inline static

Get or create a new ConcreteT instance within the ctx.

If the arguments provided are invalid, errors are emitted using the provided emitError and a null object is returned.

Definition at line 198 of file StorageUniquerSupport.h.

◆ getQuantizedDimensions()

ArrayRef< int32_t > UniformQuantizedSubChannelType::getQuantizedDimensions ( ) const

Gets the quantized dimensions.

Each element in the returned list represents an axis of the quantized data tensor that has a specified block size. The order of elements corresponds to the order of block sizes returned by getBlockSizes().

That is, the data tensor is quantized along the i-th dimension in the returned list using the i-th block size from getBlockSizes().

Note that the type expression does not have to specify the block size for all axes in the data tensor. Any unspecified block size for an axis i defaults to the tensor dimension size of that axis.

For example, for a quantized type: tensor<8x4x2x!quant.uniform<i8:f32:{1:2, 0:8}, {{1.0, 2.0}, {3.0, 4.0}}>>

getQuantizedDimensions() returns [1, 0]. getBlockSizes() returns [2, 8].

This indicates that:

Axis 1 (second dimension) is quantized with a block size of 2.
Axis 0 (first dimension) is quantized with a block size of 8.
Since axis 2 is not specified, it implicitly has a block size equal to the size of the third dimension (which is 2 in this case).
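The defaulting rule for unspecified axes can be sketched as follows. This is a standalone illustration, not MLIR code: `effectiveBlockSizes` is a hypothetical helper that takes the data tensor shape together with the two lists returned by getQuantizedDimensions() and getBlockSizes(), and expands them into one block size per axis.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of MLIR): expands the sparse
// (quantizedDimensions, blockSizes) description into one block size per axis.
std::vector<int64_t>
effectiveBlockSizes(const std::vector<int64_t> &shape,
                    const std::vector<int32_t> &quantizedDimensions,
                    const std::vector<int64_t> &blockSizes) {
  assert(quantizedDimensions.size() == blockSizes.size());
  // An axis without an explicit block size defaults to its full extent.
  std::vector<int64_t> result(shape);
  for (std::size_t i = 0; i < quantizedDimensions.size(); ++i) {
    int32_t axis = quantizedDimensions[i];
    assert(axis >= 0 && static_cast<std::size_t>(axis) < shape.size());
    result[axis] = blockSizes[i];
  }
  return result;
}
```

For the shape [8, 4, 2] with quantized dimensions [1, 0] and block sizes [2, 8], the per-axis block sizes are [8, 2, 2].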

Definition at line 509 of file QuantTypes.cpp.

◆ getScales()

DenseElementsAttr UniformQuantizedSubChannelType::getScales ( ) const

Gets the quantization scales.

The scales are organized in a multi-dimensional tensor. The size of each dimension in the scales tensor is determined by the number of blocks along the corresponding dimension in the quantized data tensor.

For example, if the quantized data tensor has shape [X0, X1, ..., XR-1] and the block sizes are [B0, B1, ..., BR-1], then the scales tensor will have shape [X0/B0, X1/B1, ..., XR-1/BR-1].

The scale value for a specific element in the quantized data tensor at position [i0, i1, ..., iR-1] is determined by accessing the corresponding element in the scales tensor at position [i0/B0, i1/B1, ..., iR-1/BR-1].
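The shape and index arithmetic above can be sketched in standalone C++. Both functions are hypothetical helpers, not MLIR APIs. The sketch assumes one block size per axis (with any defaults already filled in) and that each dimension is evenly divisible by its block size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: shape of the scales tensor, [X0/B0, ..., XR-1/BR-1].
std::vector<int64_t> scalesShape(const std::vector<int64_t> &shape,
                                 const std::vector<int64_t> &blockSizes) {
  assert(shape.size() == blockSizes.size());
  std::vector<int64_t> result;
  result.reserve(shape.size());
  for (std::size_t d = 0; d < shape.size(); ++d) {
    // Assumption of this sketch: blocks tile each dimension exactly.
    assert(blockSizes[d] > 0 && shape[d] % blockSizes[d] == 0);
    result.push_back(shape[d] / blockSizes[d]);
  }
  return result;
}

// Hypothetical helper: position in the scales tensor that holds the scale for
// the data element at `index`, i.e. [i0/B0, ..., iR-1/BR-1].
std::vector<int64_t> scaleIndex(const std::vector<int64_t> &index,
                                const std::vector<int64_t> &blockSizes) {
  assert(index.size() == blockSizes.size());
  std::vector<int64_t> result;
  result.reserve(index.size());
  for (std::size_t d = 0; d < index.size(); ++d) {
    assert(index[d] >= 0 && blockSizes[d] > 0);
    result.push_back(index[d] / blockSizes[d]); // integer division
  }
  return result;
}
```

With a data shape of [8, 4] and block sizes [8, 2], the scales tensor has shape [1, 2], and the element at [5, 3] reads its scale from position [0, 1].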

Definition at line 500 of file QuantTypes.cpp.

Referenced by printUniformQuantizedSubChannelType().

◆ getZeroPoints()

DenseElementsAttr UniformQuantizedSubChannelType::getZeroPoints ( ) const

Gets the quantization zero-points.

The zero-points are organized in a multi-dimensional tensor. The size of each dimension in the zero-point tensor is determined by the number of blocks along the corresponding dimension in the quantized data tensor.

For example, if the quantized data tensor has shape [X0, X1, ..., XR-1] and the block sizes are [B0, B1, ..., BR-1], then the zero-point tensor will have shape [X0/B0, X1/B1, ..., XR-1/BR-1].

The zero-point value for a specific element in the quantized data tensor at position [i0, i1, ..., iR-1] is determined by accessing the corresponding element in the zero-point tensor at position [i0/B0, i1/B1, ..., iR-1/BR-1].
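To show how the per-block scale and zero-point might be used together, the sketch below dequantizes a single stored value. It is an illustration under stated assumptions, not the MLIR implementation: `BlockParams`, `blockOffset`, and `dequantize` are hypothetical names, the parameter tensors are assumed to be stored in row-major order, and the conversion uses the usual affine rule for uniform quantization, expressed = scale * (stored - zeroPoint).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container (not an MLIR type): dense, row-major per-block
// quantization parameters.
struct BlockParams {
  std::vector<int64_t> shape;      // [X0/B0, ..., XR-1/BR-1]
  std::vector<double> scales;      // row-major, one entry per block
  std::vector<int64_t> zeroPoints; // row-major, same shape as scales
};

// Row-major linear offset of the block containing the element at `index`.
std::size_t blockOffset(const BlockParams &params,
                        const std::vector<int64_t> &index,
                        const std::vector<int64_t> &blockSizes) {
  assert(index.size() == params.shape.size());
  assert(index.size() == blockSizes.size());
  int64_t offset = 0;
  for (std::size_t d = 0; d < index.size(); ++d) {
    int64_t block = index[d] / blockSizes[d]; // i_d / B_d
    assert(block >= 0 && block < params.shape[d]);
    offset = offset * params.shape[d] + block;
  }
  return static_cast<std::size_t>(offset);
}

// Dequantizes one stored value with the parameters of its block.
double dequantize(const BlockParams &params, const std::vector<int64_t> &index,
                  const std::vector<int64_t> &blockSizes, int64_t stored) {
  std::size_t offset = blockOffset(params, index, blockSizes);
  assert(offset < params.scales.size() && offset < params.zeroPoints.size());
  return params.scales[offset] *
         static_cast<double>(stored - params.zeroPoints[offset]);
}
```

For an 8x4 tensor with block sizes [8, 2], scales {{2.0, 3.0}} and zero-points {{0, 10}}, the element at [5, 3] falls in block [0, 1], so a stored value of 14 dequantizes to 3.0 * (14 - 10) = 12.0.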

Definition at line 504 of file QuantTypes.cpp.

Referenced by printUniformQuantizedSubChannelType().

◆ verifyInvariants()

LogicalResult UniformQuantizedSubChannelType::verifyInvariants	(	function_ref< InFlightDiagnostic()>	emitError,
		unsigned	flags,
		Type	storageType,
		Type	expressedType,
		DenseElementsAttr	scales,
		DenseElementsAttr	zeroPoints,
		ArrayRef< int32_t >	quantizedDimensions,
		ArrayRef< int64_t >	blockSizes,
		int64_t	storageTypeMin,
		int64_t	storageTypeMax
	)

static

Verifies construction invariants and issues errors/warnings.

Definition at line 435 of file QuantTypes.cpp.

References mlir::emitError(), mlir::DenseElementsAttr::getType(), mlir::DenseElementsAttr::size(), and mlir::quant::QuantizedType::verifyInvariants().

Member Data Documentation

◆ name

constexpr StringLiteral mlir::quant::UniformQuantizedSubChannelType::name = "quant.uniform_sub_channel"

static constexpr

Definition at line 412 of file QuantTypes.h.

The documentation for this class was generated from the following files:

include/mlir/Dialect/Quant/IR/QuantTypes.h
lib/Dialect/Quant/IR/QuantTypes.cpp
