How to Disable Quantization for a Concat Node, and How to Share the Same Quantization Parameters Across Multiple Layers #4101

Description

@piantou

(1) As shown in the figure below, I want to disable quantization for the specified Concat node. How should I write the corresponding code or configuration rules?

Image

(2) For example, the quantization tool TinyQ can explicitly bind multiple operators through surgeon.connect so that they share one set of quantization parameters (scale/zero_point). During operator fusion, only one set of QDQ nodes is then retained, which reduces computation and speeds up inference.
Take the Add operation shown in the figure below: to enable operator fusion and acceleration at inference time, the quantization parameters (scale) of all its input branches must be exactly the same. How can this be implemented in AIMET?

Image
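
To make the requirement in (2) concrete, here is a minimal sketch in plain Python of what "sharing one set of quantization parameters" means for an Add with two input branches. It does not use AIMET or TinyQ: every name in it (`QParams`, `compute_qparams`, `shared_qparams`, `add_with_shared_params`) is hypothetical and made up for illustration, and it assumes 8-bit asymmetric unsigned quantization with one encoding derived from the union of the branch ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QParams:
    """Hypothetical container for one set of quantization parameters."""
    scale: float
    zero_point: int


def compute_qparams(min_val, max_val, num_bits=8):
    """Asymmetric unsigned encoding for the range [min_val, max_val]."""
    # Make sure 0.0 is representable, as is common for asymmetric schemes.
    min_val = min(min_val, 0.0)
    max_val = max(max_val, 0.0)
    qmax = 2 ** num_bits - 1
    scale = (max_val - min_val) / qmax
    if scale == 0.0:
        scale = 1.0  # degenerate all-zero range
    zero_point = max(0, min(qmax, round(-min_val / scale)))
    return QParams(scale, zero_point)


def shared_qparams(ranges, num_bits=8):
    """One encoding covering every branch: use the union of all ranges."""
    lo = min(r[0] for r in ranges)
    hi = max(r[1] for r in ranges)
    return compute_qparams(lo, hi, num_bits)


def quantize(values, qp, num_bits=8):
    """Map real values to clamped integers using the given encoding."""
    qmax = 2 ** num_bits - 1
    return [max(0, min(qmax, round(v / qp.scale) + qp.zero_point)) for v in values]


def add_with_shared_params(qa, qb, qp):
    """With identical scale and zero_point on both inputs, the addition is a
    plain integer add; no per-branch rescaling is needed before summing."""
    return [qp.scale * (a + b - 2 * qp.zero_point) for a, b in zip(qa, qb)]


# Two branches feeding an Add, each with its own observed range.
branch_ranges = [(-1.0, 1.0), (-4.0, 2.0)]
qp = shared_qparams(branch_ranges)

a = [0.5, -1.0]
b = [1.5, -3.0]
result = add_with_shared_params(quantize(a, qp), quantize(b, qp), qp)
```

Because both branches use the same encoding, a fused kernel can sum the integer values directly and apply the scale once. The trade-off is that the branch with the narrower range loses some resolution, since it is quantized over the wider shared range.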

Looking forward to your reply to clarify my questions. Thank you.
