How to Disable Quantization for Concat Nodes, and How to Share the Same Quantization Parameters Across Multiple Layers #4101

**(1)** As shown in the figure below, I want to disable quantization for the specified **Concat** node. How should I write the corresponding code or configuration rules?

<img width="491" height="659" alt="Image" src="https://github.com/user-attachments/assets/64ae62d9-1d08-4ff4-abed-da37ccf7b265" />

**(2)** Taking the quantization tool TinyQ as an example: it can explicitly bind multiple operators through **surgeon.connect** so that they share one set of quantization parameters (scale/zero_point). During operator fusion, only one set of QDQ nodes is then retained, which reduces computation and speeds up inference.
Taking the **Add** operation as an example (shown in the figure below), operator fusion and acceleration in the subsequent inference stage require the quantization parameters (scale) of all its input branches to be exactly the same. **How can this be implemented in AIMET?**

<img width="287" height="578" alt="Image" src="https://github.com/user-attachments/assets/72d9ebec-5299-4381-b761-411b12bcacd5" />
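
To make the requirement in (2) concrete, here is a minimal, framework-independent sketch in plain Python of why a shared scale/zero_point matters for **Add**. It does not use any TinyQ or AIMET API; `Encoding`, `add_shared`, and `add_mixed` are hypothetical names chosen only for illustration, and the int8 range is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Encoding:
    """Affine quantization parameters: real = scale * (q - zero_point)."""
    scale: float
    zero_point: int


def quantize(x, enc, qmin=-128, qmax=127):
    """Map a real value to an integer on the grid defined by enc (int8 assumed)."""
    q = round(x / enc.scale) + enc.zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, enc):
    """Map an integer on the grid back to a real value."""
    return enc.scale * (q - enc.zero_point)


def add_shared(qa, qb, enc):
    """Integer-only Add, valid only when both inputs share one encoding.

    scale*(qa - z) + scale*(qb - z) == scale*((qa + qb - z) - z),
    so the sum stays on the same grid without any float rescale.
    Clamping of the result to the int8 range is omitted for brevity.
    """
    return qa + qb - enc.zero_point


def add_mixed(qa, enc_a, qb, enc_b, enc_out):
    """Add with different input encodings: each branch must be rescaled first."""
    real = dequantize(qa, enc_a) + dequantize(qb, enc_b)
    return quantize(real, enc_out)


# Both branches use the same (hypothetical) encoding.
shared = Encoding(scale=0.5, zero_point=3)
qa = quantize(1.5, shared)
qb = quantize(-2.0, shared)
q_sum = add_shared(qa, qb, shared)
result = dequantize(q_sum, shared)

# The second branch uses a different encoding, so the integer shortcut no longer applies.
other = Encoding(scale=0.25, zero_point=0)
qb_other = quantize(-2.0, other)
q_sum_mixed = add_mixed(qa, shared, qb_other, other, shared)
```

With a shared encoding the Add reduces to one integer addition and one subtraction of the zero point; with mismatched encodings each branch needs its own dequantize/requantize step, which is the extra work I would like to avoid.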

Looking forward to your reply to clarify my questions. Thank you.
