How to Export an ONNX Model with Dynamic Computation? #42

Description

@onePlusLin

I downloaded the YOLO-Master-v0.1-N.onnx model from the official releases: https://github.com/Tencent/YOLO-Master/releases/download/YOLO-Master-v26.02/YOLO-Master-v0.1-N.onnx . However, I noticed that the exported model contains fewer experts than expected. Take the first router structure, model.6, as an example: the top-k selection picks the indices of the top two experts, but in the dynamic computation that follows, the Equal operations only check whether an index equals 2 or 3, rather than 0, 1, 2, or 3. It seems that the routing was frozen when PyTorch exported the model to ONNX.

I looked into the cause, and it appears to be that torch.onnx.export uses tracing. The export records a single forward pass; two of the experts were not selected during that pass, so they were never traced into the computation graph, and the final model may contain only two experts.
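The freezing behaviour described above can be reproduced in a few lines. This is a minimal sketch, not the YOLO-Master code: `gated_expert` is a hypothetical stand-in for one expert guarded by `if mask.any()`, and it uses `torch.jit.trace` as an illustration of trace-based capture in general.

```python
import warnings
import torch

def gated_expert(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical stand-in for a single expert behind `if mask.any()`.
    mask = x > 0
    if mask.any():        # data-dependent Python branch
        return x * 2.0    # "expert" is executed
    return x.clone()      # "expert" is skipped

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # tracing warns about the tensor-to-bool conversion
    # The example input takes the `True` branch, so only that branch is recorded.
    traced = torch.jit.trace(gated_expert, torch.ones(4))

neg = -torch.ones(4)
eager_out = gated_expert(neg)  # eager: branch skipped, input returned unchanged
traced_out = traced(neg)       # traced: branch was baked in, input is doubled

print(eager_out)
print(traced_out)
```

The traced function no longer contains the condition at all; whichever branch the example input happened to take is the only one that exists in the graph. An expert that the example input never selects would be dropped in the same way.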

I saw in your deployment notes that you mention a dense approach, that is, computing all experts first and then selecting the outputs according to the indices. Are there any other methods besides this?
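For reference, the dense approach mentioned above might look roughly like the following. This is a sketch under assumptions: `DenseMoE`, its layer sizes, and the softmax-over-top-k gating are all hypothetical and not taken from the YOLO-Master source. `forward` runs every expert and weights the outputs with a gate tensor that is zero for unselected experts; `forward_sparse` is the data-dependent version, kept only to check that both give the same result.

```python
import torch
import torch.nn as nn

class DenseMoE(nn.Module):
    """Hypothetical top-k mixture-of-experts layer (illustrative only)."""

    def __init__(self, dim: int = 8, num_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))

    def gates(self, x: torch.Tensor) -> torch.Tensor:
        logits = self.router(x)                              # (B, E)
        top_val, top_idx = logits.topk(self.top_k, dim=-1)   # (B, k)
        weights = torch.softmax(top_val, dim=-1)             # (B, k)
        # Dense (B, E) gate tensor: top-k weights, zero for all other experts.
        return torch.zeros_like(logits).scatter(-1, top_idx, weights)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Dense path: every expert runs, so there is no branch on tensor values.
        g = self.gates(x)                                         # (B, E)
        outs = torch.stack([e(x) for e in self.experts], dim=1)   # (B, E, D)
        return (g.unsqueeze(-1) * outs).sum(dim=1)                # (B, D)

    def forward_sparse(self, x: torch.Tensor) -> torch.Tensor:
        # Reference path with data-dependent control flow (the kind tracing freezes).
        g = self.gates(x)
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = g[:, i] > 0
            if mask.any():
                out[mask] += g[mask, i].unsqueeze(-1) * expert(x[mask])
        return out

torch.manual_seed(0)
model = DenseMoE().eval()
x = torch.randn(5, 8)
with torch.no_grad():
    dense_out = model(x)
    sparse_out = model.forward_sparse(x)
print(torch.allclose(dense_out, sparse_out, atol=1e-5))
```

The trade-off is compute: the dense path evaluates all experts for every input, in exchange for a graph whose structure does not depend on which experts the example input selects.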

I also tried dynamo_export, but that requires a newer version of PyTorch. I tried torch.jit.script, but it can only script part of the structure, not the whole model. Finally, I used torch.cond from PyTorch 2.x to replace if mask.any(), which seems to be capturable by torch.export and can be converted into an ONNX If operator. Could you let me know whether there are any better approaches? I would also like to convert the model to TFLite for quantization. Of the methods above, only the dense computation approach converts to TFLite successfully, in particular because TFLite does not support the ONNX If operator.

If someone could answer me, I would be extremely grateful.

Below is the ONNX graph of the original model:

Image

Is this the truly dynamic ONNX graph?

Image
