Torch to NNEF

Export any PyTorch model to NNEF, a simple and explicit neural network exchange format, in a way that is auditable, debuggable, and stable across runtimes.

torch_to_nnef traces a vanilla nn.Module (any internal tensor type, quantized models included) and produces a portable NNEF archive. Unlike the protobuf used by ONNX, the NNEF graph is human-readable text with weights kept aside as separate files, so you can open it and review exactly which ops got serialized. tract, the inference engine developed openly by SONOS, is the primary supported target: it gets extended operator coverage and an optional check_io step that compares tract outputs against PyTorch at export time.

🚀 See it run: live browser demos (image classifier, pose estimation, voice activity detection, LLM) exported with torch_to_nnef and running through tract compiled to WASM, no install required.

Quickstart

pip install torch_to_nnef   # needs pip >= 23.2

import torch
from torch_to_nnef import export_model_to_nnef, TractNNEF

model = torch.nn.Linear(8, 4).eval()
example_input = torch.randn(1, 8)

export_model_to_nnef(
    model=model,
    args=example_input,
    file_path_export="my_model.nnef.tgz",
    # version pins the tract opset to target; check_io downloads that tract
    # build once and asserts its output matches PyTorch on example_input.
    inference_target=TractNNEF(version="0.21.13", check_io=True),
    input_names=["input"],
    output_names=["output"],
)

This writes a self-contained my_model.nnef.tgz. With check_io=True the call also raises if the tract and PyTorch outputs diverge, so a call that returns normally means the export was validated on example_input. The getting started tutorial (linked below) walks through running the archive in tract from Python, Rust, and the CLI.
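One way to confirm what the export wrote is to open the archive with Python's standard tarfile module. The sketch below is not part of torch_to_nnef; the helper names list_members and read_graph are hypothetical, and it assumes the graph file is stored as graph.nnef either at the archive root or under "./".

```python
import tarfile
from pathlib import Path


def list_members(archive_path):
    """Return the sorted names of the regular files stored in the archive."""
    # "r:*" lets tarfile detect the compression (gzip for .tgz) on its own.
    with tarfile.open(archive_path, "r:*") as archive:
        return sorted(m.name for m in archive.getmembers() if m.isfile())


def read_graph(archive_path, graph_name="graph.nnef"):
    """Return the text of the graph file, matched by base name."""
    with tarfile.open(archive_path, "r:*") as archive:
        for member in archive.getmembers():
            # Assumption: the graph may sit at the root or under "./".
            if member.isfile() and Path(member.name).name == graph_name:
                return archive.extractfile(member).read().decode("utf-8")
    raise FileNotFoundError(f"{graph_name} not found in {archive_path}")


# Only runs when the quickstart archive is present in the working directory.
if Path("my_model.nnef.tgz").exists():
    print(list_members("my_model.nnef.tgz"))
    print(read_graph("my_model.nnef.tgz"))
```

Because the graph is plain text, the string returned by read_graph can be diffed between two exports to review what changed.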

What's inside my_model.nnef.tgz

The archive keeps the graph and the weights apart:

graph.nnef   weight.dat   bias.dat

and graph.nnef is plain text you can read end to end (metadata header elided):

graph network(input) -> (output)
{
    input = tract_core_external(shape = [1, 8], datum_type = 'f32');
    weight = variable<scalar>(label = 'weight', shape = [4, 8]);
    bias = variable<scalar>(label = 'bias', shape = [4]);
    bias_aligned_rank_expanded = unsqueeze(bias, axes = [0]);
    output = linear(input, weight, bias_aligned_rank_expanded);
}
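To make the graph above concrete, here is a dependency-free sketch of the arithmetic its last two statements describe. It assumes the NNEF standard definition of linear (a matrix product with the transposed weight, plus the bias) and uses toy sizes rather than the 8 and 4 of the quickstart. The helpers unsqueeze_front and linear are hypothetical illustrations, not how tract executes the graph.

```python
def unsqueeze_front(vector):
    """Turn a length-n vector into a 1 x n matrix (unsqueeze on axis 0)."""
    return [list(vector)]


def linear(inputs, weight, bias_row):
    """Compute inputs @ weight^T + bias on row-major nested lists.

    inputs:   batch x in_features
    weight:   out_features x in_features
    bias_row: 1 x out_features, broadcast over the batch
    """
    return [
        [
            sum(x * w for x, w in zip(row, weight_row)) + bias_row[0][j]
            for j, weight_row in enumerate(weight)
        ]
        for row in inputs
    ]


# Toy values: 3 input features and 2 outputs.
inputs = [[1.0, 2.0, 3.0]]
weight = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
bias = [0.5, -1.0]

output = linear(inputs, weight, unsqueeze_front(bias))
print(output)  # [[1.5, 4.0]]
```

The unsqueeze step mirrors bias_aligned_rank_expanded in the graph: the bias gets a leading axis so its rank matches the [batch, features] output before the addition.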

What you can export

Any nn.Module whose forward takes and returns tensors can be exported out of the box, ordinary CNNs and transformers included. The tutorials in the full documentation cover the cases that need more care.

Compatibility

PyTorch >= 1.10.0 on Linux and macOS, against the last two major tract releases (>= 0.20.22). Supported Python versions are those that are neither end-of-life nor pre-release. The latest releases give the best operator coverage.

Learn more

The full documentation has step-by-step tutorials (from a first image classifier to LLM export), the export API reference, and the list of supported operators.

Contributing & support

Contributions are welcome. If you hit a bug, please follow the Bug report instructions so it can be reproduced quickly.

Licensed under MIT OR Apache-2.0.
