Checklist
Motivation
Add support for the Jet-Nemotron-2B model. It is built with Post Neural Architecture Search (PostNAS), an efficient post-training architecture exploration and adaptation pipeline that can be applied to any pre-trained Transformer model. Its linear attention module, JetBlock, is a new design that significantly outperforms previous designs such as Mamba2.
Related resources
No response