An open implementation of DiPaCo (Distributed Path Composition) for modular, low-communication language-model training — validated on GPU.
deep-learning pytorch language-models distributed-training mixture-of-experts llm diloco dipaco low-communication-training
-
Updated
Jun 17, 2026 - Python