A modular, scalable, high-performance training framework for LLMs, VLMs, diffusion, and embodied models.
Updated Jun 17, 2026 · Python
Revisiting Mid-training in the Era of Reinforcement Learning Scaling
[ICML 2026] Dispersion loss counteracts embedding condensation and improves generalization in small language models
The IDE for mid-training model development
Open catalog of datasets used to train and align LLMs across pretraining, mid-training, and post-training.
How Post-Training Shapes Biological Reasoning Models