mid-training

Here are 6 public repositories matching this topic...

baidu-baige / LoongForge

A modular, scalable, high-performance training framework for LLMs, VLMs, diffusion, and embodied models.

training ai distributed infra lora wan diffusion vla vlm dit sft megatron pretraining llm mid-training

Updated Jun 17, 2026
Python

GAIR-NLP / OctoThinker

Revisiting Mid-training in the Era of Reinforcement Learning Scaling

rl llama reasoning post-training pre-training llm qwen verl mid-training

Updated Jul 23, 2025
Jupyter Notebook

ChenLiu-1996 / LM-Dispersion

[ICML 2026] Dispersion loss counteracts embedding condensation and improves generalization in small language models

dispersion cosine-similarity embedding manifold-learning icml condensation latent-space pre-training embedding-vectors dispersive large-language-models llm geometric-learning llms llm-training small-language-models mid-training icml-2026 embedding-condensation

Updated Jun 1, 2026
Python

GrayboxTech / weightslab

The IDE for mid-training model development

python open-source machine-learning framework deep-learning ide devtools pytorch dataset neural-networks tensorboard model-training model-development mlops model-debugging training-visualization ml-tools mid-training

Updated Jun 18, 2026
Python

Shekswess / open-corpus-registry

Open catalog of datasets used to train and align LLMs across pretraining, mid-training, and post-training.

data datasets datahub post-training pretraining llm mid-training

Updated Jan 6, 2026
Python

mims-harvard / bio-posttrain

How Post-Training Shapes Biological Reasoning Models

post-training reasoning-models mid-training multimodal-foundation-models biological-foundation-models

Updated Jun 16, 2026
Jupyter Notebook
