Codex searchshortqa deepspeed #791

Open
daiweidaichenzi wants to merge 20 commits into jingyaogong:master from daiweidaichenzi:codex-searchshortqa-deepspeed

Conversation

@daiweidaichenzi

No description provided.

daiweidaichenzi and others added 20 commits May 8, 2026 20:44
- Add model_minimind_mla.py with Q/KV low-rank compression and decoupled RoPE
- Add kv_norm on kv_latent, q_compress + q_up + q_rope_proj for Q path
- Cache format: (kv_latent, k_rope) tuple, 176 floats/tok/layer (4.4x compression)
- Add --use_mla flag to all training scripts and utils
- Add benchmark_gqa_vs_mla.py script
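
The cache figure in the commit message can be sanity-checked with a little arithmetic. The sketch below is not taken from the PR: the dimension split (`KV_LATENT_DIM`, `ROPE_DIM`) and the GQA baseline (`NUM_KV_HEADS`, `HEAD_DIM`) are hypothetical values chosen only because they reproduce the stated 176 floats/tok/layer and roughly 4.4x compression. It also assumes a single `k_rope` vector shared across heads, which the commit message does not confirm.

```python
def gqa_cache_floats(num_kv_heads: int, head_dim: int) -> int:
    """Floats cached per token per layer by a GQA cache (full K and V)."""
    return 2 * num_kv_heads * head_dim


def mla_cache_floats(kv_latent_dim: int, rope_dim: int) -> int:
    """Floats cached per token per layer when storing (kv_latent, k_rope)."""
    return kv_latent_dim + rope_dim


# Hypothetical dimensions: assumptions for illustration, not values from the PR.
KV_LATENT_DIM = 128  # width of the compressed KV latent
ROPE_DIM = 48        # width of the decoupled RoPE key (assumed shared by all heads)
NUM_KV_HEADS = 4     # GQA baseline: number of KV heads
HEAD_DIM = 96        # GQA baseline: per-head dimension

mla = mla_cache_floats(KV_LATENT_DIM, ROPE_DIM)
gqa = gqa_cache_floats(NUM_KV_HEADS, HEAD_DIM)
ratio = gqa / mla

print(f"MLA: {mla} floats/tok/layer, GQA: {gqa} floats/tok/layer, "
      f"compression: {ratio:.1f}x")
```

Other splits of the 176 floats between latent and RoPE widths give the same total, so the real configuration in `model_minimind_mla.py` may well differ from these numbers.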

2 participants