forked from deepseek-ai/DeepGEMM
Pull requests: sgl-project/DeepGEMM
SM90 (Hopper) FP4 MegaMoE fused kernel with swapAB small-batch path
#53 opened Jun 29, 2026 by qiushixiaoyu
[SM90] Optimize FP8 MegaMoE small-batch decode path with swapAB
#48 opened Jun 22, 2026 by qiushixiaoyu
[sm100] Packed FP4×FP4 mega-MoE kernel (W4A4) with per-band BLOCK_K
#44 opened Jun 14, 2026 by Romaosir
Fix UE8M0 packing: mask mantissa bits when extracting fp32 exponents
#35 opened May 19, 2026 by yhyang201