Pull requests: vllm-project/flash-attention

Pull requests list

Fully yank dropout in vllm fork of FA2
#153 opened Jun 23, 2026 by janeyx99
feat(fa4): fuse per-group fp8 output
#151 opened Jun 18, 2026 by carlyou
Sync upstream
#149 opened Jun 16, 2026 by MatthewBonanni Member
[Fix] Mark kcache and vcache as mutable in fwd_kvcache
#148 opened Jun 15, 2026 by remi-or
SM100 dynamic causal
#144 opened Jun 10, 2026 by MatthewBonanni Member
Reapply #122
#137 opened May 6, 2026 by MatthewBonanni Member
SM100 tile size 64
#132 opened Apr 9, 2026 by MatthewBonanni Member Draft
add support for newer CUDA archs (Spark/Thor)
#121 opened Feb 13, 2026 by askliar
Fix issues with async TP
#117 opened Feb 7, 2026 by LucasWilkinson Collaborator
Sync to upstream main 20260121
#114 opened Jan 22, 2026 by LucasWilkinson Collaborator
FA2 support head sizes 40, 72, and 80
#108 opened Nov 14, 2025 by MatthewBonanni Member Draft
Add DCP parameters
#92 opened Sep 16, 2025 by MatthewBonanni Member Draft
Vllm_flash_attn_with_attention_weights
#88 opened Sep 11, 2025 by SiriusPaul
WIP stream k scheduling
#67 opened Apr 29, 2025 by LucasWilkinson Collaborator Draft
fix: add "typename" prior to dependent type name
#54 opened Feb 28, 2025 by zhiweij1
AMD ROCm Build
#41 opened Jan 29, 2025 by ProExpertProg Draft
Add back flash_attn_func api (and support FA3) [Don't Merge Yet]
#40 opened Jan 26, 2025 by LucasWilkinson Collaborator