Repository Issue Activity (beta)

deepseek-ai/FlashMLA

Current issue state, recent activity, and per-issue timelines from the indexed issue data.

Open Issues

68

New in 7 Days

0

Closed in 7 Days

0

Average Open Age

262 days

Stale 30+ Days

66

Stale 90+ Days

62

Last 2 Weeks

Date	Comments	Events	Open Backlog
2026-07-26	0	0	0
2026-07-25	0	0	0
2026-07-24	0	0	0
2026-07-23	0	0	0
2026-07-22	0	0	0
2026-07-21	0	0	0
2026-07-20	0	0	0
2026-07-19	0	0	0
2026-07-18	0	0	0
2026-07-17	0	0	0
2026-07-16	0	0	0
2026-07-15	0	0	0
2026-07-14	0	0	0
2026-07-13	1	1	2

This Week

Opened: 0

Closed: 0

Comments: 0

Events: 0

Top Labels

No label distribution is available yet.

Issue Explorer

Issue	Author	State	Labels	Comments	Reactions	Updated
#169 dual gemm Opened 5 months ago	WalrusWTQ	open	No labels	2	0	13 days ago
#172 Use `pyproject.toml` Opened 4 months ago	sakgoyal	open	No labels	1	2	16 days ago
#192 Sparse MLA decode (V3.2 / FP8 KV): throughput cost of the B200 accuracy fix (5aa668c) scales with topk Opened 29 days ago	MogicianWu	open	No labels	0	0	29 days ago
#190 Does the flash mla kernel support a dynamic number of tokens for different queries in the same batch? Opened 1 month ago	echo-timeless	open	No labels	0	1	1 month ago
#179 Do you open to add SM80 support Opened 3 months ago	haosdent	open	No labels	1	2	1 month ago
#66 Ampere architecture FlashMLA bring-up Opened 1 year ago	pzhao-eng	open	No labels	2	9	2 months ago
#180 Make FlashMLA a libtorch and cpython stable extension Opened 3 months ago	janeyx99	open	No labels	0	0	3 months ago
#149 [Question] DSA VS MLA Prefill Benchmark On H100 Opened 6 months ago	ZavierXing	open	No labels	1	0	4 months ago
#171 CUTLASS Internal Error during run on SM100 with long sequences (seq_len=1M) Opened 4 months ago	For-rest2005	closed - completed	No labels	0	0	4 months ago
#168 flash_mla.flash_mla_with_kvcache have plans to support FP8 for the q data in the future or FP8 insufficient to support model accuracy? Opened 5 months ago	zhangfengwei	open	No labels	0	0	5 months ago
#166 [Question]Questions regarding INTERLEAVE vs. SW128 Layouts for SM90 Sparse Attention Decode Opened 5 months ago	pengwubj	open	No labels	0	0	5 months ago
#125 Can MHA be used in the DSA prefill stage? Opened 8 months ago	starwang1024	open	No labels	1	1	5 months ago
#165 [Question] why not using FP8 rope in sparse FP8 decoding? Opened 5 months ago	lyppg	open	No labels	0	0	5 months ago
#164 mla Opened 5 months ago	TZWX-0	open	No labels	0	0	5 months ago
#161 [CUDA/CUTLASS] Improvements for varlen option derivation, input validation, can_implement errors, and workspace handling Opened 6 months ago	red1239109-cmd	open	No labels	0	0	6 months ago
#159 RuntimeError: CUBLAS_STATUS_INVALID_VALUE in ref_sparse_attn_decode on H800 (Hopper) Opened 6 months ago	Socratesa	closed - completed	No labels	1	0	6 months ago
#155 [Question] What is MODEL1? Opened 6 months ago	gary-wjc	open	No labels	7	27	6 months ago
#158 [Bug/Correctness] Hardcoded device_id=0 + missing CUDAGuard can break multi-GPU correctness (wrong hw_info / stream mismatch) Opened 6 months ago	red1239109-cmd	open	No labels	2	0	6 months ago
#153 Compilation Error: exceeds maximum register limit. Opened 6 months ago	GrateVoyage	closed - completed	No labels	2	0	6 months ago
#154 Missing the included header file: #include <span> Opened 6 months ago	GrateVoyage	closed - completed	No labels	2	0	6 months ago
#148 FlashMLA for Blackwell architecture (B200) Opened 7 months ago	abdul7mohsen	open	No labels	0	1	7 months ago
#126 does flash_mla_with_kvcache work only in paged mode? Opened 7 months ago	vince62s	open	No labels	0	0	7 months ago
#119 Built FlashMLA for Windows and Nvidia sm_120 (workstation/50s) cards Opened 9 months ago	IISuperluminaLII	open	No labels	4	0	7 months ago
#124 support for rtx 6000 sm120 Opened 8 months ago	fernandaspets	open	No labels	0	1	8 months ago
#121 Build error on cuda13 arm64 Opened 8 months ago	icavanyu	open	No labels	1	0	8 months ago