Pull requests: NVIDIA/cutlass
Pull requests list
[CuTeDSL] Flash Attention v2 for SM120 (Blackwell GeForce)
#3030
opened Feb 13, 2026 by
blake-snc
minor: wrong cordinate in layout algebra docs section
#3014
opened Feb 10, 2026 by
JINO-ROHIT
Declare CUDA standard 20 as requirement for example 63 (fixes #3011)
#3013
opened Feb 10, 2026 by
reuterbal
Fix redundant tile copies in wgmma_sm90 tutorial pipeline loop
#2982
opened Jan 25, 2026 by
Johnsonms
Fix error in Blackwell document of referring to Mxf4 format as NVF4
#2977
opened Jan 23, 2026 by
zianglih
fix(examples): fix device compatibility check for Ada FP8 GEMM
inactive-30d
#2954
opened Jan 13, 2026 by
w1ndseeker
Update profiler.md with how to use generator.py
inactive-30d
#2943
opened Jan 10, 2026 by
aidando73
cutlass profiler - align emitted SFA/SFB kernel naming with typical convention
inactive-30d
#2942
opened Jan 10, 2026 by
aidando73
Fix Warp Memory Access Arrangement in Epilogue: Upper Bound memory access width by output tile width
inactive-30d
#2938
opened Jan 8, 2026 by
lukas-ruettgers
docs: Add FP16 GEMM documentation to sgemm_sm80.cu - Fixes #1686
inactive-30d
#2870
opened Dec 10, 2025 by
blueberrycongee
[WIP]Unit tests for Kernels that perform BF16 x BF16 = MXFP8 and MXFP8 x MXFP8 = BF16
inactive-30d
#2857
opened Dec 8, 2025 by
Shreya-gaur
use cp.async.bulk for per-row data; quiets synccheck
inactive-30d
#2850
opened Dec 5, 2025 by
v0i0