- API Overview & Cross-Reference: complete module listing with function signatures and CUDA mapping
| Module | Doc | Description |
|---|---|---|
| driver | README | Device management, memory, kernel launch, streams, events |
| nvrtc | README | Runtime compilation of CUDA C++ to PTX / CUBIN |
| cublas | README | BLAS Level 1/2/3 (SAXPY, SGEMM, DGEMM, batched, mixed-precision) |
| cublaslt | README | Lightweight GEMM with algorithm heuristics |
| curand | README | GPU random number generation |
| cudnn | README | Convolution, activation, pooling, softmax, batch norm |
| cusolver | README | LU, QR, SVD, Cholesky, eigenvalue decomposition |
| cusparse | README | SpMV, SpMM, SpGEMM with CSR/COO formats |
| cufft | README | 1D/2D/3D Fast Fourier Transform |
| nvtx | README | Profiling annotations for NVIDIA Nsight |
| kernel | API | Kernel DSL: write CUDA kernels in pure Zig, compiled to PTX |
- Kernel DSL API Reference: intrinsics, shared memory, WMMA/MMA, TMA, cluster, tcgen05
- CUDA C++ → Zig Migration: port existing CUDA C++ kernels to pure Zig
- Examples Guide: 162 examples (58 host in 10 categories with per-category READMEs, 80 kernel in 11 categories, 24 integration)
- Project README: quick start, build options, and project overview