Reduce CUDA build matrix, better fallback for lib loading #1980

Merged: matthewdouglas merged 1 commit into main from reduce-cuda-matrix on Jun 22, 2026
@matthewdouglas

Resolves #1778

Instead of failing when the exact CUDA/ROCm library version isn't packaged, bitsandbytes now selects the closest available version within the same major version and logs a warning. This PR also fixes parsing of double-digit ROCm minor versions (e.g. ROCm 7.13), reduces the CUDA build matrix to the most commonly deployed builds (i.e. those with official PyTorch wheels), and cleans up related diagnostics and error messaging.
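For readers who want a concrete picture of the fallback, here is a minimal sketch of the idea, not the code from this PR. The names `parse_version` and `closest_packaged_version`, the logger name, and the tie-breaking rule (prefer the lower minor when two candidates are equally close) are all assumptions made for illustration, and the version lists in the usage note are hypothetical rather than the actual build matrix.

```python
import logging
import re
from typing import Iterable, Optional, Tuple

logger = logging.getLogger("fallback_sketch")  # hypothetical logger name

Version = Tuple[int, int]  # (major, minor)


def parse_version(text: str) -> Version:
    """Parse 'major.minor[.patch...]' into integers.

    Matching whole digit runs keeps multi-digit minors intact,
    so '7.13' becomes (7, 13) rather than (7, 1).
    """
    match = re.match(r"^(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def closest_packaged_version(
    requested: Version, packaged: Iterable[Version]
) -> Optional[Version]:
    """Return the packaged version nearest to `requested` within the same major.

    Returns None when no packaged build shares the requested major version.
    Tie-breaking toward the lower minor is an assumption of this sketch.
    """
    same_major = [v for v in packaged if v[0] == requested[0]]
    if not same_major:
        return None
    chosen = min(same_major, key=lambda v: (abs(v[1] - requested[1]), v[1]))
    if chosen != requested:
        logger.warning(
            "No build for %d.%d; falling back to %d.%d",
            requested[0], requested[1], chosen[0], chosen[1],
        )
    return chosen
```

With a hypothetical packaged set of `[(11, 8), (12, 6), (12, 8)]`, a request for `(12, 4)` resolves to `(12, 6)` with a warning, while a request for `(10, 2)` returns `None` because no build shares its major version.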

This also reduces our wheel size. That matters more now that #1949 increased the binary sizes, but the result is also well below the v0.49.2 release.

| Platform | v0.49.2 | main | This PR |
| --- | --- | --- | --- |
| Linux x64 | 57.8 MiB | 67.0 MiB | 38.5 MiB |
| Linux aarch64 | 30.0 MiB | 38.2 MiB | 22.6 MiB |
| Windows x64 | 52.8 MiB | 63.4 MiB | 35.2 MiB |

@github-actions

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@matthewdouglas merged commit 9da7109 into main on Jun 22, 2026
128 checks passed
Development

Successfully merging this pull request may close these issues.

Reduce CUDA build matrix