feat(python): package CUDA as an optional extension#8510
|
@robert3005 & @gatesn Lemme know if this directionally makes sense. |
Merging this PR will improve performance by 11.93%
|
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ⚡ | Simulation | bitwise_not_vortex_buffer_mut[128] | 273.6 ns | 244.4 ns | +11.93% |
Comparing ad/pyvortex-cudf (c8ecde6) with develop (d0d400c)
Footnotes
1. 4 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.
671cae2 to c750545
cuda_enabled and vortex-data-cuda
bb250db to 5df00e3
|
@claude have a look. Is there a better way of setting version in pyproject.toml? |
|
Claude review automation is disabled for pull requests that modify Why:
Ask a human reviewer to inspect workflow changes directly. |
robert3005
left a comment
Only the version rewrite is a nit
Keep the base vortex-data wheel CPU-only and model CUDA support as a separate vortex-data-cuda extension package imported as vortex_cuda. The base package exposes vortex.cuda_extension_installed() to check whether the optional extension is importable, while vortex_cuda.cuda_available() performs the runtime CUDA driver/device probe.

Wire vortex-data[cuda] to depend on the exact matching vortex-data-cuda version, and have the CUDA package depend on the exact matching vortex-data version. Release automation updates those pins with the workspace version, but this change intentionally does not add CUDA wheel publishing jobs yet.

Keep vortex-python-cuda out of the uv workspace so uv --all-packages on non-GPU CI does not build the CUDA crate chain. Exercise it through the GPU CUDA workflow instead, and exclude the thin PyO3 shim from broad workspace doc/doctest/instrumented jobs where it has no Rust tests or doctests.

Embed generated PTX into vortex-cuda when nvcc is available so installed wheels do not depend on build-machine kernel file paths at runtime. Runtime kernel loading is embedded-only; the old VORTEX_CUDA_KERNELS_DIR path override and Codspeed kernel staging were removed.

Make cuda_available() resilient on driverless hosts by probing libcuda with cudarc before creating a CUDA context, so CPU-only machines return false instead of panicking.

Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>
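For reviewers, here is a minimal sketch of the two-level check the commit message describes: first ask whether the optional extension is importable, then ask the extension itself whether CUDA is usable. It is not the code in this PR. The helper names `extension_installed` and `cuda_usable` are hypothetical, and the sketch assumes only that the extension module exposes a callable named `cuda_available`, as stated above.

```python
import importlib
import importlib.util


def extension_installed(module_name: str = "vortex_cuda") -> bool:
    """Return True if the module can be located, without importing it."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Broken or partially initialised installs count as "not installed".
        return False


def cuda_usable(module_name: str = "vortex_cuda") -> bool:
    """Return True only if the extension is installed and its probe succeeds."""
    if not extension_installed(module_name):
        return False
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # For example, a native extension that fails to load on this host.
        return False
    probe = getattr(module, "cuda_available", None)
    # A module without the probe function is treated as having no CUDA support.
    return bool(probe()) if callable(probe) else False
```

Keeping the importability check separate from the runtime probe means a CPU-only install never has to load the CUDA package just to answer "is the extension there?".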
Keeps the base `vortex-data` wheel CPU-only and models CUDA support as a separate `vortex-data-cuda` extension package imported as `vortex_cuda`. The base package exposes `vortex.cuda_extension_installed()` to check whether the extension is importable, while `vortex_cuda.cuda_available()` performs the runtime CUDA probe.

Wires `vortex-data[cuda]` to depend on the exact matching `vortex-data-cuda` version, and has the CUDA package depend on the exact matching `vortex-data` version. Release automation updates those pins with the workspace version, but this change intentionally does not add CUDA wheel publishing jobs yet.

Keeps `vortex-python-cuda` out of the uv workspace so `uv --all-packages` on non-GPU CI does not build the CUDA crate chain. Adds explicit CUDA Python checks to the GPU workflow instead, including the uv setup needed to run them.

Embeds generated PTX into `vortex-cuda` when available so installed wheels do not depend on build-machine kernel file paths at runtime.
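To make the exact-pin step concrete, here is a rough sketch of how a release script could rewrite a `package==version` pin in `pyproject.toml` text. It is an illustration only, not the release automation in this PR. The function name `rewrite_exact_pin` and the sample TOML keys are hypothetical, and the sketch assumes each pin appears as a double-quoted `"name==version"` string.

```python
import re


def rewrite_exact_pin(pyproject_text: str, package: str, version: str) -> str:
    """Replace the version in every quoted "package==<old>" pin with `version`."""
    # The opening quote anchors the name, so rewriting "vortex-data" leaves
    # "vortex-data-cuda" untouched, and vice versa.
    pattern = re.compile(rf'("{re.escape(package)}==)[^"]+(")')
    # A function replacement avoids backslash/group escapes in `version`.
    return pattern.sub(lambda m: f"{m.group(1)}{version}{m.group(2)}", pyproject_text)


# Hypothetical fragment; the real pyproject.toml layout may differ.
sample = (
    'dependencies = ["vortex-data==0.1.0"]\n'
    'cuda = ["vortex-data-cuda==0.1.0"]\n'
)
updated = rewrite_exact_pin(sample, "vortex-data-cuda", "0.2.0")
```

A text-level rewrite like this keeps comments and formatting in the file intact, at the cost of relying on the pin's exact quoting style.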