gpu compatible write strategy, move compact strategy to use btrblocks with zstd and pco #6322
onursatici wants to merge 7 commits into develop
Signed-off-by: Onur Satici <onur@spiraldb.com>
CodSpeed Performance Report: Merging this PR will improve performance by 32.92%.
/// Configure a write strategy that emits only CUDA-compatible encodings.
///
/// This keeps the default write layout pipeline, but:
/// - Restricts flat-layout normalization to [`GPU_ALLOWED_ENCODINGS`]
/// - Configures BtrBlocks to exclude schemes without CUDA kernel support
#[cfg(feature = "zstd")]
pub fn with_cuda_compatible_encodings(mut self) -> Self {
    let btrblocks = BtrBlocksCompressorBuilder::default()
        .exclude_int([IntCode::Sparse, IntCode::Rle])
        .exclude_float([FloatCode::AlpRd, FloatCode::Rle, FloatCode::Sparse])
        // Keep string schemes disabled in btrblocks; when `zstd` feature is enabled, we
        // separately encode string/binary leaves as Zstd (without dictionaries).
        .exclude_string([
            StringCode::Dict,
            StringCode::Fsst,
            StringCode::Constant,
            StringCode::Sparse,
        ])
        .build();

    self.compressor = Some(Arc::new(GpuCompatibleCompressor::new(btrblocks)));
    self.allow_encodings = Some((*GPU_ALLOWED_ENCODINGS).clone());
    self
}
this should be done by the caller?
it can, but now compact also doesn't use a custom compressor, so it is useful to keep these as a convenience imo
is the `cfg` zstd here the new one (the CUDA zstd) or the regular one?
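For readers following the thread, the exclusion-style configuration in `with_cuda_compatible_encodings` can be sketched in isolation. This is a minimal toy model, not the real `BtrBlocksCompressorBuilder`: `ToyCompressorBuilder`, `ALL_INT_CODES`, `cuda_compatible_int_codes`, and the `Constant`, `BitPacking`, and `Dict` variants are hypothetical names chosen for illustration. Only `IntCode::Sparse` and `IntCode::Rle` come from the diff.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the integer scheme codes.
/// Only `Sparse` and `Rle` are taken from the diff; the rest are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum IntCode {
    Constant,
    BitPacking,
    Dict,
    Rle,
    Sparse,
}

/// Every integer scheme this toy compressor knows about (assumed list).
pub const ALL_INT_CODES: [IntCode; 5] = [
    IntCode::Constant,
    IntCode::BitPacking,
    IntCode::Dict,
    IntCode::Rle,
    IntCode::Sparse,
];

/// Toy builder: starts with every scheme enabled and removes excluded ones.
#[derive(Debug, Clone)]
pub struct ToyCompressorBuilder {
    int_codes: BTreeSet<IntCode>,
}

impl Default for ToyCompressorBuilder {
    fn default() -> Self {
        Self {
            int_codes: ALL_INT_CODES.iter().copied().collect(),
        }
    }
}

impl ToyCompressorBuilder {
    /// Remove the given codes from the enabled set; returns `self` for chaining.
    pub fn exclude_int(mut self, codes: impl IntoIterator<Item = IntCode>) -> Self {
        for code in codes {
            self.int_codes.remove(&code);
        }
        self
    }

    /// Finish building and return the enabled codes in declaration order.
    pub fn build(self) -> Vec<IntCode> {
        self.int_codes.into_iter().collect()
    }
}

/// Mirrors the integer exclusions from the diff: no `Sparse`, no `Rle`.
pub fn cuda_compatible_int_codes() -> Vec<IntCode> {
    ToyCompressorBuilder::default()
        .exclude_int([IntCode::Sparse, IntCode::Rle])
        .build()
}
```

Calling `cuda_compatible_int_codes()` in this sketch yields the remaining three codes. Whether the convenience method or the caller owns this list is exactly the question raised in the thread; the sketch only shows the mechanics.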
Benchmarks: PolarSignals Profiling
Benchmarks: TPC-H SF=1 on NVME
Benchmarks: FineWeb NVMe
Benchmarks: TPC-DS SF=1 on NVME
Benchmarks: TPC-H SF=1 on S3
Benchmarks: TPC-H SF=10 on NVME
Benchmarks: FineWeb S3
Benchmarks: Statistical and Population Genetics
Benchmarks: TPC-H SF=10 on S3
Benchmarks: Clickbench on NVME
&SequenceScheme,
&RLE_INTEGER_SCHEME,
#[cfg(feature = "pco")]
&PcoScheme,
this must not be in the default
joseph-isaacs left a comment
make sure the default strategy doesn't have PCO or zstd
default is filtering out pco and zstd
int_schemes: ALL_INT_SCHEMES
    .iter()
    .copied()
    .filter(|s| s.code() != IntCode::Pco)
here is currently where we exclude pco and zstd from the default
int_schemes: ALL_INT_SCHEMES
    .iter()
    .copied()
    .filter(|s| s.code() != IntCode::Pco)
shall we have an ALL_DEFAULT_SCHEMES next to all schemes?
it's just that people will forget to omit other schemes
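The trade-off raised here can be sketched as a denylist versus an allowlist. The following is a minimal illustration under stated assumptions, not the code in this PR: `IntCode` is a three-variant stand-in, `ALL_DEFAULT_SCHEMES` is the name proposed in review, and `default_by_filter`, `default_by_allowlist`, and `defaults_are_known` are hypothetical helpers.

```rust
/// Hypothetical stand-in for scheme codes; variant names follow the
/// schemes visible in the diff (Sequence, Rle, Pco).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IntCode {
    Sequence,
    Rle,
    Pco,
}

/// Every scheme that exists, including opt-in ones such as `Pco`.
pub const ALL_INT_SCHEMES: &[IntCode] = &[IntCode::Sequence, IntCode::Rle, IntCode::Pco];

/// Explicit allowlist for the default strategy (name proposed in review).
pub const ALL_DEFAULT_SCHEMES: &[IntCode] = &[IntCode::Sequence, IntCode::Rle];

/// Denylist approach, as in the diff: start from everything and filter
/// out opt-in codes. A newly added opt-in scheme leaks into the default
/// unless someone remembers to extend this filter.
pub fn default_by_filter() -> Vec<IntCode> {
    ALL_INT_SCHEMES
        .iter()
        .copied()
        .filter(|c| *c != IntCode::Pco)
        .collect()
}

/// Allowlist approach: a scheme is in the default only if listed explicitly.
pub fn default_by_allowlist() -> Vec<IntCode> {
    ALL_DEFAULT_SCHEMES.to_vec()
}

/// Sanity check: every default scheme must also appear in the full list.
pub fn defaults_are_known() -> bool {
    ALL_DEFAULT_SCHEMES
        .iter()
        .all(|c| ALL_INT_SCHEMES.contains(c))
}
```

With the current variants both functions return the same list. They diverge only when a new opt-in variant is appended to `ALL_INT_SCHEMES`, which is the failure mode the comment above is worried about.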
joseph-isaacs left a comment
Add the default schemes and we can merge
joseph-isaacs left a comment
need to fix the python test, this shouldn't change
File "../vortex-python/python/vortex/io.py", line ?, in default
Failed example:
os.path.getsize('tiny.vortex')
Expected:
55116
Got:
55316
No description provided.