Write vortex-compact with Zstdbuffers instead of Zstd with unstable_encodings #8542
myrrc wants to merge 1 commit
Merging this PR will not alter performance
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | Simulation | chunked_varbinview_into_canonical[(1000, 10)] | 168.9 µs | 205.7 µs | -17.89% |
| ⚡ | Simulation | chunked_varbinview_canonical_into[(1000, 10)] | 191 µs | 154.6 µs | +23.52% |
| ⚡ | Simulation | bitwise_not_vortex_buffer_mut[128] | 244.4 ns | 215.3 ns | +13.55% |
| ⚡ | Simulation | chunked_varbinview_into_canonical[(100, 100)] | 306 µs | 271.7 µs | +12.66% |
| ⚡ | Simulation | bitwise_not_vortex_buffer_mut[1024] | 304.7 ns | 275.6 ns | +10.58% |
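The Efficiency column appears to be consistent with (BASE / HEAD - 1) × 100. This formula is inferred from the rows above, not taken from CodSpeed documentation, so treat the following as a sketch under that assumption. The names `ROWS`, `efficiency_percent`, and `report` are made up for illustration. Because the displayed timings are rounded, the recomputed values agree with the table only to within roughly 0.05 percentage points.

```rust
/// Rows copied from the table above: (benchmark, BASE, HEAD).
/// BASE and HEAD of a row share the same unit (µs or ns), so the ratio is unitless.
const ROWS: [(&str, f64, f64); 5] = [
    ("chunked_varbinview_into_canonical[(1000, 10)]", 168.9, 205.7),
    ("chunked_varbinview_canonical_into[(1000, 10)]", 191.0, 154.6),
    ("bitwise_not_vortex_buffer_mut[128]", 244.4, 215.3),
    ("chunked_varbinview_into_canonical[(100, 100)]", 306.0, 271.7),
    ("bitwise_not_vortex_buffer_mut[1024]", 304.7, 275.6),
];

/// Relative change of HEAD versus BASE, in percent.
/// Assumption: efficiency = (BASE / HEAD - 1) * 100, inferred from the table.
/// Negative values mean HEAD is slower than BASE.
fn efficiency_percent(base: f64, head: f64) -> f64 {
    (base / head - 1.0) * 100.0
}

/// Formats every row as "name: +x.xx%" using the assumed formula.
fn report() -> Vec<String> {
    ROWS.iter()
        .map(|(name, base, head)| {
            format!("{name}: {:+.2}%", efficiency_percent(*base, *head))
        })
        .collect()
}
```

Calling `report()` and printing each entry reproduces the sign of every row: one negative value for the regression and four positive values for the improvements.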
Comparing myrrc/bench-zstdbuffers-byte-length (bb05df0) with develop (5a764e6)
I've experimented with implementing ByteLength and CompareKernels, but this didn't yield any performance gains.
f0c8795 to fef6209
Signed-off-by: Mikhail Kot <mikhail@spiraldb.com>
fef6209 to bb05df0
Benchmarks: PolarSignals Profiling (base)
Vortex (geomean): 0.958x ➖
datafusion / vortex-file-compressed (0.958x ➖, 1↑ 0↓)
No file size changes detected.
Benchmarks: FineWeb NVMe (base)
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.005x ➖, 1↑ 0↓)
datafusion / parquet (0.996x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.008x ➖, 0↑ 0↓)
duckdb / parquet (1.014x ➖, 0↑ 1↓)
File Size Changes (3 files changed, -46.3% overall, 1↑ 2↓)
Benchmarks: TPC-H SF=1 on NVME (base)
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.983x ➖, 0↑ 0↓)
datafusion / parquet (0.982x ➖, 1↑ 0↓)
datafusion / arrow (0.984x ➖, 1↑ 1↓)
duckdb / vortex-file-compressed (0.955x ➖, 2↑ 0↓)
duckdb / parquet (0.991x ➖, 2↑ 2↓)
File Size Changes (17 files changed, -44.3% overall, 5↑ 12↓)
Benchmarks: TPC-DS SF=1 on NVME (base)
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.905x ➖, 44↑ 0↓)
datafusion / parquet (0.920x ➖, 27↑ 0↓)
duckdb / vortex-file-compressed (0.913x ➖, 36↑ 1↓)
duckdb / parquet (0.946x ➖, 8↑ 0↓)
File Size Changes (30 files changed, -43.4% overall, 0↑ 30↓)
Benchmarks: FineWeb S3 (base)
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.915x ➖, 1↑ 0↓)
datafusion / parquet (1.116x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.103x ➖, 0↑ 1↓)
duckdb / parquet (1.026x ➖, 0↑ 0↓)
Benchmarks: Statistical and Population Genetics (base)
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (0.998x ➖, 0↑ 0↓)
duckdb / parquet (1.001x ➖, 0↑ 0↓)
File Size Changes (3 files changed, -32.3% overall, 1↑ 2↓)
Benchmarks: TPC-H SF=10 on NVME (base)
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.995x ➖, 0↑ 0↓)
datafusion / parquet (0.990x ➖, 0↑ 0↓)
datafusion / arrow (0.985x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (0.990x ➖, 0↑ 0↓)
duckdb / parquet (0.991x ➖, 0↑ 0↓)
File Size Changes (47 files changed, -44.4% overall, 10↑ 37↓)
Benchmarks: Clickbench on NVME (base)
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.009x ➖, 0↑ 0↓)
datafusion / parquet (0.995x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (0.934x ➖, 12↑ 0↓)
duckdb / parquet (0.979x ➖, 2↑ 1↓)
File Size Changes (201 files changed, -39.1% overall, 50↑ 151↓)
Benchmarks: TPC-H SF=1 on S3 (base)
Verdict: No clear signal (environment too noisy)
datafusion / vortex-file-compressed (1.071x ➖, 1↑ 4↓)
datafusion / parquet (1.008x ➖, 5↑ 6↓)
duckdb / vortex-file-compressed (1.039x ➖, 0↑ 1↓)
duckdb / parquet (1.132x ➖, 0↑ 2↓)
Benchmarks: PolarSignals Profiling
Vortex (geomean): 1.011x ➖
datafusion / vortex-file-compressed (1.011x ➖, 0↑ 0↓)
No file size changes detected.
🚨🚨🚨❌❌❌ SQL BENCHMARK FAILED ❌❌❌🚨🚨🚨
Benchmarks: FineWeb NVMe
Verdict: Likely improvement (low confidence)
datafusion / vortex-file-compressed (1.104x ❌, 0↑ 3↓)
datafusion / vortex-compact (0.493x ✅, 7↑ 2↓)
datafusion / parquet (1.075x ➖, 0↑ 4↓)
duckdb / vortex-file-compressed (1.064x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.513x ✅, 7↑ 1↓)
duckdb / parquet (1.064x ➖, 0↑ 2↓)
File Size Changes (2 files changed, +6.5% overall, 1↑ 1↓)
Benchmarks: TPC-H SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.977x ➖, 0↑ 0↓)
datafusion / vortex-compact (0.960x ➖, 1↑ 0↓)
datafusion / parquet (0.955x ➖, 3↑ 0↓)
datafusion / arrow (0.968x ➖, 1↑ 1↓)
duckdb / vortex-file-compressed (0.943x ➖, 3↑ 0↓)
duckdb / vortex-compact (0.888x ✅, 14↑ 0↓)
duckdb / parquet (0.993x ➖, 3↑ 2↓)
duckdb / duckdb (0.938x ➖, 1↑ 0↓)
File Size Changes (16 files changed, -1.3% overall, 9↑ 7↓)
Benchmarks: FineWeb S3
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.807x ➖, 1↑ 0↓)
datafusion / vortex-compact (0.894x ➖, 1↑ 0↓)
datafusion / parquet (1.007x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.026x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.985x ➖, 0↑ 0↓)
duckdb / parquet (0.979x ➖, 0↑ 0↓)
Benchmarks: Statistical and Population Genetics
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (1.010x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.004x ➖, 0↑ 0↓)
duckdb / parquet (1.013x ➖, 0↑ 0↓)
File Size Changes (2 files changed, +0.1% overall, 2↑ 0↓)
Benchmarks: TPC-H SF=10 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.997x ➖, 0↑ 0↓)
datafusion / vortex-compact (0.985x ➖, 1↑ 0↓)
datafusion / parquet (0.998x ➖, 0↑ 0↓)
datafusion / arrow (0.993x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.003x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.986x ➖, 1↑ 0↓)
duckdb / parquet (0.997x ➖, 0↑ 0↓)
duckdb / duckdb (0.999x ➖, 0↑ 0↓)
File Size Changes (46 files changed, -1.2% overall, 16↑ 30↓)
Benchmarks: TPC-H SF=1 on S3
Verdict: No clear signal (environment too noisy)
datafusion / vortex-file-compressed (0.886x ➖, 3↑ 0↓)
datafusion / vortex-compact (0.981x ➖, 0↑ 1↓)
datafusion / parquet (0.919x ➖, 4↑ 2↓)
duckdb / vortex-file-compressed (0.980x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.067x ➖, 0↑ 0↓)
duckdb / parquet (1.005x ➖, 0↑ 0↓)
Benchmarks: Clickbench on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.039x ➖, 0↑ 2↓)
datafusion / parquet (1.024x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (0.956x ➖, 5↑ 0↓)
duckdb / parquet (0.993x ➖, 0↑ 0↓)
duckdb / duckdb (0.973x ➖, 0↑ 0↓)
File Size Changes (201 files changed, +6.1% overall, 153↑ 48↓)
Benchmarks: Appian on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.003x ➖, 0↑ 0↓)
datafusion / parquet (0.994x ➖, 0↑ 0↓)
duckdb / vortex-file-compressed (1.002x ➖, 0↑ 0↓)
duckdb / parquet (1.021x ➖, 0↑ 0↓)
duckdb / duckdb (0.996x ➖, 0↑ 0↓)
File Size Changes (12 files changed, +0.4% overall, 7↑ 5↓)
Benchmarks: TPC-H SF=10 on S3
Verdict: No clear signal (environment too noisy)
datafusion / vortex-file-compressed (0.993x ➖, 0↑ 0↓)
datafusion / vortex-compact (1.049x ➖, 0↑ 2↓)
datafusion / parquet (0.923x ➖, 2↑ 1↓)
duckdb / vortex-file-compressed (1.111x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.055x ➖, 0↑ 1↓)
duckdb / parquet (1.106x ➖, 0↑ 0↓)
Wins: FineWeb: performance improvements on all queries, up to 4x (q2).
Losses:
cc @onursatici
If the unstable_encodings feature is set (as it is in CI, for example), register ZstdBuffers instead of Zstd as the default write strategy for vortex-compact. This allows using byte_length without decompressing the Zstd data.
This brings a local Clickbench Q27 run down from 450 ms to 190 ms.
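The selection rule described above can be sketched roughly as follows. This is an illustration only: `CompactStrategy`, `default_compact_strategy`, and the boolean parameter are hypothetical names, not taken from the Vortex codebase. The boolean stands in for the unstable_encodings Cargo feature, which the real code presumably checks at compile time rather than at run time.

```rust
/// Hypothetical stand-in for the two write strategies named in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CompactStrategy {
    /// Existing default.
    Zstd,
    /// Used when unstable encodings are enabled.
    ZstdBuffers,
}

/// Picks the default strategy for vortex-compact writes.
///
/// Assumption: `unstable_encodings` mirrors the Cargo feature of the same
/// name. Real code would more likely use `cfg!(feature = "unstable_encodings")`
/// or a `#[cfg(...)]` attribute instead of a runtime argument.
pub fn default_compact_strategy(unstable_encodings: bool) -> CompactStrategy {
    if unstable_encodings {
        CompactStrategy::ZstdBuffers
    } else {
        CompactStrategy::Zstd
    }
}
```

Passing `true` (the CI configuration, per the description) selects `ZstdBuffers`; passing `false` keeps `Zstd`.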
Resolves: #8541