Use smallvec for internal stats storage #7823
a10y left a comment
Do we think these allocations are significant?
I am not sure; I'm trying things out to see the benchmarks.
Merging this PR will degrade performance by 23.06%

| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ⚡ | Simulation | chunked_constant_i32_append_to_builder[(1000, 10)] | 40.6 µs | 30.1 µs | +34.78% |
| ⚡ | Simulation | chunked_bool_canonical_into[(1000, 10)] | 57.2 µs | 50.3 µs | +13.74% |
| ⚡ | Simulation | varbinview_large | 436.3 µs | 174.5 µs | ×2.5 |
| ❌ | Simulation | patched_take_10k_contiguous_patches | 228.6 µs | 288.2 µs | -20.66% |
| ❌ | Simulation | take_10k_contiguous | 270.9 µs | 329.1 µs | -17.69% |
| ❌ | Simulation | patched_take_10k_random | 241 µs | 300.1 µs | -19.7% |
| ❌ | Simulation | take_10k_random | 194.9 µs | 253.4 µs | -23.06% |
| ⚡ | Simulation | null_count_run_end[(10000, 1024, 0.01)] | 4 µs | 3.2 µs | +24.46% |
| ⚡ | Simulation | null_count_run_end[(10000, 256, 0.01)] | 4 µs | 3.2 µs | +25.61% |
| ⚡ | Simulation | null_count_run_end[(100000, 1024, 0.01)] | 4 µs | 3.2 µs | +25.61% |
| ❌ | Simulation | decompress_rd[f64, (10000, 0.1)] | 121.8 µs | 136.6 µs | -10.83% |
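As a reading aid: the Efficiency column looks consistent with BASE / HEAD - 1, and the headline 23.06% matches the worst row (take_10k_random). That formula is inferred from the numbers in this table, not taken from the benchmark tool's documentation, and it only reproduces the reported values to within about a percentage point, presumably because the displayed times are rounded. A minimal Rust sketch, with `efficiency_percent` as a hypothetical helper name:

```rust
/// Relative change in percent, computed as BASE / HEAD - 1.
/// Assumption: this formula is inferred from the table rows, not from the tool's docs.
/// Positive means HEAD is faster than BASE; negative means HEAD is slower.
pub fn efficiency_percent(base_us: f64, head_us: f64) -> f64 {
    (base_us / head_us - 1.0) * 100.0
}

/// True if the recomputed value is within `tolerance` percentage points of `reported`.
pub fn roughly_matches(base_us: f64, head_us: f64, reported: f64, tolerance: f64) -> bool {
    (efficiency_percent(base_us, head_us) - reported).abs() < tolerance
}
```

For example, `roughly_matches(194.9, 253.4, -23.06, 0.2)` holds for the take_10k_random row, and `roughly_matches(40.6, 30.1, 34.78, 0.2)` holds for the first row.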
Comparing rk/statssmallvec (dbd087e) with develop (115b3ba)
Benchmarks: PolarSignals Profiling
Vortex (geomean): 1.015x ➖
datafusion / vortex-file-compressed (1.015x ➖, 0↑ 1↓)
File Sizes: PolarSignals Profiling
No file size changes detected.
Benchmarks: FineWeb NVMe
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (1.047x ➖, 0↑ 2↓)
datafusion / vortex-compact (1.029x ➖, 0↑ 1↓)
datafusion / parquet (1.015x ➖, 0↑ 1↓)
duckdb / vortex-file-compressed (1.006x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.990x ➖, 0↑ 0↓)
duckdb / parquet (1.025x ➖, 0↑ 1↓)
File Sizes: FineWeb NVMe
No file size changes detected.
Benchmarks: TPC-H SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.863x ✅, 20↑ 0↓)
datafusion / vortex-compact (0.886x ✅, 17↑ 0↓)
datafusion / parquet (0.876x ✅, 14↑ 0↓)
datafusion / arrow (0.850x ✅, 15↑ 0↓)
duckdb / vortex-file-compressed (0.837x ✅, 21↑ 0↓)
duckdb / vortex-compact (0.871x ✅, 18↑ 0↓)
duckdb / parquet (0.911x ➖, 9↑ 0↓)
duckdb / duckdb (0.871x ✅, 16↑ 0↓)
File Sizes: TPC-H SF=1 on NVME
File Size Changes (195 files changed, -98.4% overall, 0↑ 195↓)
Totals:
Benchmarks: TPC-DS SF=1 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.962x ➖, 3↑ 0↓)
datafusion / vortex-compact (0.982x ➖, 1↑ 1↓)
datafusion / parquet (0.983x ➖, 2↑ 2↓)
duckdb / vortex-file-compressed (0.981x ➖, 4↑ 2↓)
duckdb / vortex-compact (0.978x ➖, 0↑ 1↓)
duckdb / parquet (0.990x ➖, 3↑ 1↓)
duckdb / duckdb (0.961x ➖, 3↑ 0↓)
File Sizes: TPC-DS SF=1 on NVME
No file size changes detected.
Benchmarks: FineWeb S3
Verdict: No clear signal (environment too noisy)
datafusion / vortex-file-compressed (0.976x ➖, 0↑ 0↓)
datafusion / vortex-compact (1.115x ➖, 0↑ 1↓)
datafusion / parquet (0.910x ➖, 1↑ 0↓)
duckdb / vortex-file-compressed (0.970x ➖, 0↑ 0↓)
duckdb / vortex-compact (1.056x ➖, 0↑ 0↓)
duckdb / parquet (0.978x ➖, 0↑ 0↓)
Benchmarks: Statistical and Population Genetics
Verdict: No clear signal (low confidence)
duckdb / vortex-file-compressed (0.999x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.984x ➖, 0↑ 0↓)
duckdb / parquet (0.999x ➖, 0↑ 0↓)
File Sizes: Statistical and Population Genetics
No file size changes detected.
Benchmarks: Random Access
Vortex (geomean): 0.949x ➖
unknown / unknown (0.972x ➖, 4↑ 0↓)
Benchmarks: TPC-H SF=10 on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.924x ➖, 9↑ 0↓)
datafusion / vortex-compact (0.915x ➖, 8↑ 0↓)
datafusion / parquet (0.938x ➖, 2↑ 0↓)
datafusion / arrow (0.914x ➖, 8↑ 0↓)
duckdb / vortex-file-compressed (0.965x ➖, 2↑ 0↓)
duckdb / vortex-compact (0.956x ➖, 1↑ 0↓)
duckdb / parquet (0.970x ➖, 1↑ 0↓)
duckdb / duckdb (0.963x ➖, 1↑ 0↓)
File Sizes: TPC-H SF=10 on NVME
No file size changes detected.
Benchmarks: Clickbench on NVME
Verdict: No clear signal (low confidence)
datafusion / vortex-file-compressed (0.957x ➖, 1↑ 1↓)
datafusion / parquet (0.958x ➖, 1↑ 0↓)
duckdb / vortex-file-compressed (0.959x ➖, 4↑ 1↓)
duckdb / parquet (0.971x ➖, 0↑ 0↓)
duckdb / duckdb (0.979x ➖, 1↑ 0↓)
File Sizes: Clickbench on NVME
File Size Changes (1 files changed, -0.0% overall, 0↑ 1↓)
Totals:
Benchmarks: TPC-H SF=1 on S3
Verdict: No clear signal (environment too noisy)
datafusion / vortex-file-compressed (0.940x ➖, 1↑ 1↓)
datafusion / vortex-compact (0.895x ➖, 2↑ 0↓)
datafusion / parquet (0.889x ➖, 1↑ 0↓)
duckdb / vortex-file-compressed (0.945x ➖, 0↑ 0↓)
duckdb / vortex-compact (0.958x ➖, 0↑ 0↓)
duckdb / parquet (0.962x ➖, 0↑ 0↓)
Benchmarks: Compression
Vortex (geomean): 1.006x ➖
unknown / unknown (1.004x ➖, 2↑ 5↓)
Looks like, microbenchmark-wise, this is a win, since we no longer have an indirect vec allocation and instead batch it into the owner Arc.
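To illustrate the point about indirect allocation: when the owner holds a `Vec`, the entries sit in a second heap block, whereas inline storage places them inside the owner's own `Arc` allocation. The sketch below uses only the standard library; `Stat`, `OwnerWithVec`, `OwnerInline`, and `lies_within` are hypothetical names, not the types touched by this PR, and the plain array stands in for smallvec's inline buffer.

```rust
use std::mem::size_of;
use std::sync::Arc;

/// Hypothetical stat entry; the real stats types are not shown in this thread.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Stat {
    pub id: u8,
    pub value: i64,
}

/// Layout before: the owner stores a `Vec`, whose buffer is its own heap block.
pub struct OwnerWithVec {
    pub stats: Vec<Stat>,
}

/// Layout after: the entries are a field of the owner, so `Arc::new` covers them.
/// (A plain array stands in for smallvec's inline buffer here.)
pub struct OwnerInline {
    pub len: usize,
    pub stats: [Stat; 4],
}

/// True if `addr` falls inside the bytes occupied by `outer`.
pub fn lies_within<T>(outer: &T, addr: usize) -> bool {
    let start = outer as *const T as usize;
    (start..start + size_of::<T>()).contains(&addr)
}

/// Returns (vec_buffer_is_inside_owner, inline_entries_are_inside_owner).
pub fn compare_layouts() -> (bool, bool) {
    let entry = Stat { id: 0, value: 42 };

    // Two allocations: one for the Arc'd owner, one for the Vec buffer.
    let with_vec = Arc::new(OwnerWithVec { stats: vec![entry] });
    let vec_inside = lies_within(&*with_vec, with_vec.stats.as_ptr() as usize);

    // One allocation: the entries live inside the Arc'd owner.
    let inline = Arc::new(OwnerInline { len: 1, stats: [entry; 4] });
    let inline_inside = lies_within(&*inline, inline.stats.as_ptr() as usize);

    (vec_inside, inline_inside)
}
```

Calling `compare_layouts()` returns `(false, true)`: the `Vec` buffer lives outside the owner, while the inline entries are part of it.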
Benchmarks: TPC-H SF=10 on S3
Verdict: No clear signal (environment too noisy)
datafusion / vortex-file-compressed (0.910x ➖, 1↑ 0↓)
datafusion / vortex-compact (0.923x ➖, 0↑ 0↓)
datafusion / parquet (0.923x ➖, 1↑ 0↓)
duckdb / vortex-file-compressed (0.851x ➖, 2↑ 0↓)
duckdb / vortex-compact (0.963x ➖, 0↑ 0↓)
duckdb / parquet (0.892x ➖, 0↑ 0↓)
Signed-off-by: Robert Kruszewski <github@robertk.io>
14e3931 to 3881650
I made the smallvec 4 elements in size. Empirically, we only ever populate more than that if we write or if there are multiple operations.
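A rough sketch of the behaviour being described, assuming an inline capacity of 4: the first four entries need no separate allocation, and the fifth push spills to the heap. This is a hand-rolled, std-only stand-in (`SmallStats` is a hypothetical name) so that it runs without dependencies; the PR itself uses the smallvec crate, whose internals differ.

```rust
/// Minimal small-vector sketch: up to `N` entries inline, then a heap `Vec`.
/// Illustration only; this is not how the smallvec crate is implemented.
pub struct SmallStats<T: Copy + Default, const N: usize> {
    inline: [T; N],
    len: usize, // number of used slots in `inline` while not spilled
    spilled: Option<Vec<T>>,
}

impl<T: Copy + Default, const N: usize> SmallStats<T, N> {
    pub fn new() -> Self {
        SmallStats {
            inline: [T::default(); N],
            len: 0,
            spilled: None,
        }
    }

    pub fn push(&mut self, value: T) {
        // Already on the heap: behave like a normal Vec.
        if let Some(heap) = self.spilled.as_mut() {
            heap.push(value);
            return;
        }
        // Common case: room left in the inline buffer, no allocation.
        if self.len < N {
            self.inline[self.len] = value;
            self.len += 1;
            return;
        }
        // Inline buffer is full: copy it to the heap once, then append.
        let mut heap = self.inline.to_vec();
        heap.push(value);
        self.spilled = Some(heap);
    }

    /// True once the entries have moved to a heap allocation.
    pub fn is_spilled(&self) -> bool {
        self.spilled.is_some()
    }

    pub fn as_slice(&self) -> &[T] {
        match &self.spilled {
            Some(heap) => heap.as_slice(),
            None => &self.inline[..self.len],
        }
    }
}
```

With `SmallStats<u32, 4>`, pushing four values leaves `is_spilled()` false; one more push flips it to true while `as_slice()` keeps all five entries in order. Choosing the inline capacity to cover the common case is what avoids the extra allocation on most paths.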
Instead of always allocating a fixed-size vec, we can use a smallvec.
Signed-off-by: Robert Kruszewski <github@robertk.io>