perf: optimize native shuffle struct field processing with field-major order by andygrove · Pull Request #3224 · apache/datafusion-comet

andygrove · 2026-01-20T19:20:15Z

Summary

Optimizes struct field processing in native shuffle by using field-major instead of row-major order. This moves type dispatch from O(rows × fields) to O(fields), eliminating per-row type matching overhead.

The problem:
Previously, for each row we iterated over all fields and called append_field() which did a type match for EVERY field in EVERY row. For a struct with N fields and M rows, that's N×M type matches where the types never change.

// Old approach - row-major order
for row in rows {                           // M rows
    for (idx, field) in fields.iter() {     // N fields
        append_field(field.data_type(), ...);  // Type match happens here
    }
}
// Total: M × N type matches

The solution:
Field-major processing with two passes:

First pass: Loop over rows, build struct validity
Second pass: For each field, get typed builder once, then process all rows for that field

// New approach - field-major order
// Pass 1: Build struct validity
for row in rows {
    struct_builder.append(is_valid);
}

// Pass 2: Process fields
for (field_idx, field) in fields.iter() {   // N fields
    match field.data_type() {                // Type match ONCE per field
        DataType::Int32 => {
            let builder = struct_builder.field_builder::<Int32Builder>(field_idx);
            for row in rows {                // M rows
                builder.append_value(...);   // No type match
            }
        }
        // ... other types
    }
}
// Total: N type matches

This reduces type dispatch from O(rows × fields) to O(fields).

For complex nested types (struct, list, map), falls back to existing append_field since they have their own recursive processing logic.

Benchmark Results

Benchmark for converting Spark UnsafeRow with struct columns to Arrow arrays:

Test Case	Baseline	Optimized	Speedup
fields_5/rows_1000	76.8µs	63.0µs	1.22x
fields_5/rows_10000	384µs	238µs	1.61x
fields_10/rows_1000	110µs	85.0µs	1.30x
fields_10/rows_10000	699µs	429µs	1.63x
fields_20/rows_1000	183µs	128µs	1.43x
fields_20/rows_10000	1347µs	870µs	1.55x

The optimization shows 1.2x-1.6x speedup, with larger benefits for:

More rows (larger batches benefit more from reduced per-row overhead)
More struct fields (more fields = more type dispatches avoided)

Test plan

All Rust tests pass (115 tests)
Native shuffle tests pass (16 tests)
Fuzz tests pass (120 tests)
Clippy clean
Added benchmark (native/core/benches/struct_conversion.rs)

🤖 Generated with Claude Code

Optimize struct field processing in native shuffle by using field-major instead of row-major order. This moves type dispatch from O(rows × fields) to O(fields), eliminating per-row type matching overhead. Previously, for each row we iterated over all fields and called `append_field()` which did a type match for EVERY field in EVERY row. For a struct with N fields and M rows, that's N×M type matches. The new approach: 1. First pass: Loop over rows, build struct validity 2. Second pass: For each field, get typed builder once, then process all rows for that field This keeps type dispatch at O(fields) instead of O(rows × fields). For complex nested types (struct, list, map), falls back to existing `append_field` since they have their own recursive processing logic. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

andygrove · 2026-01-20T19:24:24Z

@sqlbenchmark run tpch

Add a Criterion benchmark to measure the performance of struct column processing in native shuffle. Tests various struct sizes (5, 10, 20 fields) and row counts (1K, 10K rows). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

codecov-commenter · 2026-01-20T20:03:38Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 60.02%. Comparing base (f09f8af) to head (f8fe7ba).
⚠️ Report is 858 commits behind head on main.

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #3224      +/-   ##
============================================
+ Coverage     56.12%   60.02%   +3.89%     
- Complexity      976     1429     +453     
============================================
  Files           119      170      +51     
  Lines         11743    15746    +4003     
  Branches       2251     2602     +351     
============================================
+ Hits           6591     9451    +2860     
- Misses         4012     4976     +964     
- Partials       1140     1319     +179

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

sqlbenchmark · 2026-01-20T20:20:48Z

Comet TPC-H Benchmark Results

Commit: f8fe7ba - fix: remove unused import and mut in benchmark
Scale Factor: SF100
Iterations: 1

Query Times

Query	Time (s)	Query	Time (s)
Q1	10.75	Q12	6.83
Q2	5.88	Q13	6.74
Q3	9.75	Q14	3.66
Q4	11.43	Q15	7.35
Q5	19.18	Q16	4.88
Q6	2.55	Q17	32.45
Q7	11.93	Q18	33.35
Q8	24.34	Q19	6.76
Q9	37.60	Q20	6.78
Q10	10.32	Q21	45.96
Q11	4.43	Q22	5.02

Total Time: 307.94 seconds

Spark Configuration

Setting	Value
Spark Master	`local[*]`
Driver Memory	32G
Driver Cores	8
Executor Memory	32G
Executor Cores	8
Off-Heap Enabled	true
Off-Heap Size	24g
Shuffle Manager	`CometShuffleManager`
Comet Replace SMJ	true

Automated benchmark run by dfbench

sqlbenchmark · 2026-01-20T21:06:06Z

Comet TPC-H Benchmark Results

Commit: f8fe7ba - fix: remove unused import and mut in benchmark
Scale Factor: SF100
Iterations: 1

Query Times

Query	Time (s)	Query	Time (s)
Q1	10.69	Q12	7.47
Q2	5.81	Q13	7.69
Q3	9.57	Q14	3.78
Q4	11.16	Q15	7.87
Q5	18.49	Q16	5.10
Q6	2.50	Q17	35.29
Q7	11.85	Q18	35.93
Q8	24.84	Q19	7.44
Q9	39.55	Q20	7.08
Q10	10.86	Q21	49.13
Q11	4.62	Q22	5.16

Total Time: 321.87 seconds

Spark Configuration

Setting	Value
Spark Master	`local[*]`
Driver Memory	32G
Driver Cores	8
Executor Memory	32G
Executor Cores	8
Off-Heap Enabled	true
Off-Heap Size	24g
Shuffle Manager	`CometShuffleManager`
Comet Replace SMJ	true

Automated benchmark run by dfbench

andygrove · 2026-01-26T16:08:34Z

replaced with #3289 so that I can work on all related optimizations in a single PR for now

andygrove and others added 2 commits January 20, 2026 12:37

test: add benchmark for struct column processing

bbe2da7

Add a Criterion benchmark to measure the performance of struct column processing in native shuffle. Tests various struct sizes (5, 10, 20 fields) and row counts (1K, 10K rows). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

fix: remove unused import and mut in benchmark

f8fe7ba

andygrove mentioned this pull request Jan 20, 2026

perf: extend field-major processing to nested struct fields #3225

Open

andygrove marked this pull request as ready for review January 20, 2026 21:58

andygrove changed the title ~~perf: optimize struct field processing with field-major order~~ perf: optimize native shuffle struct field processing with field-major order Jan 20, 2026

andygrove modified the milestone: 0.13.0 Jan 22, 2026

andygrove requested review from comphead and wForget January 25, 2026 13:00

andygrove closed this Jan 26, 2026

andygrove deleted the optimize-row-dispatch branch January 26, 2026 16:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: optimize native shuffle struct field processing with field-major order#3224

perf: optimize native shuffle struct field processing with field-major order#3224
andygrove wants to merge 3 commits intoapache:mainfrom
andygrove:optimize-row-dispatch

andygrove commented Jan 20, 2026 •

edited

Loading

Uh oh!

andygrove commented Jan 20, 2026

Uh oh!

codecov-commenter commented Jan 20, 2026 •

edited

Loading

Uh oh!

sqlbenchmark commented Jan 20, 2026

Uh oh!

sqlbenchmark commented Jan 20, 2026

Uh oh!

andygrove commented Jan 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

andygrove commented Jan 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Benchmark Results

Test plan

Uh oh!

andygrove commented Jan 20, 2026

Uh oh!

codecov-commenter commented Jan 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

sqlbenchmark commented Jan 20, 2026

Comet TPC-H Benchmark Results

Query Times

Uh oh!

sqlbenchmark commented Jan 20, 2026

Comet TPC-H Benchmark Results

Query Times

Uh oh!

andygrove commented Jan 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

andygrove commented Jan 20, 2026 •

edited

Loading

codecov-commenter commented Jan 20, 2026 •

edited

Loading