Commit cbc046e
feat: two-generation early emission for partial aggregation
When the partial aggregate's hash table exceeds a configurable size
threshold (default: 4MB), use a two-generation scheme to emit
intermediate state while keeping the hash table cache-friendly.
When the hot hash table fills up:
1. Emit the cold batch (previous generation's state) downstream
2. Promote the current hot table state to the cold batch
3. Reset the hot hash table and continue reading
This gives recurring groups a second chance to be merged locally
before being sent downstream, reducing the number of partial
emissions through the hash repartition while keeping the working
set in CPU cache.
At end-of-input, the remaining hot state and cold batch are
concatenated and emitted together.
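The rotation steps and the end-of-input handling described above can be sketched with a toy model. This is a minimal illustration, not the code from this commit: `TwoGenCounter`, `push`, `finish`, and `run` are hypothetical names, the aggregate is a simple per-key count, and the threshold is a number of groups rather than a byte size. The message does not say how recurring groups are merged with the cold batch, so the sketch models only the emit/promote/reset rotation.

```rust
use std::collections::HashMap;

/// Toy partial COUNT aggregator (hypothetical; not DataFusion's types).
struct TwoGenCounter {
    /// Current generation: groups still being updated.
    hot: HashMap<u64, u64>,
    /// Previous generation: held back until the next rotation.
    cold: Vec<(u64, u64)>,
    /// Assumption: threshold expressed as a group count, not bytes.
    max_groups: usize,
}

impl TwoGenCounter {
    fn new(max_groups: usize) -> Self {
        Self { hot: HashMap::new(), cold: Vec::new(), max_groups }
    }

    /// Count one key. Returns a batch to send downstream when a rotation
    /// releases a non-empty cold batch.
    fn push(&mut self, key: u64) -> Option<Vec<(u64, u64)>> {
        *self.hot.entry(key).or_insert(0) += 1;
        if self.hot.len() < self.max_groups {
            return None;
        }
        // Steps 2 and 3: drain the hot table (this resets it but keeps its
        // allocation) and promote its contents to the cold batch.
        let mut promoted: Vec<(u64, u64)> = self.hot.drain().collect();
        promoted.sort_unstable(); // deterministic order for the example only
        // Step 1: the previous cold batch is what goes downstream.
        let emitted = std::mem::replace(&mut self.cold, promoted);
        if emitted.is_empty() { None } else { Some(emitted) }
    }

    /// End of input: concatenate the cold batch and the remaining hot state.
    fn finish(mut self) -> Vec<(u64, u64)> {
        let mut out = std::mem::take(&mut self.cold);
        let mut rest: Vec<(u64, u64)> = self.hot.drain().collect();
        rest.sort_unstable();
        out.extend(rest);
        out
    }
}

/// Drive the aggregator over `keys` and collect every emitted batch,
/// with the final end-of-input batch last.
fn run(keys: &[u64], max_groups: usize) -> Vec<Vec<(u64, u64)>> {
    let mut agg = TwoGenCounter::new(max_groups);
    let mut batches = Vec::new();
    for &k in keys {
        if let Some(batch) = agg.push(k) {
            batches.push(batch);
        }
    }
    batches.push(agg.finish());
    batches
}
```

With `max_groups = 2` and keys `[1, 1, 2, 3, 3, 4, 5]`, the first rotation emits nothing because the cold batch is still empty. The second rotation emits the first generation, and `finish` returns the second generation followed by the leftover hot entry.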
New config: datafusion.execution.partial_aggregation_max_table_size
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 0143dfe commit cbc046e
3 files changed
Lines changed: 81 additions & 2 deletions
File tree
- datafusion
  - common/src
  - physical-plan/src/aggregates
  - sqllogictest/test_files