feat(vortex-geo): geometry Bbox zone-map statistic + distance-filter pruning#8561

Open
HarukiMoriarty wants to merge 3 commits into develop from nemo/geo-bounds-zonemap

Conversation

@HarukiMoriarty

Contributor

Summary

Adds spatial chunk pruning to Vortex. A new GeometryBounds aggregate stores each chunk's minimum bounding rectangle (MBR) as a zone-map statistic, and a stats-rewrite rule uses it to skip chunks that cannot satisfy an ST_Distance(geom, const) <= r filter.
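To make the idea concrete, here is a minimal, self-contained sketch of a per-chunk bounding box and the nearest-point check that decides whether a chunk can be skipped. It is not the PR's code: the `Bbox` type, `from_points`, `merge`, `min_distance`, and `can_skip_chunk` are hypothetical names, and the sketch assumes planar (Euclidean) distance over point coordinates.

```rust
/// Hypothetical axis-aligned bounding box (illustration only, not the
/// PR's `GeometryBounds` type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Bbox {
    pub min_x: f64,
    pub min_y: f64,
    pub max_x: f64,
    pub max_y: f64,
}

impl Bbox {
    /// Fold one batch of coordinates into a box; `None` for an empty batch.
    pub fn from_points(points: &[(f64, f64)]) -> Option<Bbox> {
        let (&(x0, y0), rest) = points.split_first()?;
        let mut b = Bbox { min_x: x0, min_y: y0, max_x: x0, max_y: y0 };
        for &(x, y) in rest {
            b.min_x = b.min_x.min(x);
            b.min_y = b.min_y.min(y);
            b.max_x = b.max_x.max(x);
            b.max_y = b.max_y.max(y);
        }
        Some(b)
    }

    /// Combine the partial boxes of two batches belonging to the same chunk.
    pub fn merge(self, other: Bbox) -> Bbox {
        Bbox {
            min_x: self.min_x.min(other.min_x),
            min_y: self.min_y.min(other.min_y),
            max_x: self.max_x.max(other.max_x),
            max_y: self.max_y.max(other.max_y),
        }
    }

    /// Lower bound on the planar distance from `(px, py)` to anything
    /// inside the box (0.0 when the point lies inside it).
    pub fn min_distance(&self, px: f64, py: f64) -> f64 {
        let dx = (self.min_x - px).max(0.0).max(px - self.max_x);
        let dy = (self.min_y - py).max(0.0).max(py - self.max_y);
        dx.hypot(dy)
    }
}

/// True when no row of the chunk can satisfy `distance(geom, p) <= r`.
/// Assumption: a missing statistic (`None`) is never pruned.
pub fn can_skip_chunk(bbox: Option<Bbox>, px: f64, py: f64, r: f64) -> bool {
    match bbox {
        Some(b) => b.min_distance(px, py) > r,
        None => false,
    }
}
```

Because every geometry in the chunk lies inside the box, the box's nearest-point distance is a lower bound on every row's distance, which is what makes skipping sound in this sketch.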

Limitations

  • Only <= and < comparisons are pruned. > and >= could also be pruned soundly via the symmetric farthest-corner bound, but they are intentionally omitted (rarely used?).
  • Pruning is always sound, but how much it helps depends heavily on the write order of the geometry column. Chunks are only skipped in practice when the data has a spatially clustered layout (e.g. a Hilbert/Z-order sort), so that chunk MBRs are tight and non-overlapping.
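The operator asymmetry can be sketched as follows. This is an illustration under assumptions, not the PR's rewrite rule: `CmpOp`, `Bbox`, `can_prune`, and the `use_far_bound` flag are hypothetical, and distance is taken to be planar Euclidean. With `use_far_bound = false` the function mirrors the behaviour described above (only `<=` and `<` prune); setting it to `true` shows what the omitted farthest-corner bound would add.

```rust
/// Comparison operators that can appear in `distance(geom, p) OP r`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CmpOp { Lt, Le, Gt, Ge, Eq, Ne }

/// Hypothetical chunk bounding box (illustration only).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Bbox {
    pub min_x: f64,
    pub min_y: f64,
    pub max_x: f64,
    pub max_y: f64,
}

impl Bbox {
    /// Distance from `(px, py)` to the nearest point of the box.
    pub fn min_distance(&self, px: f64, py: f64) -> f64 {
        let dx = (self.min_x - px).max(0.0).max(px - self.max_x);
        let dy = (self.min_y - py).max(0.0).max(py - self.max_y);
        dx.hypot(dy)
    }

    /// Distance from `(px, py)` to the farthest corner of the box.
    pub fn max_distance(&self, px: f64, py: f64) -> f64 {
        let dx = (px - self.min_x).abs().max((px - self.max_x).abs());
        let dy = (py - self.min_y).abs().max((py - self.max_y).abs());
        dx.hypot(dy)
    }
}

/// Returns true when the chunk provably contains no matching row.
pub fn can_prune(op: CmpOp, b: &Bbox, px: f64, py: f64, r: f64, use_far_bound: bool) -> bool {
    let near = b.min_distance(px, py);
    let far = b.max_distance(px, py);
    match op {
        // Every row has d >= near, so `d <= r` is impossible once near > r.
        CmpOp::Le => near > r,
        CmpOp::Lt => near >= r,
        // Every row has d <= far, so `d > r` is impossible once far <= r.
        CmpOp::Gt if use_far_bound => far <= r,
        CmpOp::Ge if use_far_bound => far < r,
        // ==, != and (without the far bound) > and >= are never pruned.
        _ => false,
    }
}
```

The second bullet shows up here as well: a loose, overlapping box has a small `min_distance` to most query points, so `near > r` rarely holds and few chunks are skipped.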

Testing

8 new vortex-geo tests cover: Point bbox across batches; Polygon bbox over all ring vertices; empty group → null; registry self-declaration; only <=/< prune while >/>=/==/!= do not (parameterized); distance symmetry; non-distance comparisons are ignored; and an end-to-end falsify.

Signed-off-by: Nemo Yu <zyu379@wisc.edu>
HarukiMoriarty added the changelog/feature (A new feature) label on Jun 23, 2026
HarukiMoriarty requested a review from a team on June 23, 2026 20:59
@codspeed-hq

codspeed-hq Bot commented Jun 23, 2026

Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 2 improved benchmarks
❌ 3 regressed benchmarks
✅ 1584 untouched benchmarks
⏩ 4 skipped benchmarks [1]

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
| --- | --- | --- | --- | --- |
| Simulation | chunked_bool_canonical_into[(1000, 10)] | 16.3 µs | 26.8 µs | -39.12% |
| Simulation | chunked_varbinview_canonical_into[(100, 100)] | 224.2 µs | 259.6 µs | -13.64% |
| Simulation | chunked_varbinview_into_canonical[(100, 100)] | 271.4 µs | 306.7 µs | -11.52% |
| Simulation | chunked_varbinview_into_canonical[(1000, 10)] | 205.6 µs | 169 µs | +21.64% |
| Simulation | bitwise_not_vortex_buffer_mut[128] | 273.6 ns | 244.4 ns | +11.93% |

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing nemo/geo-bounds-zonemap (d789886) with develop (aeae579)

Footnotes

  1. 4 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

HarukiMoriarty and others added 2 commits June 24, 2026 13:04
The zoned writer serializes each zone stat descriptor in order to persist it, but GeometryBounds used the default (non-serializable) implementation, so writing a zone map for a geometry column failed. Add serialize/deserialize (no options) plus a regression test.

Signed-off-by: Nemo Yu <zyu379@wisc.edu>
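
The failure mode described in this commit can be illustrated with a small sketch. Everything here is hypothetical and stands in for the real Vortex interfaces: the `StatDescriptor` trait, the `persist` and `deserialize_geometry_bounds` helpers, and both descriptor structs are invented for illustration. The assumption is that a descriptor with no options can serialize to an empty payload.

```rust
/// Hypothetical descriptor trait (not the Vortex API). The default
/// `serialize` reports "not serializable", as the commit message describes.
pub trait StatDescriptor {
    fn name(&self) -> &'static str;
    fn serialize(&self) -> Option<Vec<u8>> {
        None
    }
}

/// A descriptor that relies on the default and so cannot be persisted.
pub struct DefaultOnly;

impl StatDescriptor for DefaultOnly {
    fn name(&self) -> &'static str {
        "default_only"
    }
}

/// A descriptor with no options: an empty payload is enough to persist it.
#[derive(Debug, PartialEq)]
pub struct GeometryBoundsSketch;

impl StatDescriptor for GeometryBoundsSketch {
    fn name(&self) -> &'static str {
        "geometry_bounds"
    }
    fn serialize(&self) -> Option<Vec<u8>> {
        Some(Vec::new())
    }
}

/// Rebuild the descriptor; any non-empty payload is rejected because
/// the descriptor carries no options.
pub fn deserialize_geometry_bounds(bytes: &[u8]) -> Result<GeometryBoundsSketch, String> {
    if bytes.is_empty() {
        Ok(GeometryBoundsSketch)
    } else {
        Err(format!("unexpected {} option bytes", bytes.len()))
    }
}

/// What a writer might do: fail when a descriptor cannot be serialized.
pub fn persist(desc: &dyn StatDescriptor) -> Result<(String, Vec<u8>), String> {
    match desc.serialize() {
        Some(bytes) => Ok((desc.name().to_string(), bytes)),
        None => Err(format!("stat '{}' is not serializable", desc.name())),
    }
}
```

A regression test in this style would write the descriptor, read it back, and check that the round trip succeeds.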