fix: parquet limit pruning for row group selections #22942

Open
haohuaijin wants to merge 3 commits into apache:main from haohuaijin:row-group-limit-selection-fix

@haohuaijin haohuaijin commented Jun 13, 2026

Which issue does this PR close?

Rationale for this change

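Limit pruning handled row groups that already carry a RowSelection incorrectly. It counted the row group's full row count toward the limit instead of the number of selected rows, and it could replace the existing selection with a full scan of the row group.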

What changes are included in this PR?

  • Preserve existing row selections.
  • Count only selected rows when checking the limit.
  • Add regression tests for both cases.
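
For reviewers, here is a minimal sketch of the intended behavior. It is not the code in this PR: `Access`, `Run`, `rows_to_read`, and `prune_by_limit` are hypothetical stand-ins that model a per-row-group access plan with plain enums, and the sketch assumes every row that is read counts toward the limit.

```rust
/// Hypothetical run inside a row selection: read `n` rows or skip `n` rows.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Run {
    Select(usize),
    Skip(usize),
}

/// Hypothetical per-row-group access decision (illustration only).
#[derive(Debug, Clone, PartialEq)]
enum Access {
    Skip,
    Scan,
    Selection(Vec<Run>),
}

/// Rows that will actually be read from a row group under `access`.
/// For a selection, only the selected runs count, not the full row group.
fn rows_to_read(access: &Access, row_group_rows: usize) -> usize {
    match access {
        Access::Skip => 0,
        Access::Scan => row_group_rows,
        Access::Selection(runs) => runs
            .iter()
            .map(|run| match run {
                Run::Select(n) => *n,
                Run::Skip(_) => 0,
            })
            .sum(),
    }
}

/// Keep row groups in order until `limit` rows are covered, then skip the rest.
/// Kept row groups retain their original access, so an existing selection is
/// never widened into a full scan.
fn prune_by_limit(plan: &[Access], row_counts: &[usize], limit: usize) -> Vec<Access> {
    let mut seen = 0;
    let mut pruned = Vec::with_capacity(plan.len());
    for (access, &rows) in plan.iter().zip(row_counts) {
        if seen >= limit {
            pruned.push(Access::Skip);
            continue;
        }
        seen += rows_to_read(access, rows);
        pruned.push(access.clone());
    }
    pruned
}
```

Take three row groups of 100 rows each, where the first selects only 10 rows, and a limit of 50. The second row group must still be scanned. Counting the first row group as 100 rows would skip it and return too few rows.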

Are these changes tested?

Yes. New unit tests cover preserving RowSelection and counting selected rows during limit pruning.

Are there any user-facing changes?

No API changes.

github-actions bot added the datasource (Changes to the datasource crate) label on Jun 13, 2026

Development

Successfully merging this pull request may close these issues.

limit pruning ignores RowSelection
