support both filter + limit in ScanBuilder
#7562
Replies: 4 comments 5 replies
-
Some more context: if I enable pushdown_filter for the Parquet format, the query above is 5x slower than with pushdown_filter disabled, see apache/arrow-rs#9765
-
This is definitely a missing piece. Let me see if I can dig something up; it has been at the back of my mind for a long time now. Do you have a sense of how many more rows Vortex returns to the upstream DataFusion ExecutionNode? It should also just stop polling once it has enough rows.
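To illustrate what "stop polling once it has enough rows" means, here is a toy model in plain Rust. It is only a sketch and does not use the DataFusion or Vortex APIs; `pull_until_limit` and `counting_source` are hypothetical names, and a lazy iterator stands in for the stream of record batches.

```rust
use std::cell::Cell;

/// Pull batches from `source` until at least `limit` rows are collected,
/// then stop polling. Returns the rows, truncated to `limit`.
/// (Hypothetical helper, not a DataFusion or Vortex function.)
fn pull_until_limit<I>(mut source: I, limit: usize) -> Vec<u32>
where
    I: Iterator<Item = Vec<u32>>,
{
    let mut rows = Vec::new();
    // Only ask the source for another batch while we still need rows.
    while rows.len() < limit {
        match source.next() {
            Some(batch) => rows.extend(batch),
            None => break, // source exhausted before the limit was reached
        }
    }
    rows.truncate(limit);
    rows
}

/// A lazy source of `n_batches` batches with `batch_size` rows each.
/// `produced` counts how many batches were actually materialized.
fn counting_source<'a>(
    n_batches: u32,
    batch_size: u32,
    produced: &'a Cell<u32>,
) -> impl Iterator<Item = Vec<u32>> + 'a {
    (0..n_batches).map(move |b| {
        produced.set(produced.get() + 1);
        (b * batch_size..(b + 1) * batch_size).collect()
    })
}
```

With 100 batches of 1000 rows and a limit of 10, the consumer pulls a single batch and leaves the other 99 untouched. The consumer still pays for the whole first batch, and any work the producer does ahead of the poll is outside what this model can save.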
-
Vortex still potentially plans a lot of work in that case, which might be the source of the higher
-
Also, just to make sure I understand the performance relationship, is it:
-
I recently ran a benchmark of Parquet vs. Vortex and found that in the case below Parquet is 2x faster than Vortex:
The file is already ordered by time desc. The SQL is like below.
DataFusion generates a plan like below.
But Vortex currently does not support filter + limit together, which results in Vortex reading more records than Parquet.
vortex/vortex-layout/src/scan/scan_builder.rs
Lines 243 to 245 in dcd7097
vortex/vortex-datafusion/src/persistent/opener.rs
Lines 355 to 359 in dcd7097
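To make the gap concrete, here is a minimal sketch in plain Rust of why applying the limit inside the scan, together with the filter, reduces the amount of data read. It is a toy model under stated assumptions, not the Vortex ScanBuilder or DataFusion API: `Chunk`, `sorted_chunks`, `scan_filter_only`, and `scan_filter_with_limit` are hypothetical names, and a "chunk" stands in for whatever unit of work the scan reads.

```rust
/// A toy chunk of rows; stands in for one unit of scan work (hypothetical).
type Chunk = Vec<i64>;

/// Build `n` chunks of `size` rows with values in descending order,
/// mimicking a file that is already sorted by time desc.
fn sorted_chunks(n: i64, size: i64) -> Vec<Chunk> {
    (0..n)
        .map(|c| {
            let start = n * size - c * size;
            (0..size).map(|i| start - i).collect()
        })
        .collect()
}

/// Filter-only scan: every chunk is read and filtered, and the limit is
/// applied afterwards. Returns (rows, chunks_read).
fn scan_filter_only(
    chunks: &[Chunk],
    pred: impl Fn(i64) -> bool,
    limit: usize,
) -> (Vec<i64>, usize) {
    let mut out = Vec::new();
    let mut chunks_read = 0;
    for chunk in chunks {
        chunks_read += 1;
        out.extend(chunk.iter().copied().filter(|&v| pred(v)));
    }
    out.truncate(limit);
    (out, chunks_read)
}

/// Filter + limit scan: the limit counts rows that pass the filter, and
/// the scan stops as soon as it is satisfied. Returns (rows, chunks_read).
fn scan_filter_with_limit(
    chunks: &[Chunk],
    pred: impl Fn(i64) -> bool,
    limit: usize,
) -> (Vec<i64>, usize) {
    let mut out = Vec::new();
    let mut chunks_read = 0;
    for chunk in chunks {
        if out.len() >= limit {
            break; // enough matching rows: skip all remaining chunks
        }
        chunks_read += 1;
        for &v in chunk {
            if pred(v) {
                out.push(v);
                if out.len() == limit {
                    break;
                }
            }
        }
    }
    (out, chunks_read)
}
```

For example, with `sorted_chunks(100, 1000)`, a predicate that keeps even values, and a limit of 10, both functions return the same 10 rows, but the filter-only version reads all 100 chunks while the filter + limit version reads 1. The important detail is that the limit must count rows after the filter, which is why it cannot simply be applied to the unfiltered row range.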