Add list_length scalar function #8495
Merging this PR will improve performance by 12.62%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | Simulation | chunked_varbinview_into_canonical[(1000, 10)] | 169 µs | 205.5 µs | -17.78% |
| ⚡ | Simulation | bitwise_not_vortex_buffer_mut[128] | 244.4 ns | 186.1 ns | +31.34% |
| ⚡ | Simulation | bitwise_not_vortex_buffer_mut[1024] | 304.7 ns | 246.4 ns | +23.68% |
| ⚡ | Simulation | bitwise_not_vortex_buffer_mut[2048] | 398.6 ns | 340.3 ns | +17.14% |
| ⚡ | Simulation | chunked_varbinview_canonical_into[(100, 100)] | 259 µs | 224.4 µs | +15.39% |
| ⚡ | Simulation | chunked_varbinview_into_canonical[(100, 100)] | 306.5 µs | 271.1 µs | +13.05% |
| 🆕 | Simulation | list_large | N/A | 10 ms | N/A |
| 🆕 | Simulation | list_medium | N/A | 144.3 µs | N/A |
| 🆕 | Simulation | list_small | N/A | 59 µs | N/A |
| 🆕 | Simulation | listview_large | N/A | 6 ms | N/A |
| 🆕 | Simulation | listview_medium | N/A | 98.3 µs | N/A |
| 🆕 | Simulation | listview_small | N/A | 39.3 µs | N/A |
Comparing mk/list-length (51b1a4c) with develop (f1adef2)
0a2f1f1 to 1ed27e1
list_length scalar function
fn return_dtype(&self, _options: &Self::Options, arg_dtypes: &[DType]) -> VortexResult<DType> {
    match &arg_dtypes[0] {
        DType::List(_, nullable) => Ok(DType::Primitive(PType::U64, *nullable)),
        other => vortex_bail!("list_length() requires List, got {other}"),
May as well support FixedList as well, then implement reduce to collapse it into the constant
Implemented reduce for non-nullable FSL (FixedSizeList); the nullable case is delegated to execute since we can't easily get validity (talked offline)
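A minimal sketch of the idea in this thread, using toy stand-in types rather than the real Vortex `DType` or the actual reduce hook: the result is U64 with the input's nullability, and a non-nullable fixed-size list can collapse to a constant because every row has the same length. `ToyDType`, `toy_return_dtype`, and `toy_reduce_to_constant` are hypothetical names made up for illustration.

```rust
// Toy stand-ins for illustration only; these are NOT the real Vortex types.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Nullability {
    NonNullable,
    Nullable,
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq, Eq)]
enum ToyDType {
    U64(Nullability),
    /// Variable-length list.
    List(Nullability),
    /// Fixed-size list: every valid row holds exactly `size` elements.
    FixedSizeList { size: u32, nullability: Nullability },
    /// A non-list type, used to exercise the error path.
    Utf8(Nullability),
}

/// Result dtype of the length function: U64 carrying the input's nullability.
#[allow(dead_code)]
fn toy_return_dtype(arg: &ToyDType) -> Result<ToyDType, String> {
    match arg {
        ToyDType::List(n) => Ok(ToyDType::U64(*n)),
        ToyDType::FixedSizeList { nullability, .. } => Ok(ToyDType::U64(*nullability)),
        other => Err(format!("list_length() requires a list type, got {other:?}")),
    }
}

/// Plan-time shortcut: a non-nullable fixed-size list has the same length in
/// every row, so the call can collapse to a constant. Returns `None` when the
/// input must be executed instead (for example a nullable list, where the
/// validity of each row is needed).
#[allow(dead_code)]
fn toy_reduce_to_constant(arg: &ToyDType) -> Option<u64> {
    match arg {
        ToyDType::FixedSizeList {
            size,
            nullability: Nullability::NonNullable,
        } => Some(u64::from(*size)),
        _ => None,
    }
}
```

In this toy model, a non-nullable fixed-size list of size 3 reduces to the constant 3, while the nullable variant returns `None` and falls through to execution, mirroring the split described in the reply above.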
struct AnyList;

impl Matcher for AnyList {
    type Match<'a> = ();
You should define an enum AnyListView { List(...), FixedList(...) }, then you can just match on it above in the execute_until
Do we want to execute FixedList? We can just get the size from the dtype
You're not executing the FixedList itself, you're basically saying, run execution one step at a time until it matches one of these encodings.
So there may be some scalar function that happens to return a FixedList, then you will terminate and have access to it
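To make the "step until it matches" idea concrete, here is a toy sketch. It does not use the real Vortex `Matcher` trait or `execute_until`; `ToyArray`, `AnyListMatch`, `try_match`, `step`, and `toy_list_lengths` are all hypothetical names, and the `Lazy` wrapper stands in for any encoding that needs one more execution step before a list encoding appears.

```rust
// Hypothetical toy model; none of these names are the real Vortex API.
#[allow(dead_code)]
enum ToyArray {
    /// List encodings the length kernel knows how to handle directly.
    List { offsets: Vec<u64> },
    FixedList { size: u64, len: usize },
    /// Some other encoding that one execution step peels away.
    Lazy(Box<ToyArray>),
}

/// What the matcher hands back: a view of whichever list encoding matched,
/// instead of the unit type `()`.
#[allow(dead_code)]
enum AnyListMatch<'a> {
    List { offsets: &'a [u64] },
    FixedList { size: u64, len: usize },
}

/// Returns a match only when the array is already one of the list encodings.
#[allow(dead_code)]
fn try_match(array: &ToyArray) -> Option<AnyListMatch<'_>> {
    match array {
        ToyArray::List { offsets } => Some(AnyListMatch::List {
            offsets: offsets.as_slice(),
        }),
        ToyArray::FixedList { size, len } => Some(AnyListMatch::FixedList {
            size: *size,
            len: *len,
        }),
        ToyArray::Lazy(_) => None,
    }
}

/// One execution step: unwrap a single lazy layer, leave list encodings alone.
#[allow(dead_code)]
fn step(array: ToyArray) -> ToyArray {
    match array {
        ToyArray::Lazy(inner) => *inner,
        other => other,
    }
}

/// Run execution one step at a time until the matcher accepts, then compute
/// the per-row lengths from whichever variant matched.
#[allow(dead_code)]
fn toy_list_lengths(mut array: ToyArray) -> Vec<u64> {
    while try_match(&array).is_none() {
        array = step(array);
    }
    match try_match(&array).expect("loop exits only on a match") {
        AnyListMatch::List { offsets } => offsets.windows(2).map(|w| w[1] - w[0]).collect(),
        AnyListMatch::FixedList { size, len } => vec![size; len],
    }
}
```

The point of returning an enum from the matcher, rather than `()`, is that the caller gets the matched encoding back and can branch on it without re-inspecting the array; a lazily produced `FixedList` terminates the loop the same way a `List` does.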
Computes the number of elements in each list from the offsets/sizes only (never reading element values), returning a U64 array; a null list yields a null length. Registered as a built-in scalar function (vortex.list.length) alongside list_contains.

Signed-off-by: Matt Katz <mhkatz97@gmail.com>
Adds a list_length scalar function returning the number of elements in each list of a List array.
- Supports List, ListView, and FixedSizeList arrays.
- Returns a U64 array; a null list yields a null length.
- Registered as a built-in scalar function (vortex.list.length) alongside list_contains, and exposed via the list_length(expr) expression constructor.
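As a rough illustration of computing lengths from offsets or sizes without touching element values, the sketch below works on plain slices. It is not the PR's implementation: the function names are hypothetical, and it assumes a List stores `n + 1` offsets for `n` rows, a ListView stores one size per row, and validity is a simple `bool` per row.

```rust
/// List layout (assumed): row `i` spans `offsets[i]..offsets[i + 1]`, so its
/// length is the difference of adjacent offsets. Invalid rows yield `None`.
#[allow(dead_code)]
fn lengths_from_offsets(offsets: &[u64], validity: &[bool]) -> Vec<Option<u64>> {
    offsets
        .windows(2)
        .zip(validity)
        .map(|(w, &valid)| valid.then(|| w[1] - w[0]))
        .collect()
}

/// ListView layout (assumed): each row carries its own size, so the length is
/// read directly from the sizes. Invalid rows yield `None`.
#[allow(dead_code)]
fn lengths_from_sizes(sizes: &[u64], validity: &[bool]) -> Vec<Option<u64>> {
    sizes
        .iter()
        .zip(validity)
        .map(|(&size, &valid)| valid.then_some(size))
        .collect()
}
```

For example, offsets `[0, 2, 2, 5]` with validity `[true, false, true]` give `[Some(2), None, Some(3)]`: the middle row is null, so its length is null even though its offsets describe an empty range.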