fix: return error instead of panicking on zero-dimension fixed-size-list columns#7247
fix: return error instead of panicking on zero-dimension fixed-size-list columns#7247DanielMao1 wants to merge 1 commit into
Conversation
143f1e1 to
240a45a
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
|
|
||
|
|
||
| @pytest.mark.parametrize("data_storage_version", ["legacy", "stable", "2.1"]) | ||
| def test_write_zero_dimension_fixed_size_list( |
There was a problem hiding this comment.
Can we at least add one test case about reading existed zero dimension by the old writer logic?
There was a problem hiding this comment.
Added as a Rust test rather than Python: once Schema::validate() rejects zero-dim FSL, no write path can produce such a dataset (and the encoder panics on zero-dim), so the fixture can't be generated by current code.
Instead, test_read_zero_dimension_fsl_errors_instead_of_panicking (in decoder.rs) drives create_structural_field_scheduler / create_legacy_field_scheduler — the read-plan builders the file reader calls against a stored schema — with a zero-dim FSL field, asserting a clean error instead of a panic.
| if let DataType::FixedSizeList(_, dimension) = field.data_type() | ||
| && dimension <= 0 |
There was a problem hiding this comment.
Does it work for the FSL of FSL mode, e.g. FixedSizeList(FixedSizeList(Float32, 0), 4)?
There was a problem hiding this comment.
Good catch — For my previous PR it did not. Because a FixedSizeList of a FixedSizeList over a primitive collapses into a single leaf field, the pre-order field walk never visits the inner list, and the original if let DataType::FixedSizeList(_, dimension) only matched the outermost dimension. So FixedSizeList(FixedSizeList(Float32, 0), 4) slipped through (outer dim 4 passed, inner dim 0 ignored).
I confirmed this with a probe test, then fixed it: the check now lives in a helper validate_fixed_size_list_dimensions() that recurses through nested list types, so an inner zero dimension is rejected too. Added FixedSizeList(FixedSizeList(Float32, 0), 4) (and a positive-dimension nested counterpart) to the schema test.
7827869 to
4d0d4b5
Compare
| /// so letting such a type through panics with a divide-by-zero during | ||
| /// scheduling. Such datasets can exist on disk because old writers did not | ||
| /// validate the dimension (see issue #5102). | ||
| fn validate_fsl_dimension(field: &Field, data_type: &DataType) -> Result<()> { |
There was a problem hiding this comment.
Is it duplicated with validate_fixed_size_list_dimensions?
There was a problem hiding this comment.
Right,it may be deduplicated. Promoted validate_fixed_size_list_dimensions in lance-core to pub and call it from the decoder's scheduler builders, removing the duplicate validate_fsl_dimension. Both write and read paths now share one implementation.
| fn validate_fsl_dimension(field: &Field, data_type: &DataType) -> Result<()> { | ||
| if let DataType::FixedSizeList(inner, dimension) = data_type { | ||
| if *dimension <= 0 { | ||
| return Err(Error::invalid_input(format!( |
There was a problem hiding this comment.
Does Error::schema look better?
There was a problem hiding this comment.
Agreed. The shared lance-core helper uses Error::schema, so after the dedup the read path now returns Error::schema too instead of invalid_input — consistent with the write path.
002c5a6 to
e1041bb
Compare
…ist columns A fixed-size-list column with dimension 0 used to panic with 'attempt to divide by zero' (rust/lance-encoding/src/data.rs) on every write path and when reading datasets persisted by older writers. Reject such columns with a clean schema error, following the maintainer guidance in lance-format#5102 (error, not panic). A shared helper `validate_fixed_size_list_dimensions` (in lance-core) is used on both paths: - Write path: called from Schema::validate(), which runs inside Schema::try_from(&ArrowSchema), so all write entry points are covered. - Read path: called from the decoder field-scheduler builders, so datasets persisted by old writers fail cleanly instead of panicking at scheduling time. A FixedSizeList of a FixedSizeList over a primitive collapses into a single leaf field, so the helper recurses through nested list types to also reject an inner zero dimension, e.g. FixedSizeList(FixedSizeList(Float32, 0), 4). Closes lance-format#5102
e1041bb to
d758cd1
Compare
Closes #5102
Problem
A fixed-size-list column with dimension 0 panics with
attempt to divide by zero(
rust/lance-encoding/src/data.rs,FixedSizeListBlock::num_values). As of pylance 7.0.0the panic fires on write for every storage version (
stable/2.1/2.2), and readingdatasets persisted by older writers (which accepted such columns) panics as well.
Reproduction details are in the issue comment:
#5102 (comment)
Approach
Following the maintainer guidance in #5102 (error, not panic), this adds two small guards at
boundaries that already return
Result, instead of changingDataBlock::num_values()toreturn
Result(the approach that made #5159 balloon across the whole encoding crate):Schema::validate()rejects zero-dimension fixed-size-list fields(including nested ones).
validate()runs insideSchema::try_from(&ArrowSchema),so every write entry point surfaces a clean schema error instead of a panic. Writes
currently panic on every storage version, so no working flow changes behavior.
zero-dimension fixed-size lists with an invalid-input error, so datasets persisted by
old writers fail cleanly at scheduling time instead of crashing the process.
How the guards sit in the data flow
Two facts that shape the design:
Schema::try_from(&ArrowSchema)callsvalidate()internally and every write path performsthis conversion, so guard 1 in one place covers all write entry points.
under the
stable(2.0) storage version; reading those files must not crash the process.Tests
lance-core:Schema::try_fromrejects zero-dim FSL at top level and nested in a struct;positive dimensions still validate.
lance-encoding: the scheduler guard rejects zero-dim FSL, including FSL-nested-in-FSL,and accepts positive dimensions.
legacy/stable/2.1,write_datasetnow raises a cleanOSError(same mapping as other schema validation errors) instead ofPanicException.