Comet throws RuntimeException instead of SparkException for invalid row index column type #3886

@andygrove

Description

The Spark SQL test `ParquetRowIndexSuite: invalid row index column type - vectorized reader` expects a `SparkException` with condition `FAILED_READ_FILE` when the row index temporary column is declared with the wrong type (e.g., `StringType` instead of `LongType`).

Spark's vectorized reader wraps the `RuntimeException` from `findRowIndexColumnIndexInSchema` in a `SparkException` via `QueryExecutionErrors.cannotReadFilesError()`. Comet's `NativeBatchReader` lets the `RuntimeException` propagate unwrapped.
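The difference can be illustrated with a minimal, self-contained sketch. It does not use Spark or Comet classes: `ReadFileException` is a hypothetical stand-in for `SparkException` with an error condition, `findRowIndexColumn` only mimics the behavior described for `findRowIndexColumnIndexInSchema`, and `initReader`, `Column`, and the `ROW_INDEX_COL` value are placeholders chosen for illustration. The idea is that the reader catches the `RuntimeException` and rethrows it with the `FAILED_READ_FILE` condition, keeping the original as the cause.

```java
import java.util.Arrays;
import java.util.List;

public class RowIndexWrapSketch {

    // Hypothetical stand-in for SparkException with an error condition;
    // the real class lives in Spark and is not reproduced here.
    static class ReadFileException extends RuntimeException {
        final String condition;

        ReadFileException(String condition, String path, Throwable cause) {
            super("[" + condition + "] Error while reading file " + path, cause);
            this.condition = condition;
        }
    }

    // Simplified schema field: a column name and a type name.
    static class Column {
        final String name;
        final String type;

        Column(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    // Placeholder name for the row index temporary column (assumption).
    static final String ROW_INDEX_COL = "_tmp_metadata_row_index";

    // Mimics the lookup described above: returns the column's position,
    // or -1 if absent, and throws a plain RuntimeException on a wrong type.
    static int findRowIndexColumn(List<Column> schema) {
        for (int i = 0; i < schema.size(); i++) {
            Column c = schema.get(i);
            if (c.name.equals(ROW_INDEX_COL)) {
                if (!c.type.equals("long")) {
                    throw new RuntimeException(ROW_INDEX_COL + " must be of type long");
                }
                return i;
            }
        }
        return -1;
    }

    // Reader-side wrapper: converts the RuntimeException into the
    // file-read error, keeping the original exception as the cause.
    static int initReader(String path, List<Column> schema) {
        try {
            return findRowIndexColumn(schema);
        } catch (ReadFileException e) {
            throw e; // already wrapped, do not wrap twice
        } catch (RuntimeException e) {
            throw new ReadFileException("FAILED_READ_FILE", path, e);
        }
    }

    public static void main(String[] args) {
        List<Column> bad = Arrays.asList(
            new Column("id", "int"),
            new Column(ROW_INDEX_COL, "string"));
        try {
            initReader("part-0000.parquet", bad);
        } catch (ReadFileException e) {
            System.out.println(e.condition + " caused by "
                + e.getCause().getClass().getSimpleName());
        }
    }
}
```

In Comet the equivalent change would presumably go where `NativeBatchReader` resolves the row index column, using Spark's own `QueryExecutionErrors.cannotReadFilesError()` rather than a custom exception type. The exact call site is not confirmed here.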

Steps to reproduce

Run the Spark SQL test suite `ParquetRowIndexSuite` with Comet enabled against Spark 4.0.1.

Expected behavior

A `SparkException` whose condition starts with `FAILED_READ_FILE` should be thrown, matching Spark's native behavior.

Actual behavior

A `RuntimeException` is thrown directly, without being wrapped in a `SparkException`.
