IndexOutOfBoundsException in UDAF #2045

@Flyangz

Describe the bug
When executing Spark UDAFs, an IndexOutOfBoundsException occurs on the Java side during the importRows call. This happens when the native engine passes a sliced Arrow array to Java via FFI. The Rust implementation was passing the sliced array directly, which retains the original buffer and non-zero offsets. The Java consumer, however, expects the Arrow array to be self-contained with 0-based offsets, leading to an out-of-bounds access when it tries to read from the buffer using the raw offsets.
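To make the offset mismatch concrete, here is a minimal sketch of the failure mode and of one possible fix. It uses only the Rust standard library. `VarBinary`, `try_value`, and `compact_slice` are hypothetical names invented for illustration; they are not the arrow-rs or Auron API, and the actual patch may look quite different. The model assumes a variable-length binary layout where value `i` occupies `data[offsets[i]..offsets[i + 1]]`.

```rust
/// Simplified stand-in for an Arrow variable-length binary array.
/// Hypothetical model only, not the arrow-rs `BinaryArray` type.
#[derive(Debug, Clone, PartialEq)]
struct VarBinary {
    offsets: Vec<i32>,
    data: Vec<u8>,
}

impl VarBinary {
    /// Number of values described by the offsets buffer.
    fn len(&self) -> usize {
        self.offsets.len().saturating_sub(1)
    }

    /// Bounds-checked read; returns None where the Java side would throw.
    fn try_value(&self, i: usize) -> Option<&[u8]> {
        let start = *self.offsets.get(i)? as usize;
        let end = *self.offsets.get(i + 1)? as usize;
        self.data.get(start..end)
    }
}

/// Rebuild rows [start, start + len) as a self-contained array:
/// offsets are rebased to begin at 0 and only the referenced bytes are copied.
fn compact_slice(src: &VarBinary, start: usize, len: usize) -> VarBinary {
    let window = &src.offsets[start..=start + len];
    let base = window[0];
    let last = window[len];
    VarBinary {
        offsets: window.iter().map(|o| o - base).collect(),
        data: src.data[base as usize..last as usize].to_vec(),
    }
}

fn main() {
    let full = VarBinary {
        offsets: vec![0, 3, 5, 9],
        data: b"aaabbcccc".to_vec(),
    };

    // Failure mode: the consumer sees the slice's raw offsets [3, 5, 9]
    // but indexes them into a buffer it assumes starts at 0.
    let stale = VarBinary {
        offsets: full.offsets[1..=3].to_vec(),
        data: b"bbcccc".to_vec(),
    };
    assert!(stale.try_value(1).is_none()); // 5..9 is outside a 6-byte buffer

    // Possible fix: compact the slice before handing it across the boundary.
    let compact = compact_slice(&full, 1, 2);
    assert_eq!(compact.offsets, vec![0, 2, 6]);
    assert_eq!(compact.try_value(0), Some(&b"bb"[..]));
    assert_eq!(compact.try_value(1), Some(&b"cccc"[..]));
    println!("{} rows: {:?}", compact.len(), compact);
}
```

Copying the referenced bytes costs an allocation per export, but it keeps the exported array independent of the parent buffer, which matches what the Java consumer is described as expecting here.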

Additional context

Exception in thread "auron native task 331.0 in stage 6.0 (TID 1693)" java.lang.IndexOutOfBoundsException: index: 1068032, length: 120 (expected: range(0, 192600)) 
at auron.org.apache.arrow.memory.ArrowBuf.checkIndex(ArrowBuf.java:702) 
at auron.org.apache.arrow.memory.ArrowBuf.getBytes(ArrowBuf.java:729) 
at auron.org.apache.arrow.vector.VarBinaryVector.get(VarBinaryVector.java:112) 
at org.apache.spark.sql.auron.TypedImperativeEvaluator.$anonfun$importRows$6(SparkUDAFWrapperContext.scala:451) 
