Describe the bug, including details regarding any error messages, version, and platform.
ReadSparseCSFIndex (cpp/src/arrow/ipc/reader.cc) sizes its indptr_data/indices_data vectors from the tensor shape (ndim - 1 and ndim) but fills them by looping over the flatbuffer-supplied indptrBuffers()->size()/indicesBuffers()->size(), which are independent fields and never checked against ndim. A crafted IPC SparseTensor CSF message with more buffer entries than dimensions writes shared_ptr<Buffer> elements past the end of those vectors (heap out-of-bounds write). ndim == 0 is also accepted and builds a vector of size SIZE_MAX.
The payload path is guarded by CheckSparseTensorBodyBufferCount, but the file/stream path is not, so this is reachable from the public ReadSparseTensor(io::InputStream*) API on untrusted bytes.
GetSparseCSFIndexMetadata (cpp/src/arrow/ipc/metadata_internal.cc) has the same flaw: it loops over axisOrder()->size() while indexing indicesBuffers()->Get(i) without checking that the two lengths match, giving an out-of-bounds read reachable from both paths.
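To make the missing validation concrete, here is a minimal standalone sketch of the kind of count check that would have to run before either loop. It does not use Arrow types, and `CheckCSFCounts` and its parameter names are hypothetical, chosen only for illustration. It assumes the invariants implied above: ndim - 1 indptr buffers, ndim indices buffers, and an axisOrder whose length matches the indices buffer count.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Arrow. Returns an empty string when the
// counts read from the message are mutually consistent, otherwise a
// description of the first mismatch found.
std::string CheckCSFCounts(std::size_t ndim,
                           std::size_t num_indptr_buffers,
                           std::size_t num_indices_buffers,
                           std::size_t axis_order_len) {
  // Reject ndim == 0 first so that ndim - 1 below cannot wrap around.
  if (ndim == 0) {
    return "CSF index requires at least one dimension";
  }
  if (num_indptr_buffers != ndim - 1) {
    return "expected " + std::to_string(ndim - 1) +
           " indptr buffers, got " + std::to_string(num_indptr_buffers);
  }
  if (num_indices_buffers != ndim) {
    return "expected " + std::to_string(ndim) +
           " indices buffers, got " + std::to_string(num_indices_buffers);
  }
  // The metadata loop indexes indicesBuffers by the axisOrder index, so the
  // two lengths must agree as well.
  if (axis_order_len != num_indices_buffers) {
    return "axisOrder length " + std::to_string(axis_order_len) +
           " does not match indices buffer count " +
           std::to_string(num_indices_buffers);
  }
  return "";
}
```

In the actual reader, the equivalent check would presumably return an error status before the vectors are sized or filled, and it would need to cover the file/stream path as well as the payload path.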
Component(s)
C++