[C++][IPC] ReadSparseCSFIndex does not validate CSF index buffer counts against ndim #50161

@metsw24-max

Describe the bug, including details regarding any error messages, version, and platform.

ReadSparseCSFIndex (cpp/src/arrow/ipc/reader.cc) sizes its indptr_data and indices_data vectors from the tensor shape (ndim - 1 and ndim, respectively) but fills them by looping over the flatbuffer-supplied indptrBuffers()->size() and indicesBuffers()->size(). Those are independent fields and are never checked against ndim. A crafted IPC SparseTensor CSF message with more buffer entries than dimensions therefore writes shared_ptr<Buffer> elements past the end of those vectors (heap out-of-bounds write). ndim == 0 is also accepted, in which case the ndim - 1 size wraps around and the code builds a vector of size SIZE_MAX.
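The missing check can be sketched as follows. This is a minimal standalone illustration, not Arrow's code: the helper name CheckCSFIndexBufferCounts and its error strings are hypothetical, and plain std::size_t counts stand in for the values that would come from the tensor shape and the flatbuffer fields.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helper (not part of Arrow): returns an error message if the
// buffer counts read from the message do not match what ndim implies,
// or std::nullopt if they are consistent.
std::optional<std::string> CheckCSFIndexBufferCounts(
    std::size_t ndim, std::size_t num_indptr_buffers,
    std::size_t num_indices_buffers) {
  // Reject ndim == 0 first so that ndim - 1 below cannot wrap around.
  if (ndim == 0) {
    return std::string("CSF index requires ndim >= 1");
  }
  // A CSF index is expected to carry ndim - 1 indptr buffers...
  if (num_indptr_buffers != ndim - 1) {
    return std::string("indptr buffer count does not match ndim - 1");
  }
  // ...and ndim indices buffers.
  if (num_indices_buffers != ndim) {
    return std::string("indices buffer count does not match ndim");
  }
  return std::nullopt;
}
```

Running a check of this kind before the fill loops would keep every write inside the vectors sized from ndim; in Arrow itself the natural form would presumably be a Status-returning check rather than an optional string.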

The payload path is guarded by CheckSparseTensorBodyBufferCount, but the file/stream path is not, so this is reachable from the public ReadSparseTensor(io::InputStream*) API on untrusted bytes.

GetSparseCSFIndexMetadata (cpp/src/arrow/ipc/metadata_internal.cc) has the same flaw: it loops over axisOrder()->size() while indexing indicesBuffers()->Get(i), without checking that the two lengths match. This is an out-of-bounds read, reachable from both the payload and the file/stream paths.
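For the metadata side, a guard of the following kind would close the gap. Again this is only a sketch under stated assumptions: IndexBufferMeta and CollectIndicesLengths are hypothetical stand-ins, and std::vector replaces the flatbuffer vectors returned by axisOrder() and indicesBuffers().

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for one flatbuffer buffer descriptor.
struct IndexBufferMeta {
  std::int64_t offset;
  std::int64_t length;
};

// Hypothetical helper: collects one buffer length per axis, but only after
// confirming that both inputs have the same number of entries.
std::optional<std::vector<std::int64_t>> CollectIndicesLengths(
    const std::vector<std::int32_t>& axis_order,
    const std::vector<IndexBufferMeta>& indices_buffers) {
  // Without this check, the loop below would index indices_buffers with a
  // bound taken from axis_order and could read past its end.
  if (axis_order.size() != indices_buffers.size()) {
    return std::nullopt;
  }
  std::vector<std::int64_t> lengths;
  lengths.reserve(axis_order.size());
  for (std::size_t i = 0; i < axis_order.size(); ++i) {
    lengths.push_back(indices_buffers[i].length);
  }
  return lengths;
}
```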

Component(s)

C++
