[fix](be) Restore mutable ownership in COW paths#63001
[fix](be) Restore mutable ownership in COW paths#63001zclllyybb wants to merge 1 commit intoapache:masterfrom
Conversation
### What problem does this PR solve?
Issue Number: N/A
Related PR: N/A
Problem Summary: assume_mutable now asserts exclusive ownership, so paths that moved columns out of blocks or borrowed shared subcolumns must return ownership explicitly or mutate through owning COW handles. This restores COW-safe mutation in block reuse, variant extraction, parquet/orc conversion, memtable aggregation, compaction readers, result buffering, and affected BE unit tests.
### Release note
None
### Check List (For Author)
- Test: Unit Test
- ./run-be-ut.sh --run -j 100
- ./run-be-ut.sh --run --filter=LocalExchangerTest.*
- ./run-be-ut.sh --run --filter=ArrowResultBlockBufferTest.*:BitUtil.CountZero:Parameters/TestRowIdConversion.*
- ./run-be-ut.sh --run --filter=VariantColumnWriterReaderTest.*:HierarchicalDataIteratorTest.*
- ./run-be-ut.sh --run --filter=VariantColumnWriterReaderTest.test_nested_iter_nullable:VariantColumnWriterReaderTest.test_streaming_compaction_writer_streams_regular_array_paths_across_batches:TabletCooldownTest.normal
- PATH=/mnt/disk6/common/ldb_toolchain_toucan/bin:/mnt/disk3/zhaochangle/.opencode/bin:/mnt/disk3/zhaochangle/.local/bin:/mnt/disk6/common/apache-maven-3.9.14/bin:/mnt/disk6/common/ldb_toolchain_028/bin:/mnt/disk6/common/jdk-17.0.16/bin:/mnt/disk6/common/node-v24.14.1-linux-x64/bin:/mnt/disk3/zhaochangle/.bun/bin:/mnt/disk3/zhaochangle/.opencode/bin:/mnt/disk3/zhaochangle/.local/bin:/mnt/disk6/common/apache-maven-3.9.14/bin:/mnt/disk6/common/ldb_toolchain_028/bin:/mnt/disk6/common/jdk-17.0.16/bin:/mnt/disk6/common/node-v24.14.1-linux-x64/bin:/mnt/disk6/common/apache-maven-3.9.14/bin:/mnt/disk6/common/ldb_toolchain_028/bin:/mnt/disk6/common/jdk-17.0.16/bin:/mnt/disk6/common/node-v24.14.1-linux-x64/bin:/mnt/disk3/zhaochangle/.codex/tmp/arg0/codex-arg0w0dN9c:/mnt/disk6/common/node-v24.14.1-linux-x64/lib/node_modules/@openai/codex/node_modules/@openai/codex-linux-x64/vendor/x86_64-unknown-linux-musl/path:/mnt/disk3/zhaochangle/.bun/bin:/mnt/disk3/zhaochangle/.opencode/bin:/mnt/disk3/zhaochangle/.local/bin:/mnt/disk6/common/apache-maven-3.9.14/bin:/mnt/disk6/common/ldb_toolchain_028/bin:/mnt/disk6/common/jdk-17.0.16/bin:/mnt/disk6/common/node-v24.14.1-linux-x64/bin:/mnt/disk6/common/apache-maven-3.9.14/bin:/mnt/disk6/common/ldb_toolchain_028/bin:/mnt/disk6/common/jdk-17.0.16/bin:/mnt/disk6/common/node-v24.14.1-linux-x64/bin:/mnt/disk3/zhaochangle/.opencode/bin:/mnt/disk3/zhaochangle/.local/bin:/mnt/disk3/zhaochangle/.local/bin:/mnt/disk3/zhaochangle/bin:/mnt/disk6/common/apache-maven-3.9.14/bin:/mnt/disk6/common/ldb_toolchain_028/bin:/mnt/disk6/common/jdk-17.0.16/bin:/mnt/disk6/common/node-v24.14.1-linux-x64/bin:/usr/share/Modules/bin:/usr/lib64/ccache:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin ./build-support/clang-format.sh
- git diff --cached --check
- PATH=/mnt/disk6/common/ldb_toolchain_toucan/bin:/mnt/disk3/zhaochangle/.opencode/bin:/mnt/disk3/zhaochangle/.local/bin:/mnt/disk6/common/apache-maven-3.9.14/bin:/mnt/disk6/common/ldb_toolchain_028/bin:/mnt/disk6/common/jdk-17.0.16/bin:/mnt/disk6/common/node-v24.14.1-linux-x64/bin:/mnt/disk3/zhaochangle/.bun/bin:/mnt/disk3/zhaochangle/.opencode/bin:/mnt/disk3/zhaochangle/.local/bin:/mnt/disk6/common/apache-maven-3.9.14/bin:/mnt/disk6/common/ldb_toolchain_028/bin:/mnt/disk6/common/jdk-17.0.16/bin:/mnt/disk6/common/node-v24.14.1-linux-x64/bin:/mnt/disk6/common/apache-maven-3.9.14/bin:/mnt/disk6/common/ldb_toolchain_028/bin:/mnt/disk6/common/jdk-17.0.16/bin:/mnt/disk6/common/node-v24.14.1-linux-x64/bin:/mnt/disk3/zhaochangle/.codex/tmp/arg0/codex-arg0w0dN9c:/mnt/disk6/common/node-v24.14.1-linux-x64/lib/node_modules/@openai/codex/node_modules/@openai/codex-linux-x64/vendor/x86_64-unknown-linux-musl/path:/mnt/disk3/zhaochangle/.bun/bin:/mnt/disk3/zhaochangle/.opencode/bin:/mnt/disk3/zhaochangle/.local/bin:/mnt/disk6/common/apache-maven-3.9.14/bin:/mnt/disk6/common/ldb_toolchain_028/bin:/mnt/disk6/common/jdk-17.0.16/bin:/mnt/disk6/common/node-v24.14.1-linux-x64/bin:/mnt/disk6/common/apache-maven-3.9.14/bin:/mnt/disk6/common/ldb_toolchain_028/bin:/mnt/disk6/common/jdk-17.0.16/bin:/mnt/disk6/common/node-v24.14.1-linux-x64/bin:/mnt/disk3/zhaochangle/.opencode/bin:/mnt/disk3/zhaochangle/.local/bin:/mnt/disk3/zhaochangle/.local/bin:/mnt/disk3/zhaochangle/bin:/mnt/disk6/common/apache-maven-3.9.14/bin:/mnt/disk6/common/ldb_toolchain_028/bin:/mnt/disk6/common/jdk-17.0.16/bin:/mnt/disk6/common/node-v24.14.1-linux-x64/bin:/usr/share/Modules/bin:/usr/lib64/ccache:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin ./build-support/check-format.sh (attempted; blocked by pre-existing formatting drift in be/src/exec/operator/distinct_streaming_aggregation_operator.cpp)
- PATH=/mnt/disk6/common/ldb_toolchain_toucan/bin:/mnt/disk3/zhaochangle/.opencode/bin:/mnt/disk3/zhaochangle/.local/bin:/mnt/disk6/common/apache-maven-3.9.14/bin:/mnt/disk6/common/ldb_toolchain_028/bin:/mnt/disk6/common/jdk-17.0.16/bin:/mnt/disk6/common/node-v24.14.1-linux-x64/bin:/mnt/disk3/zhaochangle/.bun/bin:/mnt/disk3/zhaochangle/.opencode/bin:/mnt/disk3/zhaochangle/.local/bin:/mnt/disk6/common/apache-maven-3.9.14/bin:/mnt/disk6/common/ldb_toolchain_028/bin:/mnt/disk6/common/jdk-17.0.16/bin:/mnt/disk6/common/node-v24.14.1-linux-x64/bin:/mnt/disk6/common/apache-maven-3.9.14/bin:/mnt/disk6/common/ldb_toolchain_028/bin:/mnt/disk6/common/jdk-17.0.16/bin:/mnt/disk6/common/node-v24.14.1-linux-x64/bin:/mnt/disk3/zhaochangle/.codex/tmp/arg0/codex-arg0w0dN9c:/mnt/disk6/common/node-v24.14.1-linux-x64/lib/node_modules/@openai/codex/node_modules/@openai/codex-linux-x64/vendor/x86_64-unknown-linux-musl/path:/mnt/disk3/zhaochangle/.bun/bin:/mnt/disk3/zhaochangle/.opencode/bin:/mnt/disk3/zhaochangle/.local/bin:/mnt/disk6/common/apache-maven-3.9.14/bin:/mnt/disk6/common/ldb_toolchain_028/bin:/mnt/disk6/common/jdk-17.0.16/bin:/mnt/disk6/common/node-v24.14.1-linux-x64/bin:/mnt/disk6/common/apache-maven-3.9.14/bin:/mnt/disk6/common/ldb_toolchain_028/bin:/mnt/disk6/common/jdk-17.0.16/bin:/mnt/disk6/common/node-v24.14.1-linux-x64/bin:/mnt/disk3/zhaochangle/.opencode/bin:/mnt/disk3/zhaochangle/.local/bin:/mnt/disk3/zhaochangle/.local/bin:/mnt/disk3/zhaochangle/bin:/mnt/disk6/common/apache-maven-3.9.14/bin:/mnt/disk6/common/ldb_toolchain_028/bin:/mnt/disk6/common/jdk-17.0.16/bin:/mnt/disk6/common/node-v24.14.1-linux-x64/bin:/usr/share/Modules/bin:/usr/lib64/ccache:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin ./build-support/run-clang-tidy.sh --build-dir be/ut_build_ASAN (attempted; blocked by existing clang-tidy analyzer errors from be/src/util/jni-util.h static_assert(false) and pre-existing diagnostics)
- Behavior changed: No
- Does this need documentation: No
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
|
/review |
There was a problem hiding this comment.
I found a blocking COW correctness issue in complex-type deserialization. The PR changes these paths from mutating the existing child through assume_mutable() to mutating a possibly cloned child, but the cloned child is not written back to the parent complex column. That means deserialization can update offsets while dropping child data when nested children are shared, which is exactly the ownership case this PR is trying to make safe.
Critical checkpoint conclusions:
- Goal/test: The goal is to restore COW-safe mutation after assume_mutable() enforces exclusive ownership. The broad BE unit-test list helps, but the complex-type deserialize path still has an uncovered shared-child case.
- Scope/focus: The change is focused on COW ownership, but several deserialize updates need the same write-back pattern used elsewhere after mutating detached children.
- Concurrency/lifecycle/config/compatibility: No new concurrency, lifecycle, configuration, or storage-format compatibility concern identified in the reviewed changes.
- Parallel paths: Array, Map, and Struct deserialize have the same detached-child pattern and all need to be fixed consistently.
- Tests: Please add or adjust coverage for deserializing complex columns whose child ColumnPtr is shared so this does not regress.
- Observability/persistence/write correctness/performance: No additional observability, persistence, transaction, or performance blocker found beyond the data-correctness issue above.
User focus: No additional user-provided review focus was specified.
| // children | ||
| auto nested_column = data_column->get_data_ptr()->assume_mutable(); | ||
| auto nested_column = std::move(*data_column->get_data_ptr()).mutate(); | ||
| buf = get_nested_type()->deserialize(buf, &nested_column, be_exec_version); |
There was a problem hiding this comment.
This now mutates a local nested_column, but if get_data_ptr() is shared, mutate() clones the child and deserialization fills only that clone. Since the clone is never assigned back to the ColumnArray, the array keeps the old child while offsets have already been resized/copied, producing an inconsistent deserialized array. Please store nested_column back into data_column->get_data_ptr() after get_nested_type()->deserialize(...).
| auto nested_keys_column = map_column->get_keys_ptr()->assume_mutable(); | ||
| auto nested_values_column = map_column->get_values_ptr()->assume_mutable(); | ||
| auto nested_keys_column = std::move(*map_column->get_keys_ptr()).mutate(); | ||
| auto nested_values_column = std::move(*map_column->get_values_ptr()).mutate(); |
There was a problem hiding this comment.
Same COW ownership issue here: mutate() may return cloned key/value children when the map subcolumns are shared, but the deserialized nested_keys_column and nested_values_column are dropped instead of being written back to map_column. That leaves updated offsets with stale keys/values. Please assign both mutated children back to get_keys_ptr() and get_values_ptr() after deserializing them.
| for (size_t i = 0; i < elems.size(); ++i) { | ||
| auto child_column = struct_column->get_column_ptr(i)->assume_mutable(); | ||
| auto child_column = std::move(*struct_column->get_column_ptr(i)).mutate(); | ||
| buf = elems[i]->deserialize(buf, &child_column, be_exec_version); |
There was a problem hiding this comment.
If a struct child is shared, this mutate() clones it; elems[i]->deserialize() then writes into the clone, but the clone is discarded at the end of the loop iteration. The parent ColumnStruct continues to reference the old child. Please write child_column back to struct_column->get_column_ptr(i) after deserialization.
What problem does this PR solve?
Issue Number: N/A
Related PR: N/A
Problem Summary: assume_mutable now asserts exclusive ownership, so paths that moved columns out of blocks or borrowed shared subcolumns must return ownership explicitly or mutate through owning COW handles. This restores COW-safe mutation in block reuse, variant extraction, parquet/orc conversion, memtable aggregation, compaction readers, result buffering, and affected BE unit tests.