Skip to content

Added content-defined chunking#209

Open
antkve wants to merge 5 commits into
devfrom
ak-add-cdc
Open

Added content-defined chunking#209
antkve wants to merge 5 commits into
devfrom
ak-add-cdc

Conversation

@antkve

@antkve antkve commented Jun 17, 2026

Copy link
Copy Markdown

Added CDC to allow efficient mutability

  • Replace fixed-size chunking with FastCDC in the provider's S3 and FS upload paths so mid-file edits reuse most chunks via content-addressed dedup.
  • Add GET /content?data_root= for CID-keyed whole-file reassembly (existing /read assumes fixed-size chunks and can't serve CDC roots).

@antkve antkve linked an issue Jun 17, 2026 that may be closed by this pull request
# Conflicts:
#	primitives/src/lib.rs
#	provider-node/Cargo.toml
#	provider-node/src/api.rs
#	provider-node/tests/s3_integration.rs
@socket-security

socket-security Bot commented Jun 17, 2026

Copy link
Copy Markdown

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff Package Supply Chain
Security
Vulnerability Quality Maintenance License
Addedcargo/​fastcdc@​3.2.110010093100100

View full report

Comment thread primitives/Cargo.toml Outdated
Comment thread primitives/src/chunking.rs
Comment thread provider-node/src/api.rs Outdated
Comment thread provider-node/src/api.rs
expected: query.data_root.clone(),
actual: "invalid hex".to_string(),
})?;
let data_root = H256::from_slice(&root_bytes);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

H256::from_slice panics if root_bytes.len() != 32. hex_decode above validates hex characters but not length, so GET /content?data_root=abcd (2 bytes) panics → 500. Add a root_bytes.len() == 32 check returning Error::InvalidHash. See a similar pattern at other places too - maybe a shared parse_h256() helper would fix all of them?

Comment thread provider-node/src/api.rs
Comment on lines 363 to 365
let chunk_size = storage_primitives::DEFAULT_CHUNK_SIZE as u64;
let start_chunk = query.offset / chunk_size;
let end_chunk = (query.offset + query.length).div_ceil(chunk_size);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the most important one. Now that S3/FS uploads are always CDC, /read's offset / DEFAULT_CHUNK_SIZE math is wrong for essentially every new root, and get_chunk_at_index still returns a chunk - so /read silently serves misaligned bytes with a 200 instead of erroring. Please add a guard that makes /read reject non-fixed roots before merge (the variable-size implementation can stay deferred, but silent corruption shouldn't ship).

Comment thread provider-node/src/api.rs
Comment on lines +410 to +411
let chunks = state.storage.collect_chunks(data_root);
if chunks.is_empty() && data_root != H256::zero() {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_content reassembles the entire object into one in-memory Vec. Unlike /read (paginated by offset/length), this is whole-file, so memory scales with max object size × concurrent requests. Fine for now, but worth a tracking note to stream the body (axum::body::Body from a stream) before large objects are in play.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we should stream it. It's fine if we add it here or in a follow-up PR but we need it

Comment thread docs/design/content-defined-chunking.md Outdated

**Unchanged.** The MMR is still leaf-per-chunk and chunk-size-agnostic — each leaf is `blake2_256(chunk_bytes)`. The Merkle proof model is unchanged. The S3/FS GET handlers still reassemble via `collect_chunks(data_root)`, which walks the tree by hash and doesn't assume any chunk size. Content addressing and per-bucket isolation are unchanged.

The `GET /read?data_root=&offset=&length=` byte-range endpoint **still assumes fixed-size** (its arithmetic computes `chunk_index = offset / DEFAULT_CHUNK_SIZE`). It is therefore unsafe to call on CDC-chunked roots. For whole-file historical fetch, use the new `GET /content?data_root=` endpoint, which walks the manifest. Updating `/read` for variable-size byte-range reads is deferred until a caller actually needs it.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this PR makes uploads always-CDC, /read is now wrong for the common case, not a rare one - please reframe from "deferred" to "guard now," and add a line that roots don't record their chunking strategy (so all offset math is globally unsafe until they do).

| Insert/delete dedup | ❌ Cascades downstream | ✅ Only chunks straddling the edit change |
| In-place overwrite dedup | ✅ Works | ✅ Works |
| Append dedup | ✅ Works (only trailing chunk changes) | ✅ Works |
| Best for | Binary blobs with fixed-offset fields; embedded use where every byte counts | Text, structured data, anything with shifting edits |

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth noting files < 64 KiB are always a single chunk and get no edit-dedup - the "best for text/structured data" framing oversells the small-file case.


The provider stores file bytes as content-addressed chunks: each chunk's key is `blake2_256(bytes)`, so two PUTs that produce a byte-identical chunk get stored once. That dedup is automatic — *if* the chunker actually produces identical chunks across edits.

With a fixed-size chunker, it doesn't. Inserting a single byte at the start of a file shifts every downstream byte by one, so every chunk has different bytes, so every chunk has a different hash. v2's PUT effectively re-uploads the entire file.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't understand what v2 meant at first. After reading I realized it was the new blob. It might be better to say "the new blob"


**Changed.** Chunks are now variable-size (between 64 KiB and 1 MiB). The provider's S3 (`PUT /s3/.../object`) and FS (`PUT /fs/.../file`) handlers chunk via `chunk_cdc_borrowed`. The Rust client SDK's `ChunkingStrategy::ContentDefined` arm uses the same chunker (previously a TODO that fell back to fixed-256K).

**Unchanged.** The MMR is still leaf-per-chunk and chunk-size-agnostic — each leaf is `blake2_256(chunk_bytes)`. The Merkle proof model is unchanged. The S3/FS GET handlers still reassemble via `collect_chunks(data_root)`, which walks the tree by hash and doesn't assume any chunk size. Content addressing and per-bucket isolation are unchanged.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MMR is chunk-size-agnostic but you still need to stick to one chunking strategy right? I'd mention that here

| Append dedup | ✅ Works (only trailing chunk changes) | ✅ Works |
| Best for | Binary blobs with fixed-offset fields; embedded use where every byte counts | Text, structured data, anything with shifting edits |

`ChunkingStrategy::Fixed(usize)` remains the SDK default; CDC is opt-in via `ChunkingStrategy::ContentDefined`. The provider's HTTP upload endpoints (S3 / FS) always use CDC since their callers don't pick a strategy.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then shouldn't CDC be the default for the SDK as well?

}

#[test]
fn cdc_size_distribution_within_bounds() {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed-size chunking would also pass this test. Can we add another one testing that the sizes of contiguous chunks are different?

Comment thread primitives/src/lib.rs
use scale_info::TypeInfo;
use sp_core::H256;

#[cfg(feature = "std")]

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need std here? alloc is not std

Comment thread provider-node/Cargo.toml Outdated
[dev-dependencies]
tokio = { workspace = true, features = ["macros", "rt-multi-thread", "test-util", "time"] }
serde_json = { workspace = true }
rand = "0.8"

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Get it from the workspace

Comment thread provider-node/src/api.rs
Comment on lines +410 to +411
let chunks = state.storage.collect_chunks(data_root);
if chunks.is_empty() && data_root != H256::zero() {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we should stream it. It's fine if we add it here or in a follow-up PR but we need it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Content-defined Chunking

4 participants