Fix --preallocate --sparse to actually produce sparse files #916

Merged
tridge merged 1 commit into
RsyncProject:master from
tridge:prealloc-sparse-fix
May 25, 2026

tridge commented May 24, 2026

rsync.1 says combining --preallocate with --sparse yields sparse blocks
wherever the filesystem can punch holes, but since 2019 (commit c2da380,
"keep file-size 0 when possible") it has silently left the file fully
allocated. Two problems, both rooted in that commit switching --preallocate /
--inplace to fallocate(FALLOC_FL_KEEP_SIZE):

  * do_fallocate() then returned 0 instead of the reserved length, so the
    receiver's preallocated_len was 0 and write_sparse() always lseek'd over
    null runs instead of punching them (and the over-preallocation trim in
    receiver.c never fired either).

  * more fundamentally, KEEP_SIZE leaves the file size at 0 while data is
    written incrementally, so the FALLOC_FL_PUNCH_HOLE call lands on blocks
    beyond EOF and is a silent no-op -- the reserved blocks are never freed.
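
The first problem can be modelled in a few lines. This is only a simplified sketch of the decision described above, not the code in write_sparse(); the function name `null_run_action` and its string results are hypothetical and exist purely for illustration.

```python
def null_run_action(offset: int, preallocated_len: int) -> str:
    """Model how a run of zero bytes starting at `offset` would be handled.

    Hypothetical helper: it mirrors the behaviour described in the commit
    message, not the actual rsync source.
    """
    # Inside the preallocated region the blocks are already reserved, so
    # merely seeking past them would leave them allocated; they have to be
    # punched out to become holes.
    if offset < preallocated_len:
        return "punch"
    # Beyond the preallocated region nothing is reserved yet, so seeking
    # over the zeros is enough to leave a hole.
    return "seek"
```

With `preallocated_len` stuck at 0, every offset takes the "seek" branch, which matches the symptom reported here: null runs are skipped rather than punched, and the reserved blocks stay allocated.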

Fix both: don't request KEEP_SIZE when --sparse is also active, so the file is
preallocated at full size and the punch lands within it; and return the
reserved length from do_fallocate() so preallocated_len drives the punch
decision and the over-allocation trim. --preallocate without --sparse keeps
the KEEP_SIZE (file-size-0) behaviour. t_stub.c gains a sparse_files stub since
do_fallocate now references it and the test helpers link syscall.o.
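
As a rough illustration of the fix, the mode selection and return value can be sketched as below. The helper names `fallocate_mode` and `fallocate_result` are hypothetical, and returning the requested length is an assumption standing in for however do_fallocate() actually computes the reserved length; only the FALLOC_FL_KEEP_SIZE value comes from <linux/falloc.h>.

```python
FALLOC_FL_KEEP_SIZE = 0x01  # value defined in <linux/falloc.h>


def fallocate_mode(sparse_files: bool) -> int:
    """Pick the fallocate() mode flags for a preallocating transfer.

    Hypothetical model of the fix: drop KEEP_SIZE when --sparse is active
    so the file is extended to full size and later punches land inside it.
    """
    return 0 if sparse_files else FALLOC_FL_KEEP_SIZE


def fallocate_result(syscall_ret: int, length: int) -> int:
    """Model the return value: reserved length on success, error otherwise.

    Assumption: the requested `length` is used as the reserved length.
    """
    if syscall_ret < 0:
        return syscall_ret
    return length
```

The point of the second helper is that a successful call reports a non-zero length, so the receiver's preallocated_len is no longer 0.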

preallocate_test.py now asserts via st_blocks (where the filesystem can punch
holes) that --preallocate --sparse ends up sparse, guarding the regression.
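
A check of this kind might look roughly like the sketch below. It is not the assertion in preallocate_test.py; the helper name `looks_sparse` is hypothetical, and the 512-byte unit for st_blocks is an assumption that holds on Linux but is not guaranteed everywhere.

```python
ST_BLOCKS_UNIT = 512  # assumed unit of st_blocks (true on Linux)


def looks_sparse(st_size: int, st_blocks: int) -> bool:
    """Return True if fewer bytes are allocated than the file's size.

    Hypothetical helper: a file with holes occupies fewer blocks than its
    apparent size would require.
    """
    return st_blocks * ST_BLOCKS_UNIT < st_size
```

In a test it could be fed from `os.stat(path)`, for example `looks_sparse(st.st_size, st.st_blocks)`, and skipped on filesystems that cannot punch holes.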

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
tridge force-pushed the prealloc-sparse-fix branch from 1472bcb to 6aad80f May 25, 2026 03:51
tridge merged commit 4f5a585 into RsyncProject:master May 25, 2026
12 checks passed