Skip to content

fix: remap index data when its fragment bitmap was already coverage-remapped#7286

Merged
jackye1995 merged 1 commit into
lance-format:mainfrom
xuanyu-z:xuanyuzhan/fix-frag-reuse-trim-after-coverage-remap
Jun 17, 2026
Merged

fix: remap index data when its fragment bitmap was already coverage-remapped#7286
jackye1995 merged 1 commit into
lance-format:mainfrom
xuanyu-z:xuanyuzhan/fix-frag-reuse-trim-after-coverage-remap

Conversation

@xuanyu-z

Copy link
Copy Markdown
Contributor

Summary

remap_column_index can silently skip remapping an index's data after a deferred compaction, which leaves the index permanently dependent on the fragment-reuse index and prevents that reuse index from ever being trimmed (it grows without bound). This happens whenever the index's fragment_bitmap has already been coverage-remapped onto the new fragments before the data remap runs — most reliably when a table carries more than one index.

The fix makes remap_index decide whether to remap an index's data from whether that data predates the reuse version, instead of inferring it from the current state of the fragment bitmap (which can be cleaned independently of the data).

Background

With defer_index_remap: true, compaction rewrites fragments but does not rewrite index data inline. Instead it records a fragment-reuse index (FRI) holding, per compaction, the old→new row-address map. Two remaps then happen at different times:

  • Coverage remapload_indices (rust/lance/src/index.rs) applies FragReuseIndex::remap_fragment_bitmap to every index's fragment_bitmap in memory, swapping the compacted-away fragment ids for the new ones, so queries route correctly.
  • Data remapremap_column_index / remap_index (rust/lance/src/dataset/optimize/remapping.rs) actually rewrites the row addresses stored inside the index, and advances the index's dataset_version. Until this runs, the index data still holds the old addresses and relies on FRI auto-remap-at-load for correctness.

cleanup_frag_reuse_index can only trim a reuse version once every index is "caught up" with it (is_index_remap_caught_up), which requires both that the index no longer references the version's old fragments and that index.dataset_version >= reuse_version.dataset_version.

The bug

After a deferred compaction, remapping the indexes one by one leaves some of them un-remapped, and the reuse index never trims.

Reproduced deterministically with two scalar indexes on one table (see the added test). Observed state after compaction + remapping both indexes:

index dataset_version fragment_bitmap
FRI version 4
first index remapped 6 (advanced ✓) [10]
second index 2 (stale) [10]
third index 3 (stale) [10]

The second/third indexes' bitmaps are already clean ([10], no old fragments), but their data was never remapped and their dataset_version never advanced — so is_index_remap_caught_up returns false on the dataset_version < reuse_version.dataset_version check, and the reuse version is retained forever.

Root cause

The deferred-compaction commit itself does not coverage-remap the on-disk bitmaps (with stable row ids disabled, Operation::Rewrite takes the handle_rewrite_indices path with an empty rewritten_indices, leaving bitmaps untouched). The cleaned bitmap is introduced later, by the remap path itself:

  1. remap_index begins with dataset.load_indices(), which returns every index's bitmap coverage-remapped in memory onto the new fragments.
  2. When it commits the first index's remap, the new manifest is built from that coverage-remapped in-memory index list — so the cleaned bitmap is persisted to disk for the other, not-yet-remapped indexes, without their data being remapped or their dataset_version being advanced.
  3. remap_index for those remaining indexes computes should_remap from old_frag_in_index > 0. The old fragments are already gone from the persisted bitmap, so should_remap is false → the data remap is skipped → the index keeps its stale addresses and stale dataset_version → the reuse index can never be trimmed.

In short: the fragment bitmap is an unreliable signal for "has this index's data been remapped?", because coverage remap and data remap are decoupled and the cleaned bitmap can be persisted before the data remap happens.

Why not "fix" the trim check instead?

The dataset_version guard in is_index_remap_caught_up is correct and protective: an index whose bitmap is coverage-remapped but whose data still holds old addresses genuinely still needs the reuse index for auto-remap-at-load. Advancing dataset_version (or trimming) on the strength of a coverage remap alone would discard the reuse index while the data is still stale → incorrect query results. So the data remap must actually run; the fix belongs in remap_index.

The fix

rust/lance/src/dataset/optimize/remapping.rs, in remap_index: when a rewrite group's old fragments are already gone from the bitmap but its new fragments are present, and the index data still predates the reuse version (index.dataset_version < version.dataset_version), set should_remap = true anyway. The bitmap is already correct in that case, so only the data is remapped; dataset_version then advances and the reuse index can trim.

The existing "old fragments present" path is left byte-for-byte unchanged, so the normal (non-pre-cleaned) case — including chained single-index bitmap remapping — behaves exactly as before.

let data_predates_version = curr_index_meta.dataset_version < version.dataset_version;
// ... existing per-group loop ...
if old_frag_in_index > 0 {
    // unchanged: rewrite the bitmap and remap the data
} else if data_predates_version
    && group.new_frags.iter().any(|f| index_frag_bitmap.contains(f.id as u32))
{
    // bitmap already coverage-remapped + persisted before the data remap:
    // remap the data anyway (bitmap is already correct)
    should_remap = true;
}

Testing

  • New regression test test_cleanup_frag_reuse_index_multiple_indices (rust/lance/src/dataset/index/frag_reuse.rs): two scalar indexes on one table, deferred compaction, remap both, then assert each index is caught up and the reuse index trims to zero versions. Fails on main (index j_idx was not caught up after remap), passes with this change.
  • Existing suites green with the fix: frag_reuse (incl. the single-index test_cleanup_frag_reuse_index), remap (incl. test_remap_index_after_compaction, which exercises chained single-index bitmap remapping), and compaction.
  • cargo fmt + cargo clippy clean.

Scope / risk

  • ~20 lines, one new else if branch; no public API change.
  • The change only ever enables a data remap that should have happened; it cannot skip one the old code performed (the new branch is additive to the existing should_remap conditions).

@github-actions github-actions Bot added the bug Something isn't working label Jun 16, 2026
…emapped

`remap_column_index` decided whether to remap an index's data from the
presence of the old fragments in its fragment bitmap. But `load_indices`
coverage-remaps the bitmap onto the new fragments in memory, and a later
commit can persist that cleaned bitmap to disk before the index data is
remapped -- for example while remapping a sibling index on the same table.

Once the bitmap is clean, the data remap was silently skipped: the index
kept serving stale row addresses (corrected only by reuse-index auto-remap
at load), and its dataset_version never advanced, so the fragment reuse
index could never be trimmed and grew without bound.

Detect the already-coverage-remapped case (the group's old fragments are
gone from the bitmap but its new fragments are present, and the index data
still predates the reuse version) and remap the data in that case too.

Adds a regression test exercising two indexes on one table.
@xuanyu-z xuanyu-z marked this pull request as ready for review June 16, 2026 22:58
@xuanyu-z xuanyu-z force-pushed the xuanyuzhan/fix-frag-reuse-trim-after-coverage-remap branch from 9083b6b to 0b6d437 Compare June 16, 2026 22:58
@codecov

codecov Bot commented Jun 16, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@jackye1995 jackye1995 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch!

@jackye1995 jackye1995 merged commit 747fa0b into lance-format:main Jun 17, 2026
31 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants