@wirybeaver wirybeaver commented Jan 17, 2026

  • Created an asynchronous controller ZK job that deep-backfills historical segments prior to the inducted consumer watermarks. "Deep copy" here means replicating the segments to a standalone remote storage path.
  • Introduced TableReplicator, which fetches segment ZK metadata from the source, registers a controller job, and submits 4 worker tasks (the count can be fine-tuned in the future) that call a SegmentCopier for each segment. Errors trigger an immediate job update; completed segments are batched for ZK updates.
  • Added ZkBasedTableReplicationObserver and TableReplicationProgressStats, which keep progress in memory (remaining count and failed-segment list) and persist job metadata to ZK (including a SEGMENTS_TO_BE_COPIED field) every 100 completions, or immediately on error.
  • Implemented the SegmentCopier interface with RealtimeSegmentCopier (deep-copy mode), which copies a segment between locations on the same PinotFS scheme and uploads the segment URI to the destination controller. If a segment lacks a deep-store download URL, the code logs the problem and records the failure.
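
For readers who want a feel for the flow described above, here is a minimal, self-contained sketch of a fixed set of workers draining a segment queue, with progress flushed in batches or immediately on error. It is not the PR's code: `TableReplicationSketch`, `Copier`, `Progress`, and `replicate` are hypothetical names, the "persist" step only counts writes instead of touching ZooKeeper, and the final flush once nothing remains is an assumption rather than something the description states.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only: every type and method name here is hypothetical,
// not the PR's actual TableReplicator / SegmentCopier API.
final class TableReplicationSketch {

  /** Stand-in for a segment copier; the real interface's signature is not shown in the PR text. */
  interface Copier {
    void copy(String segmentName) throws Exception;
  }

  /** Keeps progress in memory and "persists" every flushEvery successes or immediately on error. */
  static final class Progress {
    private final int flushEvery;
    private final List<String> failed = new ArrayList<>();
    private int remaining;
    private int successesSinceFlush;
    private int persistCount;

    Progress(int total, int flushEvery) {
      this.remaining = total;
      this.flushEvery = flushEvery;
    }

    synchronized void onSuccess() {
      remaining--;
      successesSinceFlush++;
      // Assumption: a final flush also happens once nothing remains.
      if (successesSinceFlush >= flushEvery || remaining == 0) {
        persist();
      }
    }

    synchronized void onFailure(String segmentName) {
      remaining--;
      failed.add(segmentName);
      persist(); // errors are written out right away
    }

    // Placeholder for writing job metadata to ZooKeeper; here it only counts writes.
    private void persist() {
      successesSinceFlush = 0;
      persistCount++;
    }

    synchronized int remaining() { return remaining; }
    synchronized int persistCount() { return persistCount; }
    synchronized List<String> failed() { return new ArrayList<>(failed); }
  }

  /** Starts `workers` tasks that drain a shared queue and report each outcome to Progress. */
  static Progress replicate(List<String> segments, Copier copier, int workers, int flushEvery) {
    Progress progress = new Progress(segments.size(), flushEvery);
    Queue<String> pending = new ConcurrentLinkedQueue<>(segments);
    ExecutorService pool = Executors.newFixedThreadPool(workers);
    for (int i = 0; i < workers; i++) {
      pool.submit(() -> {
        String segment;
        while ((segment = pending.poll()) != null) {
          try {
            copier.copy(segment);
            progress.onSuccess();
          } catch (Exception e) {
            progress.onFailure(segment);
          }
        }
      });
    }
    pool.shutdown();
    try {
      pool.awaitTermination(1, TimeUnit.MINUTES);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for workers", e);
    }
    return progress;
  }

  public static void main(String[] args) {
    List<String> segments = new ArrayList<>();
    for (int i = 0; i < 250; i++) {
      segments.add("seg-" + i);
    }
    // 4 workers and a flush interval of 100 mirror the numbers in the description.
    Progress p = replicate(segments, name -> { }, 4, 100);
    System.out.println("remaining=" + p.remaining() + " failed=" + p.failed());
  }
}
```

A single failing segment does not stop the sketch: it is added to the failed list while the other workers keep draining the queue. Whether the actual job continues or aborts after an error is not stated in the description.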

@wirybeaver wirybeaver changed the title from "Oss/table copy backfill" to "[Real-time Table Replication X clusters][2/n] Create a Controller Job for historical segments backfill" Jan 17, 2026
@wirybeaver wirybeaver force-pushed the oss/tableCopyBackfill branch from f4275aa to 79ff96a January 22, 2026 01:21
@wirybeaver wirybeaver force-pushed the oss/tableCopyBackfill branch from 79ff96a to 91b71db January 22, 2026 05:04

codecov-commenter commented Jan 22, 2026

Codecov Report

❌ Patch coverage is 66.19048% with 71 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.15%. Comparing base (ea2601e) to head (b4b9ec1).
⚠️ Report is 11 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| .../helix/core/replication/RealtimeSegmentCopier.java | 57.57% | 23 Missing and 5 partials ⚠️ |
| ...roller/helix/core/replication/TableReplicator.java | 60.41% | 17 Missing and 2 partials ⚠️ |
| ...ntroller/helix/core/PinotHelixResourceManager.java | 42.30% | 13 Missing and 2 partials ⚠️ |
| ...oller/api/resources/PinotTableRestletResource.java | 0.00% | 4 Missing ⚠️ |
| ...e/replication/ZkBasedTableReplicationObserver.java | 90.62% | 2 Missing and 1 partial ⚠️ |
| ...ller/helix/core/replication/NoOpSegmentCopier.java | 0.00% | 2 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #17521      +/-   ##
============================================
- Coverage     63.22%   63.15%   -0.08%     
+ Complexity     1477     1476       -1     
============================================
  Files          3170     3178       +8     
  Lines        189544   190013     +469     
  Branches      29009    29063      +54     
============================================
+ Hits         119843   119999     +156     
- Misses        60405    60689     +284     
- Partials       9296     9325      +29     

| Flag | Coverage Δ | |
|---|---|---|
| custom-integration1 | 100.00% <ø> (ø) | |
| integration | 100.00% <ø> (ø) | |
| integration1 | 100.00% <ø> (ø) | |
| integration2 | ? | |
| java-11 | 55.50% <0.00%> (-7.66%) | ⬇️ |
| java-21 | 63.12% <66.19%> (-0.08%) | ⬇️ |
| temurin | 63.15% <66.19%> (-0.08%) | ⬇️ |
| unittests | 63.14% <66.19%> (-0.08%) | ⬇️ |
| unittests1 | 55.52% <0.00%> (-0.05%) | ⬇️ |
| unittests2 | 34.03% <66.19%> (-0.01%) | ⬇️ |

@wirybeaver wirybeaver force-pushed the oss/tableCopyBackfill branch 2 times, most recently from 50bb63d to cb4267f January 26, 2026 03:56
@wirybeaver wirybeaver force-pushed the oss/tableCopyBackfill branch from cb4267f to b4b9ec1 January 26, 2026 06:21