POC for size and timeout based flush at SourceShipper #7591

ankitsol · 2026-01-05T17:59:13Z

This is a follow-up POC to #7528.

It is needed for the continuous backup and PITR feature: https://github.com/apache/hbase/pull/7445/files

Here, instead of the ReplicationEndpoint handling the flush and the offset update, ReplicationSourceShipper uses time-based and size-based logic to decide when to flush and update the replication offset.

Please ignore the failing tests.
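
To make the idea concrete, here is a minimal sketch of a size-or-timeout trigger. It is not the code in this PR: the class name `OffsetPersistPolicy`, its methods, and both thresholds are hypothetical and only illustrate the decision the shipper has to make.

```java
// Hypothetical sketch: none of these names come from the PR diff.
final class OffsetPersistPolicy {
  private final long maxPendingBytes;    // assumed size threshold
  private final long maxPendingMillis;   // assumed time threshold
  private long pendingBytes = 0;         // bytes shipped since the last persist
  private long oldestPendingMillis = -1; // -1 means nothing is pending

  OffsetPersistPolicy(long maxPendingBytes, long maxPendingMillis) {
    this.maxPendingBytes = maxPendingBytes;
    this.maxPendingMillis = maxPendingMillis;
  }

  /** Record a batch that was handed to the endpoint at nowMillis. */
  void onBatchShipped(long batchBytes, long nowMillis) {
    if (oldestPendingMillis < 0) {
      oldestPendingMillis = nowMillis; // start the timer at the first pending batch
    }
    pendingBytes += batchBytes;
  }

  /** True once either the size or the time threshold has been reached. */
  boolean shouldPersist(long nowMillis) {
    if (oldestPendingMillis < 0) {
      return false; // nothing shipped since the last persist
    }
    return pendingBytes >= maxPendingBytes
      || nowMillis - oldestPendingMillis >= maxPendingMillis;
  }

  /** Reset the counters after the offset has been persisted. */
  void onPersisted() {
    pendingBytes = 0;
    oldestPendingMillis = -1;
  }
}
```

Checking the policy on every pass of the shipper loop, not only after a batch is shipped, is what lets the timeout fire when no new entries arrive.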

Apache-HBase · 2026-01-05T18:50:01Z

🎊 +1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	1m 23s		Docker mode activated.
			_ Prechecks _
+1 💚	dupname	0m 0s		No case conflicting files found.
+0 🆗	codespell	0m 0s		codespell was not available.
+0 🆗	detsecrets	0m 0s		detect-secrets was not available.
+1 💚	@author	0m 0s		The patch does not contain any @author tags.
+1 💚	hbaseanti	0m 0s		Patch does not have any anti-patterns.
			_ HBASE-28957_rebased Compile Tests _
+0 🆗	mvndep	0m 14s		Maven dependency ordering for branch
+1 💚	mvninstall	2m 49s		HBASE-28957_rebased passed
+1 💚	compile	2m 59s		HBASE-28957_rebased passed
+1 💚	checkstyle	0m 54s		HBASE-28957_rebased passed
+1 💚	spotbugs	1m 37s		HBASE-28957_rebased passed
+1 💚	spotless	0m 40s		branch has no errors when running spotless:check.
			_ Patch Compile Tests _
+0 🆗	mvndep	0m 10s		Maven dependency ordering for patch
+1 💚	mvninstall	2m 16s		the patch passed
+1 💚	compile	2m 58s		the patch passed
-0 ⚠️	javac	0m 24s	/results-compile-javac-hbase-backup.txt	hbase-backup generated 2 new + 141 unchanged - 0 fixed = 143 total (was 141)
+1 💚	blanks	0m 0s		The patch has no blanks issues.
+1 💚	checkstyle	0m 53s		the patch passed
+1 💚	spotbugs	1m 46s		the patch passed
+1 💚	hadoopcheck	8m 37s		Patch does not cause any errors with Hadoop 3.3.6 3.4.1.
+1 💚	spotless	0m 34s		patch has no errors when running spotless:check.
			_ Other Tests _
+1 💚	asflicense	0m 14s		The patch does not generate ASF License warnings.
		33m 38s

Subsystem	Report/Notes
Docker	ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7591/1/artifact/yetus-general-check/output/Dockerfile
GITHUB PR	#7591
Optional Tests	dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless
uname	Linux 901b51fb6aed 6.8.0-1024-aws #26~22.04.1-Ubuntu SMP Wed Feb 19 06:54:57 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/hbase-personality.sh
git revision	HBASE-28957_rebased / `baf7146`
Default Java	Eclipse Adoptium-17.0.11+9
Max. process+thread count	87 (vs. ulimit of 30000)
modules	C: hbase-server hbase-backup U: .
Console output	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7591/1/console
versions	git=2.34.1 maven=3.9.8 spotbugs=4.7.3
Powered by	Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

Apache-HBase · 2026-01-05T23:35:08Z

💔 -1 overall

Vote	Subsystem	Runtime	Logfile	Comment
+0 🆗	reexec	1m 27s		Docker mode activated.
-0 ⚠️	yetus	0m 3s		Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck
			_ Prechecks _
			_ HBASE-28957_rebased Compile Tests _
+0 🆗	mvndep	0m 13s		Maven dependency ordering for branch
+1 💚	mvninstall	2m 46s		HBASE-28957_rebased passed
+1 💚	compile	1m 0s		HBASE-28957_rebased passed
+1 💚	javadoc	0m 33s		HBASE-28957_rebased passed
+1 💚	shadedjars	4m 33s		branch has no errors when building our shaded downstream artifacts.
			_ Patch Compile Tests _
+0 🆗	mvndep	0m 12s		Maven dependency ordering for patch
+1 💚	mvninstall	2m 17s		the patch passed
+1 💚	compile	1m 0s		the patch passed
+1 💚	javac	1m 0s		the patch passed
+1 💚	javadoc	0m 31s		the patch passed
+1 💚	shadedjars	4m 31s		patch has no errors when building our shaded downstream artifacts.
			_ Other Tests _
-1 ❌	unit	238m 21s	/patch-unit-hbase-server.txt	hbase-server in the patch failed.
-1 ❌	unit	57m 13s	/patch-unit-hbase-backup.txt	hbase-backup in the patch failed.
		318m 44s

Subsystem	Report/Notes
Docker	ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7591/1/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile
GITHUB PR	#7591
Optional Tests	javac javadoc unit compile shadedjars
uname	Linux 233ac954ebde 6.8.0-1024-aws #26~22.04.1-Ubuntu SMP Wed Feb 19 06:54:57 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/hbase-personality.sh
git revision	HBASE-28957_rebased / `baf7146`
Default Java	Eclipse Adoptium-17.0.11+9
Test Results	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7591/1/testReport/
Max. process+thread count	5362 (vs. ulimit of 30000)
modules	C: hbase-server hbase-backup U: .
Console output	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7591/1/console
versions	git=2.34.1 maven=3.9.8
Powered by	Apache Yetus 0.15.0 https://yetus.apache.org

This message was automatically generated.

vinayakphegde · 2026-01-06T14:26:50Z

...src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java

+  }
+
+  private void flushStagedWal() {
+    source.getReplicationEndpoint().beforePersistingReplicationOffset();


What if persisting the WAL entries fails at the endpoint? In that case, we should throw an exception and fail here as well, since we cannot move forward until the entries are successfully persisted at the endpoint.
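
A rough sketch of the behaviour I have in mind, using hypothetical stand-in types (`PersistHook` and `OffsetTracker` are not HBase classes, and I am not assuming anything about the real signature of `beforePersistingReplicationOffset()`): the offset only advances if the endpoint-side persist step returns normally.

```java
import java.io.IOException;

// Hypothetical stand-in for the endpoint-side persist step.
interface PersistHook {
  void beforePersistingOffset() throws IOException;
}

// Hypothetical holder for the last persisted replication offset.
final class OffsetTracker {
  private long persistedOffset = 0;

  long persistedOffset() {
    return persistedOffset;
  }

  /**
   * Runs the endpoint hook first. If it throws, the exception propagates
   * to the caller and the persisted offset is left unchanged.
   */
  void persist(PersistHook hook, long newOffset) throws IOException {
    hook.beforePersistingOffset();
    persistedOffset = newOffset; // reached only after a successful persist
  }
}
```

With this shape the caller can retry the same batch, because the offset still points at the last position that was successfully persisted.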

Apache9

I suggest we implement the POC on the current master code base and then apply the continuous backup code on top of it. Otherwise it is a bit difficult to figure out which modifications actually implement this mechanism and which are there because we already support continuous backup...

Apache9 · 2026-01-07T06:47:12Z

...src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceShipper.java

    // Loop until we close down
    while (isActive()) {
+      // check if flush needed for WAL backup, this is need for timeout based flush
+      if (shouldFlushStagedWal()) {


The name is a bit confusing: it still assumes that the WAL entries we send to the ReplicationEndpoint are 'staged'...

Just name it 'shouldPersistLogPosition'. And don't we already have a logPositionAndCleanOldLogs method?

POC for size and timeout based flush at SourceShipper

baf7146

ankitsol mentioned this pull request Jan 6, 2026

POC to avoid usage of ReplicationResult #7528

Open
