Skip to content

RATIS-2393 Add Span Context to RaftRpcRequestProto#1341

Merged
szetszwo merged 13 commits intoapache:masterfrom
taklwu:RATIS-2393
Mar 12, 2026
Merged

RATIS-2393 Add Span Context to RaftRpcRequestProto#1341
szetszwo merged 13 commits intoapache:masterfrom
taklwu:RATIS-2393

Conversation

@taklwu
Copy link
Contributor

@taklwu taklwu commented Feb 4, 2026

What changes were proposed in this pull request?

We're adding a new map to RaftRpcRequestProto that will be used for upcoming Opentelemetry integration.

see the usage of PoC https://github.com/taklwu/ratis/blob/opentelemetry0129/ratis-common/src/main/java/org/apache/ratis/trace/TraceUtils.java#L235-L251

Another official reference for how this map is going to be inject and extract from the http / rpc header

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/RATIS-2393

How was this patch tested?

running mvn clean package -DskipTests

@taklwu
Copy link
Contributor Author

taklwu commented Feb 4, 2026

@szetszwo Hi Nicholas, should I directly open PRs to master branch ?

in addition , the test failure doesn't seem to be related to my changes

[INFO] Running org.apache.ratis.netty.TestLeaderElectionWithNetty
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
Error:  Tests run: 25, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 74.14 s <<< FAILURE! -- in org.apache.ratis.netty.TestLeaderElectionWithNetty
Error:  org.apache.ratis.netty.TestLeaderElectionWithNetty.testChangeListenerToFollower -- Time elapsed: 1.509 s <<< FAILURE!
org.opentest4j.AssertionFailedError: expected: <0> but was: <1>
	at org.apache.ratis.server.impl.LeaderElectionTests.testChangeListenerToFollower(LeaderElectionTests.java:562)
	at java.base/java.lang.reflect.Method.invoke(Method.java:569)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)


Copy link
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@taklwu , thanks for working on this! Please see the comment inlined.

BTW, this change is very small. Let's combine the usage of the new SpanContextProto such as RATIS-2395 or some part of it. Otherwise, it is hard to determine whether this change is good or not.

@OneSizeFitsQuorum
Copy link
Contributor

@taklwu Please add license for the failed ci!

@taklwu
Copy link
Contributor Author

taklwu commented Mar 3, 2026

@OneSizeFitsQuorum I have updated the license and good catch.

Copy link
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@taklwu , thanks for the update!

Please see the comments inlined and also https://issues.apache.org/jira/secure/attachment/13081062/1341_review.patch

taklwu added 7 commits March 5, 2026 08:38
 - First commit of using OpenTelemetry
 - use junit5 and default opentelemetry extension in test case.
 - Add test to mock client level span and inject when request is being sent
 - only include the server extract
1. use traceAsyncMethod
2. add ClientInvocationId
3. and other places
@taklwu
Copy link
Contributor Author

taklwu commented Mar 5, 2026

sorry that I rebased on top of master and messed up the review history, but the last commit is the only change for addressing the recent comments by @szetszwo .

@taklwu

This comment was marked as off-topic.

Copy link
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@taklwu , thanks for the update! Please see the comments inlined.

*/
package org.apache.ratis.server.impl;

import io.opentelemetry.api.trace.Span;
Copy link
Contributor

@szetszwo szetszwo Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry that I may not be clear in my last comment -- we should import io.opentelemetry only in ratis-common. Then, it will be easier to change the tracing dependency to be pluggable in the future.

We often see that a library may introduce incompatible changes (such as Dropwizard Metrics v3 vs v4). Since Ratis itself is also a library, we don't want to force our consumer applications to use a particular dependency (e.g. Ratis supports both Dropwizard Metrics v3 and v4 by making it pluggable). Another reason is that a dependency may have CVEs (e.g. the infamous Log4Shell case). Applications may choose to disable it.

BTW, we should add a conf to enable/disable tracing.

- use lambda instead of implements supplier
- move most the code into ratis-common
Copy link
Contributor Author

@taklwu taklwu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@szetszwo can you please review it again ? thanks.

Comment on lines +89 to +96
<groupId>io.opentelemetry</groupId>
<artifactId>opentelemetry-sdk</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>io.opentelemetry</groupId>
<artifactId>opentelemetry-sdk-testing</artifactId>
<scope>test</scope>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unless I test differently, otherwise, I see these lines are required...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@taklwu , could you take a look https://issues.apache.org/jira/secure/attachment/13081062/1341_review.patch ? It does not add any dependencies to ratis-server.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, I missed that, I moved them all into ratis-common as transitive dependencies.

@szetszwo
Copy link
Contributor

@taklwu , thanks for the update! I will review it in details tomorrow.

Copy link
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@taklwu , thanks for the update! Please see the comments inlined.

Comment on lines +28 to +29
public static final AttributeKey<String> ATTR_CLIENT_INVOCATION_ID =
AttributeKey.stringKey("raft.client.invocation.id");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be "raft.client.id".

import static org.apache.ratis.conf.ConfUtils.getBoolean;
import static org.apache.ratis.conf.ConfUtils.setBoolean;

public interface TraceConfigKeys {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's have this class temporarily. We should combine it with RaftServerConfigKyes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at the beginning it was part of the RaftServerConfigKeys in ratis-server, but I was thinking to have this configuration on the client as well, such I moved it to the ratis-common.

anyhow, as you said let's keep it temporarily.

Comment on lines +28 to +31
public static final AttributeKey<String> ATTR_CLIENT_INVOCATION_ID =
AttributeKey.stringKey("raft.client.invocation.id");
public static final AttributeKey<String> ATTR_MEMBER_ID = AttributeKey.stringKey("raft.member.id");
public static final AttributeKey<String> ATTR_CALL_ID = AttributeKey.stringKey("raft.call.id");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the fields are already in RatisAttributes, let remove the ATTR_ prefix, i.e.

  • RatisAttributes.MEMBER_ID
  • RatisAttributes.CLIENT_ID
  • RatisAttributes.CALL_ID

Copy link
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 the change looks good.

@szetszwo szetszwo merged commit d132b18 into apache:master Mar 12, 2026
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants