xds: reuse GrpcXdsTransport and underlying gRPC channel to the same xDS server by ref-counting #12682

Open
danielzhaotongliu wants to merge 5 commits into grpc:master from danielzhaotongliu:ref-count-grpc-xds-channel

@danielzhaotongliu (Collaborator) commented Mar 10, 2026

This PR reuses the gRPC xDS transport (and its underlying gRPC channel) for all connections to the same xDS server by ref-counting it. gRPC C++ (link) and gRPC Go (link) already implement this. Because establishing a channel and managing the lifecycle of its connection are expensive, sharing one channel per server is expected to reduce the memory footprint of both the xDS management server and xDS-enabled clients.

  • Added a map that stores GrpcXdsTransport instances keyed by Bootstrapper.ServerInfo; each GrpcXdsTransport carries a ref count. The map cannot be keyed by the xDS server address alone, because a client may use different channel credentials for the same xDS server, and those must be treated as distinct transport instances.
  • When GrpcXdsTransportFactory.create() is called, an existing transport in the map is reused and its ref count is incremented; otherwise a new transport is created, stored in the map, and its ref count is incremented.
  • When GrpcXdsTransport.shutdown() is called, the ref count is decremented. Once it reaches zero, the underlying gRPC channel is shut down and the transport is removed from the map.
  • This ref-counting of GrpcXdsTransport is separate from, and orthogonal to, the ref-counting of the xDS client, which is keyed by the xDS server target name to allow for xDS-based fallback per gRFC A71.
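
The create/shutdown flow in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: ServerKey, SketchTransport, and SketchTransportFactory are hypothetical stand-ins for Bootstrapper.ServerInfo, GrpcXdsTransport, and GrpcXdsTransportFactory, and the channel is reduced to a boolean flag.

```java
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for Bootstrapper.ServerInfo. The key covers the
// credentials as well as the address, so the same address with different
// credentials maps to a different transport.
final class ServerKey {
  final String target;
  final String credentialsId;

  ServerKey(String target, String credentialsId) {
    this.target = target;
    this.credentialsId = credentialsId;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof ServerKey)) {
      return false;
    }
    ServerKey other = (ServerKey) o;
    return target.equals(other.target) && credentialsId.equals(other.credentialsId);
  }

  @Override
  public int hashCode() {
    return Objects.hash(target, credentialsId);
  }
}

// Hypothetical stand-in for GrpcXdsTransport; the real class owns a gRPC channel.
final class SketchTransport {
  final ServerKey key;
  // Written only inside compute/computeIfPresent, which serialize updates per key.
  volatile int refCount;
  volatile boolean channelShutDown;

  SketchTransport(ServerKey key) {
    this.key = key;
  }
}

// Hypothetical stand-in for GrpcXdsTransportFactory.
final class SketchTransportFactory {
  private final ConcurrentHashMap<ServerKey, SketchTransport> transports =
      new ConcurrentHashMap<>();

  // Reuse the transport for this key if present, otherwise create one;
  // either way, take one reference.
  SketchTransport create(ServerKey key) {
    return transports.compute(key, (k, existing) -> {
      SketchTransport t = existing != null ? existing : new SketchTransport(k);
      t.refCount++;
      return t;
    });
  }

  // Drop one reference; at zero, shut the channel down and remove the entry.
  void shutdown(SketchTransport transport) {
    transports.computeIfPresent(transport.key, (k, existing) -> {
      if (existing != transport) {
        return existing; // stale handle to a transport that was already removed
      }
      if (--existing.refCount > 0) {
        return existing;
      }
      existing.channelShutDown = true; // a real transport would shut down its channel here
      return null; // returning null removes the mapping
    });
  }

  int size() {
    return transports.size();
  }
}
```

In this sketch, two create() calls with equal keys return the same instance, and the channel flag flips only on the second shutdown(). The sketch does not guard against a caller invoking shutdown() twice on the same handle, which would release a reference it does not own.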

Prod risk level: Low

  • Reusing the underlying gRPC channel to the xDS server would not affect the gRPC xDS (ADS/LRS) streams, which are multiplexed on the shared channel. However, because more streams now share one connection, new xDS (ADS/LRS) streams and RPCs may fail if the channel hits the MAX_CONCURRENT_STREAMS limit.

Tested:

  • Verified end-to-end with an xDS-enabled gRPC Java client communicating with multiple gRPC backend servers behind different targets, using the xDS management server for name resolution and endpoint discovery. gRPC xDS transport creation, ref-counting, reuse, shutdown, and removal from the map when the ref count reaches zero all worked as expected.

Implementation details / context:

  • Used the java.util.concurrent.ConcurrentHashMap methods compute and computeIfPresent, each of which performs its entire invocation atomically, to get a concurrent, thread-safe solution that follows Java best practices.
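
To illustrate why the atomicity of compute and computeIfPresent matters here, the following stand-alone sketch increments a per-key count from several threads. AtomicRefCounts and its hammer helper are hypothetical names for this example only and are not part of the PR.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: per-key reference counts kept in a ConcurrentHashMap.
final class AtomicRefCounts {
  private final ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();

  // The read-modify-write runs atomically for this key, so no increment is lost.
  int acquire(String key) {
    return counts.compute(key, (k, c) -> c == null ? 1 : c + 1);
  }

  // Decrements the count; returning null from the function removes the entry.
  void release(String key) {
    counts.computeIfPresent(key, (k, c) -> c > 1 ? c - 1 : null);
  }

  int count(String key) {
    return counts.getOrDefault(key, 0);
  }

  // Runs `threads` workers that each acquire the same key `perThread` times
  // and returns the final count.
  static int hammer(int threads, int perThread) {
    AtomicRefCounts refs = new AtomicRefCounts();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int t = 0; t < threads; t++) {
      pool.execute(() -> {
        for (int i = 0; i < perThread; i++) {
          refs.acquire("server");
        }
      });
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("workers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return refs.count("server");
  }
}
```

With a separate get followed by put, concurrent increments could overwrite each other; with compute, the final count equals the total number of acquire calls. The ConcurrentHashMap documentation asks that the function passed to these methods be short and not modify other mappings of the same map.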

Alternatives considered:

  • Writing custom synchronization logic with synchronized blocks and locks. After internal discussion, using the existing concurrency libraries was preferred, since they are less error-prone and should offer better performance.
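
For comparison, a hand-rolled version of the same bookkeeping might look like the sketch below. LockedRefCounts is a hypothetical name, and this is only one possible shape of the rejected alternative, not code that was proposed in the PR.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical hand-rolled alternative: a plain HashMap guarded by one lock.
final class LockedRefCounts {
  private final Object lock = new Object();
  private final Map<String, Integer> counts = new HashMap<>();

  // Takes one reference and returns the new count.
  int acquire(String key) {
    synchronized (lock) {
      int next = counts.getOrDefault(key, 0) + 1;
      counts.put(key, next);
      return next;
    }
  }

  // Drops one reference; returns true if this was the last one.
  boolean release(String key) {
    synchronized (lock) {
      Integer current = counts.get(key);
      if (current == null) {
        return false; // nothing to release
      }
      if (current == 1) {
        counts.remove(key);
        return true;
      }
      counts.put(key, current - 1);
      return false;
    }
  }
}
```

Every access to the map has to go through the same lock for this to be correct, and a single lock serializes operations on unrelated keys, which is the kind of cost and pitfall the library-based approach avoids.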

@danielzhaotongliu changed the title from "xds: reuse GrpcXdsTransport and channel to the same xDS server by ref-counting" to "xds: reuse GrpcXdsTransport and underlying gRPC channel to the same xDS server by ref-counting" on Mar 10, 2026
@danielzhaotongliu marked this pull request as ready for review on March 11, 2026, 00:28

@danielzhaotongliu (Collaborator, Author) left a comment

Forgot to push comments yesterday.
Addressed latest round of comments as well.


@ejona86 (Member) left a comment

Thank you!

@ejona86 added the kokoro:run label (tells Kokoro the code is safe and tests can be run) on Mar 12, 2026
@grpc-kokoro removed the kokoro:run label on Mar 12, 2026