feat(xds): implement xDS subscription worker #2478

Open

YutaoMa wants to merge 23 commits into hyperium:master from YutaoMa:yutaoma/xds-client-worker

Conversation

YutaoMa (Collaborator) commented Jan 12, 2026

Motivation

Ref: #2444

With the #2475 transport and codec changes merged, the remaining work required to get the xDS workflow running end to end is to wire them together with XdsClient through a worker loop. This PR implements that.

Solution

  1. Implement AdsWorker, a transport-, runtime-, and codegen-agnostic event loop for managing xDS subscriptions and the ADS stream.
  • The worker conceptually manages a pair of mpsc channels: the sender is used by XdsClient to send subscription requests, and the receiver is used by TransportStream to send DiscoveryRequest messages to xDS servers.
  • When the underlying ADS stream closes, the worker retries with exponential backoff, configurable via ClientConfig (see the sketch after this list).
  2. Implement ResourceWatcher and wire it into XdsClient so users can now subscribe to xDS resources.
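For illustration, a hedged sketch of how the backoff configuration might look. The `with_backoff` method and the `ExponentialBackoff` fields are hypothetical names for this example, not the PR's exact API; only `ClientConfig::new` appears in the quoted diffs below.

```rust
use std::time::Duration;

// Hypothetical sketch: configuring retry backoff on the client.
let config = ClientConfig::new(node, "https://xds.example.com:443")
    .with_backoff(ExponentialBackoff {
        initial: Duration::from_millis(500), // first retry delay
        max: Duration::from_secs(30),        // delay cap
        multiplier: 2.0,                     // growth factor per attempt
    });
```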

Some design choice highlights:

  1. Created DecodedResource, a type-erased representation of an xDS resource that carries its decoding function in a closure (see the sketch after this list). AdsWorker sends and receives this type on channels so it can stay transport- and codec-generic.
  2. The ADS stream connection waits for the first subscription from the user. This is because tonic's gRPC stream ::connect() awaits the response headers. Depending on the xDS server implementation, it may not respond with headers until the first subscription, creating a deadlock if we await stream creation before sending any requests. (Btw, grpc-go works around this by running send/recv in different goroutines; here I kept both in the same worker loop to reduce shared-state complexity.)
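A minimal sketch of the type-erasure idea in item 1; the exact field set and the String error type are assumptions for illustration, not the PR's definitions.

```rust
use bytes::Bytes;
use std::any::Any;

// Decodes the raw bytes of a google.protobuf.Any payload into a concrete
// resource type, returned erased behind `dyn Any`. The error type is
// simplified to String for this sketch.
type DecoderFn =
    Box<dyn Fn(&Bytes) -> Result<Box<dyn Any + Send>, String> + Send + Sync>;

// Type-erased resource handle: AdsWorker moves these over channels
// without knowing the concrete resource type or codec.
struct DecodedResource {
    name: String,
    type_url: String,
    raw: Bytes,
    decoder: DecoderFn,
}
```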

Testing

Created a basic.rs example to showcase the user experience. I've used it to test against a local xDS management server and successfully subscribed to multiple Listener resources.

=== xds-client Example ===

Connecting to xDS server: https://[redacted-private-server]
Connected!

Enter listener names to watch (one per line, Ctrl+C to exit):
(Use empty string for wildcard subscription)

[redacted-listener-1]
→ Watching for Listener: '[redacted-listener-1]'
✓ Listener received:
  name:        [redacted-listener-1]
  rds_config:  [redacted-route-1]

[redacted-listener-2]
→ Watching for Listener: '[redacted-listener-2]'
✓ Listener received:
  name:        [redacted-listener-2]
  rds_config:  [redacted-route-2]

✓ Listener received:
  name:        [redacted-listener-1]
  rds_config:  [redacted-route-1]

Next Steps

The current implementation completes the basic functionality end to end, but has these improvement opportunities:

  1. xds-client currently does not cache received resources, so new watchers of an already-subscribed resource must wait for the next response from the xDS server. With resource caching, new watchers would get an immediate response.
  2. If we bring in a proper xDS server implementation, we can run integration tests in CI.
  3. Observability: we'll bring in tracing, logging, and metrics support.
  4. Delta xDS: the current version implements only the State of the World variant of xDS, but in some use cases the delta variant is more efficient and scalable.

Additionally, these features are xDS-related but also gRPC-specific, so we plan to implement them in a separate tonic-xds crate built on this xds-client crate:

  1. Cascading subscriptions to RDS, CDS, and EDS resources from a top-level Listener, for gRPC routing and load balancing.
  2. xDS server connection bootstrapping, including TLS configuration, connection pooling, multi-server fallback, etc.

@YutaoMa YutaoMa marked this pull request as ready for review January 12, 2026 18:43
@arjan-bal arjan-bal self-requested a review January 16, 2026 05:45
arjan-bal (Collaborator) left a comment

Left some initial comments, still going through the PR.

YutaoMa (Collaborator, Author) commented Jan 22, 2026

Addressed all previous comments in the latest commit. @arjan-bal @dfawley

@@ -67,3 +69,65 @@ pub trait Resource: Sized + Send + Sync + 'static {
/// The resource name combined with the type URL uniquely identifies a resource.
fn name(&self) -> &str;
Collaborator
There is another piece of configuration required to indicate whether a resource missing in the ADS response should be interpreted as a removal. gRPC Go stores this as a boolean.

Since RDS resource names are derived from LDS resources and EDS resource names from CDS resources, a missing RDS or EDS resource in the ADS response is not considered a removal. Consequently, the watcher will not receive a ResourceError.

There's some more information in the envoy docs: https://www.envoyproxy.io/docs/envoy/latest/api-docs/xds_protocol#knowing-when-a-requested-resource-does-not-exist
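For illustration, a sketch of how such a per-type flag could sit on the Resource trait. The constant name matches the one added later in this PR, but its doc text and placement here are assumptions.

```rust
pub trait Resource: Sized + Send + Sync + 'static {
    /// Whether a resource of this type that is missing from a
    /// State-of-the-World ADS response should be treated as deleted.
    /// True for LDS/CDS, where the server always sends the full state;
    /// false for RDS/EDS, whose names are derived from LDS/CDS
    /// resources, so absence is not a removal and watchers receive no
    /// ResourceError.
    const ALL_RESOURCES_REQUIRED_IN_SOTW: bool;

    /// The resource name combined with the type URL uniquely identifies
    /// a resource.
    fn name(&self) -> &str;
}
```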

Collaborator Author

Noted, adding that right now

YutaoMa (Collaborator, Author) commented Jan 22, 2026

Note for self and reviewers: based on gRFC A88 (https://github.com/grpc/proposal/blob/master/A88-xds-data-error-handling.md), ResourceEvent needs to be changed to include ResourceError as a variant of ResourceUpdated, and we'll need to support the server feature regarding timeout and deletion. I plan to implement these as follow-ups as long as they don't block A27 usage.
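A hedged sketch of the shape gRFC A88 implies; the names below are assumptions based on this comment, not the final API.

```rust
// Per gRFC A88, a data error is delivered through the same update event
// rather than as a separate terminal error. ResourceError stands in for
// this crate's error type.
enum ResourceUpdate<T> {
    /// A valid resource was received and validated.
    Valid(T),
    /// The resource failed validation or another data error occurred;
    /// the watcher still observes this as an update.
    Error(ResourceError),
}
```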

Comment on lines +76 to +78
    /// Wildcard subscription - receive all resources of this type.
    /// In xDS protocol, this is represented by an empty resource_names list.
    Wildcard,
Collaborator

Out of curiosity, what is the specific use case for requesting all resources? We haven't encountered a need for this in our other gRPC implementations yet.

Collaborator Author

What I know of is:

Envoy will always use wildcard subscriptions for Listener and Cluster resources.

For gRPC I haven't seen anyone do it.

Collaborator

I see. gRPC likely won't need this since it only subscribes to specific listener resources, whereas Envoy must intercept traffic for all services. If there are no known users of this feature, it might make sense to skip supporting it to keep the implementation simpler.

@arjan-bal arjan-bal removed their assignment Feb 2, 2026
@YutaoMa YutaoMa requested a review from arjan-bal February 3, 2026 00:50
YutaoMa (Collaborator, Author) commented Feb 3, 2026

@arjan-bal re-requested your review. I've implemented the following changes in recent commits according to your comments:

1. `Resource` API change
	1. Added `ALL_RESOURCES_REQUIRED_IN_SOTW` flag
	2. Split up `decode` into `deserialize` and `validate`, specifically allowing name extraction even on failed validation
2. Added resource cache
3. Use references in `DiscoveryRequest`
4. Send resource validation error to specific watcher
5. `Config` API change
	1. Added `non_exhaustive`
	2. Added servers list config and `TransportFactory`
6. Style changes
	1. enum for Wildcard case in WatcherEntry
	2. Return `Result<(), Error>` for `run_connected`

arjan-bal (Collaborator) left a comment

LGTM! I’ve left one primary comment regarding the cancellation of background timers; otherwise, I just have a few nits.

We should aim to improve coverage with end-to-end tests in the future. I expect that will happen once the client is integrated into the Tower layers for Tonic compatibility.

Thanks again; this is a significant contribution!

/// let config = ClientConfig::new(node, "https://xds.example.com:443")
///     .with_resource_initial_timeout(None);
/// ```
pub fn with_resource_initial_timeout(mut self, timeout: Option<Duration>) -> Self {
Collaborator

nit: Should with_resource_initial_timeout avoid taking an Option? If a caller doesn't want to set a timeout, they can just forgo calling the method entirely.

Collaborator Author

I suppose it could still be useful to offer a way to clear the timeout if set.

Comment on lines 952 to 958
self.runtime.spawn(async move {
    runtime.sleep(timeout).await;
    let _ = command_tx.unbounded_send(WorkerCommand::ResourceTimerExpired {
        type_url: type_url_owned,
        name,
    });
});
Collaborator

We should ensure the spawned task is cancelled upon resource receipt so we don't leak any futures.

Collaborator Author

Added a oneshot-channel-based cancellation mechanism.
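A minimal sketch of that mechanism, mirroring the quoted snippet above; `cancel_rx` and the exact select shape are illustrative, not the PR's actual code.

```rust
use futures::channel::oneshot;
use futures::{select, FutureExt};

// The worker keeps the oneshot sender and fires it when the resource
// arrives (dropping it has the same effect), so the timer task exits
// without sending the expiry command and no future is leaked.
self.runtime.spawn(async move {
    select! {
        _ = runtime.sleep(timeout).fuse() => {
            let _ = command_tx.unbounded_send(WorkerCommand::ResourceTimerExpired {
                type_url: type_url_owned,
                name,
            });
        }
        _ = cancel_rx.fuse() => {
            // Cancelled: resource received before the timeout.
        }
    }
});
```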

@arjan-bal arjan-bal assigned YutaoMa and unassigned arjan-bal Feb 5, 2026
LucioFranco (Member) left a comment

Overall looks great. I don't have much feedback on the specific xDS stuff, but we need to switch over to using the tokio crates for the majority of things. The major reason is that they are much better maintained and actively supported, and they have much better correctness.

Additionally, looking at some of the code I wonder if we could wire up tests with https://github.com/tokio-rs/turmoil/ to test the retry behavior etc.

"net",
] }
tonic = { version = "0.14", features = ["tls-ring"] }
clap = { version = "4", features = ["derive"] }
Member

clap is a really heavy dep; we should try to keep it out of the tree and just do the env var parsing manually?
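For example, a sketch of manual parsing for the example binary; the XDS_SERVER_URI variable name is an assumption.

```rust
// Hypothetical: read the server URI from an env var instead of clap,
// falling back to a default for local testing.
let uri = std::env::var("XDS_SERVER_URI")
    .unwrap_or_else(|_| "https://xds.example.com:443".to_string());
```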

#[non_exhaustive]
pub struct ServerConfig {
    /// URI of the management server (e.g., "https://xds.example.com:443").
    pub uri: String,
Member

Even though we have non_exhaustive set, changing the field type (which we may want to do in the future) will be breaking. I suggest that we keep String as an impl detail and expose it as a &str, which allows us to move to, say, an Arc in the future if we want. I would apply this same rule to all public-facing types (if it's not public-facing, it's probably best to make it pub(crate)).
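A sketch of the suggested pattern, using the names from the quoted snippet; the accessor method is hypothetical.

```rust
#[non_exhaustive]
pub struct ServerConfig {
    // Storage type kept private so it can later become e.g. Arc<str>
    // without a breaking API change.
    uri: String,
}

impl ServerConfig {
    /// URI of the management server (e.g., "https://xds.example.com:443").
    pub fn uri(&self) -> &str {
        &self.uri
    }
}
```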

@@ -1,74 +1,160 @@
//! Client interface through which the user can watch and receive updates for xDS resources.

use futures::channel::mpsc;
Member

We need to use tokio here rather than the futures channel

/// This spawns a background task that manages the ADS stream.
/// The task runs until all `XdsClient` handles are dropped.
pub fn build(self) -> XdsClient {
    let (command_tx, command_rx) = mpsc::unbounded();
Member

I think we always want to avoid an unbounded queue? I know feedback loops can get hard here, but I think there is always a way to set an upper bound.
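A sketch of the bounded alternative with tokio, mirroring the quoted `build()` snippet; the buffer constant is an assumption.

```rust
use tokio::sync::mpsc;

// Bounded command channel: send().await applies backpressure when the
// buffer is full, instead of letting the queue grow without bound.
const COMMAND_CHANNEL_BUFFER_SIZE: usize = 128;

let (command_tx, command_rx) =
    mpsc::channel::<WorkerCommand>(COMMAND_CHANNEL_BUFFER_SIZE);
```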

let watcher_id = WatcherId::new();
let (event_tx, event_rx) = mpsc::channel(WATCHER_CHANNEL_BUFFER_SIZE);

let decoder: DecoderFn = Box::new(|bytes| match crate::resource::decode::<T>(bytes) {
Member

What's the advantage of a boxed Fn over an actual trait and trait object?
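For reference, a sketch of the trait-object alternative being asked about; the String error type is simplified, and DecodedResource is as sketched earlier in this thread.

```rust
use bytes::Bytes;

// Named trait instead of a boxed closure: same dynamic dispatch, but
// with a nameable, documentable contract.
trait ResourceDecoder: Send + Sync {
    fn decode(&self, bytes: &Bytes) -> Result<DecodedResource, String>;
}

// The watcher entry would then hold a Box<dyn ResourceDecoder> in place
// of the DecoderFn closure.
```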

use std::sync::Arc;

use futures::channel::{mpsc, oneshot};
use futures::StreamExt;
Member

We need to use tokio_stream here.


use bytes::Bytes;
use futures::channel::{mpsc, oneshot};
use futures::{FutureExt, SinkExt, StreamExt};
Member

@dfawley and I have discussed this, but I would like us to also avoid Sink at all costs right now. It's not a good abstraction for many reasons. We can discuss alternative approaches on Tuesday, or I can break out some time to meet with y'all on it.

/// Returns `Err` if an error occurred and the worker should reconnect.
async fn run_connected<S: TransportStream>(&mut self, mut stream: S) -> Result<()> {
    loop {
        futures::select! {
Member

This needs to use tokio::select! instead of the futures crate's.
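A sketch of the same loop with tokio primitives; the branch bodies are illustrative, assuming command_rx has become a tokio mpsc receiver and Result is this crate's alias.

```rust
use tokio_stream::StreamExt;

async fn run_connected<S: TransportStream>(&mut self, mut stream: S) -> Result<()> {
    loop {
        tokio::select! {
            Some(cmd) = self.command_rx.recv() => {
                // Handle a subscription command from XdsClient.
            }
            Some(response) = stream.next() => {
                // Handle a DiscoveryResponse from the server.
            }
            // Both sources exhausted: the stream closed and all client
            // handles were dropped.
            else => return Ok(()),
        }
    }
}
```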

