Skip to content

Conversation

@mbissa
Copy link
Contributor

@mbissa mbissa commented Feb 2, 2026

This resolves an issue where below log statement keeps printing repeatedly:
"ERROR: [otel-plugin] ctx passed into client side stats handler metrics event handling has no client attempt data present".
When new stream is created for health/orca producers, stats and tracing is not setup. However, this fact is ignored during RPC and an error logs is printed to denote that stats cannot be handled. We will enable stream to have its own reference to the stats handler and only process per RPC implementation when it is present (like in case of regular data streams).

Internal issue: b/385685802

RELEASE NOTES:

  • stats: only process RPC stats/tracing in health and ORCA producers if a handler is configured, preventing unnecessary error logging

@mbissa mbissa added this to the 1.79 Release milestone Feb 2, 2026
@mbissa mbissa requested a review from arjan-bal February 2, 2026 11:29
@codecov
Copy link

codecov bot commented Feb 2, 2026

Codecov Report

❌ Patch coverage is 60.00000% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.48%. Comparing base (7860144) to head (734cd3c).
⚠️ Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
internal/transport/http2_client.go 53.84% 0 Missing and 6 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8874      +/-   ##
==========================================
- Coverage   83.13%   82.48%   -0.66%     
==========================================
  Files         414      414              
  Lines       32821    32824       +3     
==========================================
- Hits        27287    27075     -212     
- Misses       4098     4104       +6     
- Partials     1436     1645     +209     
Files with missing lines Coverage Δ
internal/transport/client_stream.go 82.22% <ø> (-17.78%) ⬇️
internal/transport/transport.go 89.06% <ø> (-2.09%) ⬇️
stream.go 81.87% <100.00%> (ø)
internal/transport/http2_client.go 72.49% <53.84%> (-20.16%) ⬇️

... and 19 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@mbissa mbissa requested a review from easwars February 2, 2026 11:55
@arjan-bal arjan-bal added the Area: Client Includes Channel/Subchannel/Streams, Connectivity States, RPC Retries, Dial/Call Options and more. label Feb 2, 2026
@mbissa mbissa assigned easwars and unassigned easwars and arjan-bal Feb 2, 2026
@easwars
Copy link
Contributor

easwars commented Feb 3, 2026

Curious, how did you run into this? Is there an issue for this?

@easwars
Copy link
Contributor

easwars commented Feb 3, 2026

When new stream is created for health/orca producers, stats and tracing is not setup

Can you please elaborate on this?

My understanding is that when the opentelemtry.DialOption is specified at channel creation time, the gRPC channel is configured with the oTel stats handler. And this is stored in the cc.statsHandler field which is what is what gets propagated to the RPC attempt struct csAttempt in newAttemptLocked(). So, this information should also be available when creating the health or ORCA stream, right? Or is it the case that statstracing should be disabled on these streams?

I think it would be good to open an issue that documents all the details and the approach taken (and why this is the preferred approach, if there are other approached available at all). Thanks.

}

s, err := as.transport.NewStream(as.ctx, as.callHdr)
s, err := as.transport.NewStream(as.ctx, as.callHdr, nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this how the health and ORCA streams get a nil stats handler? We want to eliminate this duplication between the regular client stream and the non-retry client stream at some point. We tried to prioritize that work for Q1, but it didn't happen. This has been a maintenance burden for a while now, and a source of hard to find bugs.

@easwars easwars assigned mbissa and unassigned easwars Feb 3, 2026
@mbissa
Copy link
Contributor Author

mbissa commented Feb 10, 2026

Just did a rebase with master to fast forward my commit and address the vet-proto failure.

@mbissa mbissa force-pushed the fix-spammy-stats-log-master branch from dde0677 to 734cd3c Compare February 10, 2026 04:30
@arjan-bal
Copy link
Contributor

Just FYI, no action needed. I was considering whether we'd need to pass MetricsRecorderList separately to skip metrics recording. After reviewing the code for references, I found that MetricsRecorderList is only stored in the ClientConn struct and used for subchannel metrics. Stream metrics, on the other hand, are recorded via calls to the stats handlers.

func (h *clientMetricsHandler) HandleRPC(ctx context.Context, rs stats.RPCStats) {
ri := getRPCInfo(ctx)
if ri == nil {
logger.Error("ctx passed into client side stats handler metrics event handling has no client attempt data present")
return
}
h.processRPCEvent(ctx, rs, ri.ai)
}

Consequently, this change should also eliminate the error log from the OTel metrics handler as well. As before, per-call metrics will not be recorded for RPCs initiated by producers.

@mbissa mbissa assigned easwars and unassigned mbissa Feb 10, 2026
@easwars
Copy link
Contributor

easwars commented Feb 10, 2026

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors how RPC stats and tracing are handled to resolve an issue with repetitive error logging for health and ORCA producers. The change introduces a per-stream stats.Handler, moving away from a transport-level handler for RPC events. This allows streams like those for health checks, which don't have stats configured, to bypass stats processing, thus preventing the erroneous log messages. The implementation is clean and effective: the stats.Handler is now part of the ClientStream struct and is passed during stream creation. The NewStream function signature has been updated across the transport layer, and call sites have been correctly modified. For regular RPCs, the existing stats handler is plumbed through, while for internal streams like health checks, nil is passed, correctly disabling stats for them. The changes are logical and well-executed. This is a good improvement to reduce log noise and make stats handling more precise.

Copy link
Contributor

@easwars easwars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to add a test that verifies that the offending log is not seen for health/ORCA streams?

headerChanClosed uint32 // set when headerChan is closed. Used to avoid closing headerChan multiple times.
bytesReceived atomic.Bool // indicates whether any bytes have been received on this stream
unprocessed atomic.Bool // set if the server sends a refused stream or GOAWAY including this stream
statsHandler stats.Handler
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please add a short trailing comment explaining when this can be nil? Thanks.

}

s, err := as.transport.NewStream(as.ctx, as.callHdr)
s, err := as.transport.NewStream(as.ctx, as.callHdr, nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please add a comment here as to why we are setting it to nil here, so that future readers dont have to second guess themselves.


// NewStream creates a Stream for an RPC.
NewStream(ctx context.Context, callHdr *CallHdr) (*ClientStream, error)
NewStream(ctx context.Context, callHdr *CallHdr, handler stats.Handler) (*ClientStream, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that a stats.Handler field is stored in the transport now. Is it required anymore at all given that the stats handler is passed at stream creation time?

@easwars easwars assigned mbissa and unassigned easwars Feb 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area: Client Includes Channel/Subchannel/Streams, Connectivity States, RPC Retries, Dial/Call Options and more. Type: Bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants