Add redirect extra headers#5631
Draft
ankur22 wants to merge 21 commits intofix/extra-emit-of-request-metricsfrom
Draft
Add redirect extra headers#5631ankur22 wants to merge 21 commits intofix/extra-emit-of-request-metricsfrom
ankur22 wants to merge 21 commits intofix/extra-emit-of-request-metricsfrom
Conversation
This function will be used to parse the incoming headers from extraInfo
Registers our interest to receive extraInfo data async from chromium.
...extra headers from extraInfo asynchronously.
This is the key to being able to track all the extraInfo headers that arrive for each of the redirect requests/responses. The initial request will have async extraInfo headers under the same requestID. Subsequent redirects will also have the same requestID. So extraInfoTracker helps to keep track of each redirect's extraInfo header.
Since it already tracks the requests, this feels like a natural place for the extraInfoTracker to live.
...request and response. This will be matched up with the corresponding request or response. When there's a match the request/response will have its extra headers set.
This covers two cases: 1. A non-redirecting request, so only processing of a request. 2. A redirecting request which will process a request and a response.
This now takes into account whether there are extra headers before calculating the headers size.
...extra headers in request.
...extra headers.
...extra headers which may or may not be present.
fca34be to
7d765fe
Compare
8 tasks
Since the frame URL is updated asynchronously, we cannot rely on a deterministic way of asserting on it. It is being ignored for now.
1951bd5 to
e7fc161
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What?
Introduces an index-based
extraInfoTrackerto correctly pair CDPExtraInfoevents with their correspondingRequest/Responseobjects, replacing the previous approach that incorrectly merged extra headers across redirect chains.It also enables and registers the handlers for the
ExtraInfowhich contain the raw headers.Why?
Raw headers vs provisional headers
CDP provides two sets of headers for every network request/response:
responseReceived/requestWillBeSent): refined by Chrome's internal processing — may omit cookies, rewrite header names, etc.responseReceivedExtraInfo/requestWillBeSentExtraInfo): exactly what was sent/received over the network, including cookies,Host, etc.The raw headers are what
allHeaders()should return.Two independent "channels"
These two sets of events are not synchronised. This isn't an explicit concept in the CDP protocol — there's no "channel" field in the payload. It's an observed behavioural pattern in how Chrome's internals emit events. The CDP docs confirm this:
The main network stack emits
requestWillBeSent,responseReceived,loadingFinished, etc. A separate layer (closer to the wire/transport) emits theExtraInfovariants. Because they come from different parts of Chrome's architecture, they arrive independently. The only thing linking them is the sharedrequestId.Redirect chains make it harder
During a redirect chain (e.g.
/→/r/3→/r/2→/final), Chrome reuses the samerequestIdfor the entire chain. Each hop produces its own pair of events on both "channels":sequenceDiagram participant Browser participant Main as Main events participant Extra as ExtraInfo events Note over Main,Extra: requestId = "ABC123" reused for entire chain Browser->>Extra: requestWillBeSentExtraInfo[0] (request headers for /) Browser->>Extra: responseReceivedExtraInfo[0] (302 response headers from /) Browser->>Main: requestWillBeSent (redirectResponse for /) Note over Main: creates Request[0] + Response[0] Browser->>Extra: requestWillBeSentExtraInfo[1] (request headers for /r/3) Browser->>Extra: responseReceivedExtraInfo[1] (302 response headers from /r/3) Browser->>Main: requestWillBeSent (redirectResponse for /r/3) Note over Main: creates Request[1] + Response[1] Browser->>Extra: requestWillBeSentExtraInfo[2] (request headers for /final) Browser->>Main: responseReceived (200 from /final) Note over Main: creates Request[2] + Response[2] Browser->>Extra: responseReceivedExtraInfo[2] (200 response headers from /final) Browser->>Main: loadingFinishedIndex-based pairing
Within each "channel", events for a given
requestIdarrive in order. So the i-thresponseReceivedExtraInfoalways corresponds to the i-th response for thatrequestId. TheextraInfoTrackeruses this invariant to pair by index position — the same approach Playwright uses. When either side arrives first, it queues up. As soon as both sides are available at the same index, the tracker patches the raw headers onto theResponseobject.Example
k6 script
Go server to test the change
Checklist
make check) and all pass.Checklist: Documentation (only for k6 maintainers and if relevant)
Please do not merge this PR until the following items are filled out.
Related PR(s)/Issue(s)
Closes: #4291