feat(routeFromHar): add interceptAPIRequests option #41294
Make `BrowserContext.routeFromHAR` also intercept requests issued via APIRequestContext (`page.request.*` / `context.request.*`) when the new `interceptAPIRequests: true` option is set. Defaults to `false` so existing behavior is unchanged. Under the hood, `HarBackend.lookup` gains an `apiRequestOnly` flag that filters matches to entries with `_apiRequest: true` (already written by the HAR recorder), so API-side replay never picks up a browser-side recording for the same URL. Fixes microsoft#22869
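To illustrate the `apiRequestOnly` filtering described above, here is a minimal sketch. It assumes a simplified HAR entry shape and a hypothetical `findEntries` helper; the real `HarBackend.lookup` matches on more than URL and method.

```typescript
// Simplified HAR entry shape (assumption); `_apiRequest` is the marker the
// description says the HAR recorder already writes for API-side requests.
interface HarEntry {
  request: { method: string; url: string };
  response: { status: number };
  _apiRequest?: boolean;
}

// Hypothetical helper: select entries for a URL/method, optionally
// restricted to entries recorded from APIRequestContext.
function findEntries(entries: HarEntry[], url: string, method: string, apiRequestOnly: boolean): HarEntry[] {
  return entries.filter(entry =>
    entry.request.url === url &&
    entry.request.method === method &&
    (!apiRequestOnly || entry._apiRequest === true));
}

const recorded: HarEntry[] = [
  // Browser-side recording for the URL.
  { request: { method: 'GET', url: 'https://example.com/data' }, response: { status: 200 } },
  // API-side recording for the same URL.
  { request: { method: 'GET', url: 'https://example.com/data' }, response: { status: 201 }, _apiRequest: true },
];

const withoutFilter = findEntries(recorded, 'https://example.com/data', 'GET', false);
const apiSideOnly = findEntries(recorded, 'https://example.com/data', 'GET', true);
```

With the flag set, only the API-side entry remains a candidate, so a browser-side recording for the same URL cannot be replayed to an API request.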
@dcrousso Could you please take a look?
dcrousso left a comment
thanks so much for taking the time to submit a PR!
this is a really great start, but i think it's missing a few things in order to fully work
please let me know if any of my comments are unclear
    return {
      body: lookupResult.body ?? Buffer.from(''),
      log,
      response: {
        url: urlString,
        status: lookupResult.status ?? 0,
        statusText: '',
        headers: lookupResult.headers ?? [],
      },
    };
i dont think this handles other side-effects of API requests (e.g. setting cookies)
Done — the HAR path now parses set-cookie from the matched response and runs the same addCookies (with the per-cookie retry fallback) as the live path. Covered by should apply set-cookie side-effects from intercepted APIRequestContext requests.
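As a rough illustration of the set-cookie step, the sketch below extracts cookies from the headers of a matched HAR response. The helper names (`parseSetCookie`, `cookiesFromHeaders`) and the cookie shape are hypothetical, and only a few attributes are handled; the actual change reuses the live path's parsing and `addCookies` call, which this does not reproduce.

```typescript
interface HeaderEntry { name: string; value: string }

interface ParsedCookie {
  name: string;
  value: string;
  path?: string;
  domain?: string;
  httpOnly: boolean;
  secure: boolean;
}

// Hypothetical, deliberately small parser for a single set-cookie value.
function parseSetCookie(header: string): ParsedCookie | undefined {
  const parts = header.split(';').map(part => part.trim());
  const pair = parts[0] ?? '';
  const separator = pair.indexOf('=');
  if (separator <= 0)
    return undefined; // no cookie name, nothing to store
  const cookie: ParsedCookie = {
    name: pair.slice(0, separator),
    value: pair.slice(separator + 1),
    httpOnly: false,
    secure: false,
  };
  for (const attribute of parts.slice(1)) {
    const eq = attribute.indexOf('=');
    const key = (eq === -1 ? attribute : attribute.slice(0, eq)).toLowerCase();
    const val = eq === -1 ? '' : attribute.slice(eq + 1);
    if (key === 'path')
      cookie.path = val;
    else if (key === 'domain')
      cookie.domain = val;
    else if (key === 'httponly')
      cookie.httpOnly = true;
    else if (key === 'secure')
      cookie.secure = true;
  }
  return cookie;
}

// Collect cookies from every set-cookie header (header names are case-insensitive).
function cookiesFromHeaders(headers: HeaderEntry[]): ParsedCookie[] {
  const cookies: ParsedCookie[] = [];
  for (const header of headers) {
    if (header.name.toLowerCase() !== 'set-cookie')
      continue;
    const cookie = parseSetCookie(header.value);
    if (cookie)
      cookies.push(cookie);
  }
  return cookies;
}

const harHeaders: HeaderEntry[] = [
  { name: 'Content-Type', value: 'application/json' },
  { name: 'Set-Cookie', value: 'session=abc123; Path=/; HttpOnly' },
];
const parsedCookies = cookiesFromHeaders(harHeaders);
```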
    if (postData)
      setHeader(headers, 'content-length', String(postData.byteLength));
    const { body, log, response } = await this._sendRequestWithRetries(progress, requestUrl, options, postData, params.maxRetries);
    const harResponse = await this._lookupInHar(progress, requestUrl, method, headers, postData);
i think this will prevent the dispatch of APIRequestContext.Events.Request and APIRequestContext.Events.RequestFinished meaning that if a new recording is captured while replaying from the HAR then it wont include any previously captured API requests
Good catch. The short-circuit bypassed _sendRequest, which is the only emitter of Request/RequestFinished. The HAR path now emits both events (mirroring _sendRequest), so capturing a new recording while replaying still includes the API requests. Added a test (should re-record intercepted APIRequestContext requests into a new HAR).
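The event concern can be sketched with a toy model. Everything here is hypothetical (`TinyEmitter`, `ReplayingContext`, the lowercase event names); it only shows why a replay path that short-circuits the network call must emit the same events itself for a concurrent recorder to see the request.

```typescript
type Listener = (url: string) => void;

// Tiny stand-in for an event emitter, to keep the sketch dependency-free.
class TinyEmitter {
  private readonly listeners = new Map<string, Listener[]>();

  on(event: string, listener: Listener): void {
    const list = this.listeners.get(event) ?? [];
    list.push(listener);
    this.listeners.set(event, list);
  }

  emit(event: string, url: string): void {
    for (const listener of this.listeners.get(event) ?? [])
      listener(url);
  }
}

// Hypothetical context that serves recorded bodies before touching the network.
class ReplayingContext extends TinyEmitter {
  private readonly har: Map<string, string>;

  constructor(har: Map<string, string>) {
    super();
    this.har = har;
  }

  fetch(url: string): string {
    const recordedBody = this.har.get(url);
    if (recordedBody !== undefined) {
      // The replay path mirrors the live path's events.
      this.emit('request', url);
      this.emit('requestfinished', url);
      return recordedBody;
    }
    return this.sendOverNetwork(url);
  }

  // Placeholder for the live path; no real network call is made here.
  private sendOverNetwork(url: string): string {
    this.emit('request', url);
    this.emit('requestfinished', url);
    return `live response for ${url}`;
  }
}

const replayContext = new ReplayingContext(new Map([['https://example.com/api', 'recorded body']]));
const observedEvents: string[] = [];
replayContext.on('request', url => { observedEvents.push(`request ${url}`); });
replayContext.on('requestfinished', url => { observedEvents.push(`requestfinished ${url}`); });
const replayedBody = replayContext.fetch('https://example.com/api');
```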
    response: {
      url: urlString,
      status: lookupResult.status ?? 0,
      statusText: '',
It was empty because the first cut only carried status. Now populated from entry.response.statusText (plus securityDetails/serverAddr per your other comment).
    body: lookupResult.body ?? Buffer.from(''),
    log,
    response: {
      url: urlString,
in the case of a redirect, the value of url holds final endpoint URL instead of original requested URL, so i think we need to modify HarBackend.prototype.lookup to also include the url of the entry
Updated HarBackend.lookup to also return the matched entry.request.url, and the response url now uses it, so it reflects the final endpoint after a redirect chain rather than the originally-requested URL.
      notFound: options.notFound,
      baseURL: options.baseURL,
    };
    this._harForAPIRequests.push(registration);
Page.prototype.route/BrowserContext.prototype.route give priority to the newest route (i.e. unshift) instead of the oldest
Changed to .unshift() so the newest registration wins, matching Page.route/BrowserContext.route.
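A small sketch of the ordering difference, using a hypothetical `Registration` shape and prefix matching in place of the real glob/regex matching:

```typescript
interface Registration { id: string; urlPrefix: string }

const registrations: Registration[] = [];

// Newest registration goes to the front, so it is consulted first.
function register(registration: Registration): void {
  registrations.unshift(registration);
}

// The first registration whose prefix matches the URL wins.
function firstMatch(url: string): Registration | undefined {
  return registrations.find(registration => url.startsWith(registration.urlPrefix));
}

register({ id: 'older', urlPrefix: 'https://example.com/api/' });
register({ id: 'newer', urlPrefix: 'https://example.com/api/' });

const winner = firstMatch('https://example.com/api/users');
```

With `push` instead of `unshift`, the same lookup would return the older registration.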
    const harResponse = await this._lookupInHar(progress, requestUrl, method, headers, postData);
    let body: Buffer;
    let log: string[];
    let response: Omit<channels.APIResponse, 'fetchUid'>;
    if (harResponse)
      ({ body, log, response } = harResponse);
    else
      ({ body, log, response } = await this._sendRequestWithRetries(progress, requestUrl, options, postData, params.maxRetries));
i think we can simplify this
const { body, log, response } = (await this._lookupInHar(progress, requestUrl, method, headers, postData)) || (await this._sendRequestWithRetries(progress, requestUrl, options, postData, params.maxRetries));
Applied your suggestion: const { body, log, response } = (await this._lookupInHar(...)) || (await this._sendRequestWithRetries(...));. Both paths return the same SendRequestResult shape.
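The fallback pattern works because the lookup resolves to `undefined` on a miss and both branches share one result shape. A standalone sketch, with hypothetical stand-ins for `_lookupInHar` and `_sendRequestWithRetries`:

```typescript
interface SendRequestResult { body: string; log: string[] }

const recordedResponses = new Map<string, string>([['https://example.com/hit', 'from har']]);

// Stand-in for the HAR lookup: resolves to undefined when nothing matches.
async function lookupInHar(url: string): Promise<SendRequestResult | undefined> {
  const body = recordedResponses.get(url);
  return body === undefined ? undefined : { body, log: [`HAR: served ${url}`] };
}

// Stand-in for the live path; no real request is made in this sketch.
async function sendRequestWithRetries(url: string): Promise<SendRequestResult> {
  return { body: 'from network', log: [`network: fetched ${url}`] };
}

async function fetchWithFallback(url: string): Promise<SendRequestResult> {
  // The right-hand side is only awaited when the lookup misses.
  const { body, log } = (await lookupInHar(url)) || (await sendRequestWithRetries(url));
  return { body, log };
}

const outcomes = Promise.all([
  fetchWithFallback('https://example.com/hit'),
  fetchWithFallback('https://example.com/miss'),
]);
```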
      log.push(`HAR: ${lookupResult.message ?? 'lookup failed'}`);
      continue;
    }
    if (lookupResult.action === 'noentry') {
NIT: 'missing' would be a better name here
I kept 'noentry' here — the action is part of the LocalUtilsHarLookupResult wire protocol (localUtils.yml), so renaming it would be a breaking change for cross-version client/server compatibility. Happy to rename if you'd still prefer it and are OK treating it as a protocol change.
    return {
      body: lookupResult.body ?? Buffer.from(''),
      log,
      response: {
what about securityDetails and serverAddr?
Added — both are now returned from lookup (entry._securityDetails, entry.serverIPAddress/entry._serverPort) and mapped onto the response.
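A sketch of that mapping, under assumptions: the HAR-side field names come from the reply above, while the response-side shape (`ResponseExtras`, `serverAddr.ipAddress`, `serverAddr.port`) and the `extrasFromEntry` helper are hypothetical.

```typescript
// HAR-side fields as named in the reply; the details shape is an assumption.
interface HarEntryLike {
  serverIPAddress?: string;
  _serverPort?: number;
  _securityDetails?: { protocol?: string; subjectName?: string };
}

// Hypothetical response-side shape for this sketch.
interface ResponseExtras {
  serverAddr?: { ipAddress: string; port: number };
  securityDetails?: { protocol?: string; subjectName?: string };
}

function extrasFromEntry(entry: HarEntryLike): ResponseExtras {
  const extras: ResponseExtras = {};
  // Only report a server address when both halves were recorded.
  if (entry.serverIPAddress !== undefined && entry._serverPort !== undefined)
    extras.serverAddr = { ipAddress: entry.serverIPAddress, port: entry._serverPort };
  if (entry._securityDetails !== undefined)
    extras.securityDetails = entry._securityDetails;
  return extras;
}

const fullExtras = extrasFromEntry({
  serverIPAddress: '203.0.113.7',
  _serverPort: 443,
  _securityDetails: { protocol: 'TLS 1.3' },
});
const emptyExtras = extrasFromEntry({ serverIPAddress: '203.0.113.7' });
```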
    const urlMatch = this._options.urlMatch;
    const { registrationId } = await context._channel.harForAPIRequestsStart({
      har,
      urlGlob: isString(urlMatch) ? urlMatch : undefined,
      urlRegexSource: isRegExp(urlMatch) ? urlMatch.source : undefined,
      urlRegexFlags: isRegExp(urlMatch) ? urlMatch.flags : undefined,
      notFound: this._notFoundAction,
    });
    this._apiRequestRegistrations.push({ context, registrationId });
is there a way to use a RouteHandler like addContextRoute/addPageRoute since it seems like _handle might already be able to do a lot of what you're adding in the backend (i.e. loading and looking up in a .har)?
RouteHandler/_handle fulfills a browser Route, which only exists for traffic that reaches the browser/CDP. APIRequestContext issues requests directly over Node http/https (_sendRequest) and never creates a Route, so _handle can't intercept it. This is why #11502 added HAR recording/tracing ability to APIRequestContext separately.
    for (const registration of registrations) {
      if (!urlMatches(registration.baseURL, urlString, registration.urlMatch))
        continue;
      const lookupResult = await progress.race(registration.harBackend.lookup(urlString, method, headersArray, postData, false, { apiRequestOnly: true }));
i think we may need to pass maxRedirects into this so that it stops if the limit is reached and throws an error if its exceeded instead of giving undefined
maxRedirects is now threaded into lookup → _harFindResponse. It mirrors the live path: a negative limit means "don't follow" (returns the 3xx entry as-is), and exhausting the budget throws Max redirect count exceeded instead of returning undefined. Added should throw when intercepted APIRequestContext request exceeds maxRedirects.
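The redirect budget described above can be sketched as follows. The entry shape and the `findResponse` helper are hypothetical, and `Location` values are assumed to be absolute URLs; the actual `_harFindResponse` does more (method, header, and post-data matching).

```typescript
interface RecordedEntry { url: string; status: number; location?: string }

// Follow recorded redirects until a non-3xx entry is found or the budget runs out.
// A negative budget means "don't follow"; a zero budget at a redirect throws.
function findResponse(entries: RecordedEntry[], url: string, maxRedirects: number): RecordedEntry | undefined {
  let currentUrl = url;
  let remaining = maxRedirects;
  while (true) {
    const entry = entries.find(candidate => candidate.url === currentUrl);
    if (!entry)
      return undefined; // nothing recorded for this URL
    const location = entry.status >= 300 && entry.status < 400 ? entry.location : undefined;
    if (location === undefined || remaining < 0)
      return entry; // final response, or redirects are not being followed
    if (remaining === 0)
      throw new Error('Max redirect count exceeded');
    remaining -= 1;
    currentUrl = location; // assumption: absolute Location values
  }
}

const chain: RecordedEntry[] = [
  { url: 'https://example.com/a', status: 302, location: 'https://example.com/b' },
  { url: 'https://example.com/b', status: 302, location: 'https://example.com/c' },
  { url: 'https://example.com/c', status: 200 },
];

const followed = findResponse(chain, 'https://example.com/a', 20);
const notFollowed = findResponse(chain, 'https://example.com/a', -1);

let exceededMessage = '';
try {
  findResponse(chain, 'https://example.com/a', 1);
} catch (error) {
  exceededMessage = error instanceof Error ? error.message : String(error);
}
```

Returning the matched entry also gives the caller the final URL after the redirect chain (`followed.url` here), which is what the response `url` is meant to report.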
Make the HAR-replay path for APIRequestContext behave like the live network path and tighten the server-side plumbing:
- Emit Request/RequestFinished events so a recording captured while replaying still includes the API requests.
- Apply set-cookie side-effects via addCookies, like the live path.
- Honor maxRedirects (throw when exceeded) and report the final entry URL on redirects.
- Populate statusText, securityDetails and serverAddr on the response; log via progress.log alongside the in-memory log.
- Reuse the already-open HarBackend (looked up by harId) instead of opening a second backend for the same HAR file.
- Rename to routeAPIRequestsFromHar/unrouteAPIRequestsFromHar, give the newest registration priority (unshift), and fold the registration bookkeeping into the dispatcher's _disposables.

Fixes: microsoft#22869
# Conflicts:
#	packages/playwright-core/src/client/harRouter.ts
Test results for "MCP": 7380 passed, 1122 skipped. Merge workflow run.
Test results for "tests 1": 2 failed, 3 flaky, 49151 passed, 1142 skipped. Merge workflow run.
Fixes #22869
Summary
- Adds an `interceptAPIRequests` option to `BrowserContext.routeFromHAR` so `page.request.*` / `context.request.*` calls are also served from the HAR file.
- Defaults to `false`, so the change is fully backward compatible.
- Matches only entries with `_apiRequest: true` (already written by the HAR recorder), so browser-side recordings are never served to API requests for the same URL.