
Add commands to collect and retrieve response bodies #877


Open · wants to merge 13 commits into main

Conversation

@juliandescottes (Contributor) commented Feb 13, 2025

Overview of what this PR aims to add:

  • Concept of a network body collector. A network body collector is a concept similar to intercepts and event subscriptions. Clients can add/remove collectors. In theory this should be used for both requests and responses, but it is only applied to responses in this PR. A network body collector is a struct with contexts or userContexts, and urlPatterns. All are optional, so you can potentially define a collector which matches everything (to be discussed).

New BiDi session items:

  • BiDi session has a network body collector map, similar to the intercept map. It simply stores the active body collectors.
  • BiDi session has a network maximum body size, a js-uint defining the maximum size of collected bodies.
  • BiDi session has a network response map, which contains all the collected bodies, keyed by request id. This map is stored at the session level because different sessions might have different configurations about what kind of network bodies can be collected (e.g. max size). A rough sketch of this state is included below.
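
For illustration only, here is a rough TypeScript sketch of the session-level state described above; the field names are assumptions based on this summary, not spec text:

// Hypothetical sketch of the per-session state described above.
interface BodyCollector {
  contexts?: string[];      // browsingContext ids
  userContexts?: string[];  // user context ids
  urlPatterns?: unknown[];  // network.UrlPattern values
}

interface SessionNetworkState {
  // "network body collector map": collector id -> collector definition
  bodyCollectors: Map<string, BodyCollector>;
  // "network maximum body size": js-uint limit for collected bodies
  maximumBodySize: number;
  // "network response map": collected bodies keyed by request id
  responses: Map<string, Uint8Array>;
}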

New commands:

  • new command addBodyCollector to add a new network body collector
  • new command removeBodyCollector to remove an existing network body collector
  • new command setNetworkBodyCollectorConfiguration, which can be used to set the session's network maximum body size. In the future we might have more configuration available here, which is why this sets a generic configuration.
  • also getResponseBody, which is mostly identical to the one in Add a command to get response body #856. It defaults to base64 at the moment; we probably want to make it easier to receive the body as a string if possible? (but I preferred to keep this command as close to the existing PR as possible). Example payloads are sketched below.
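
To make the shape concrete, here is a hedged sketch of what the command payloads might look like; the method names follow the PR description, and the exact parameter shapes are assumptions:

// Illustrative payloads only; parameter shapes are not final spec.
const addBodyCollectorCommand = {
  method: "network.addBodyCollector",
  params: {
    userContexts: ["default"],
    urlPatterns: [{ type: "pattern", pathname: "/api/*" }],
  },
};

const getResponseBodyCommand = {
  method: "network.getResponseBody",
  params: { request: "<request id from a network event>" },
};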

New error:

  • new error no such body collector, for removeBodyCollector

Updates to existing commands:

  • When a response is caught in network.responseCompleted, we attempt to collect the body if it is related to a navigable
  • On navigation committed we remove the bodies of all responses linked to this navigable
  • On context destroyed we also remove the bodies of all responses linked to this navigable

Note that I haven't added extra limitations to which responses are collected in responseCompleted, but we can definitely add them (eg no worker requests etc...)


Preview | Diff

@juliandescottes (Contributor Author)

@OrKoN @jgraham I was not sure how (or if?) I could update PR #856, so I just created a new one here.
Please take a look at the summary before looking at the patch, you might already have comments on the overview before diving into the details :)

@OrKoN (Contributor) commented Feb 13, 2025

Thanks for the PR. I think we do not have clear requirements that any clients need the functionality provided by addBodyCollector so we could exclude it for now (unless someone needs it?). At least I would not add browsing contexts params in the same way as we have it in event subscriptions (when context id resolves to the top-level traversable). I think we need an ability to define the overall size limit instead of (in addition?) a per-request limit in setBodyCollectorConfiguration (instead of just not saving the freshest request we should probably evict earlier requests).

@OrKoN (Contributor) commented Feb 13, 2025

Note that I haven't added extra limitations to which responses are collected in responseCompleted

I am thinking if in my initial draft I should have started collection in responseStarted (I think that would actually be required for interception use cases?)

@juliandescottes (Contributor Author)

Thanks for taking a look!

Thanks for the PR. I think we do not have clear requirements that any clients need the functionality provided by addBodyCollector so we could exclude it for now (unless someone needs it?). At least I would not add browsing contexts params in the same way as we have it in event subscriptions (when context id resolves to the top-level traversable).

I'll wait for feedback from James here, in case that doesn't align with his feedback from PR #856 , but I thought that was one of the main required changes? Having a way to clearly declare whether you want to record responses or not. And if we do I think it makes sense to make it consistent with all our other similar APIs (events and intercepts) (note: intercepts don't have user context support yet, but they really should).

I think we need an ability to define the overall size limit instead of (in addition?) a per-request limit in setBodyCollectorConfiguration (instead of just not saving the freshest request we should probably evict earlier requests).

Yeah I'm happy to update the configuration bit with a total size + FIFO approach to evict requests, let's see if there are any other requested flags/limits.

I am thinking if in my initial draft I should have started collection in responseStarted (I think that would actually be required for interception use cases?)

Maybe we should create the entry as early as beforeRequestSent, and have a "state" in the collected response (pending, available, evicted ...)
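
For instance, a purely illustrative TypeScript sketch of such an entry with a lifecycle state (names are assumptions, not spec text):

// Hypothetical collected-response entry with a lifecycle state.
type CollectedBodyState = "pending" | "available" | "evicted";

interface CollectedResponseEntry {
  request: string;            // request id
  state: CollectedBodyState;  // "pending" from beforeRequestSent until responseCompleted
  body?: Uint8Array;          // only present when state === "available"
}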

@juliandescottes (Contributor Author)

One thing I wanted to mention re: contexts/userContexts in addBodyCollector.

On our side, considering our current implementation, it is important to have an API where clients can be selective upfront about which requests they are interested in. To record responses, Firefox duplicates them in another (parent) process. This means it's easier for us to control the availability of responses, but we probably use up more memory than Chrome does.

On the client side, if you are only interested in one class of requests coming from a specific tab and you can't define the contexts/userContexts to watch, then you have to fiddle with the "total size" configuration and hope that the requests you are interested in are not going to be evicted first?

Puppeteer and other clients can still just call it without any argument in the beginning? But considering this API is consistent with our subscription and intercept APIs, and seems beneficial for clients, I would still like us to consider it.

@jgraham (Member) commented Feb 17, 2025

I agree with Julian. Given that this feature has potentially high overhead in terms of resource usage it seems important to be able to turn it on and off with more granularity than just "did anyone subscribe to any network events and so could have a response id to allow reading the body" (even that optimization is a little hard because in theory as a pure optimisation one would need to keep the data around until the responseCompleted event was emitted just in case someone started an event subscription after the response was started and before it was completed. Of course one could specify that that doesn't work, but it would certainly be surprising to users if the general model is "I have a response id therefore I can get the response body").

I also agree that if we're adding a maximum total cache size it's even more important that you can be specific about which response bodies you're interested in.

My assumption is that test automation clients currently don't offer this kind of control because they have been based around CDP's semantics. It seems reasonable to me to assume tests generally know when they will want to access the network bodies and so an API based on opt-in is reasonable. We also know that devtools, for example, do collect network response bodies one tab at a time, and that that kind of use case would be severely compromised if there was only global control over retention (e.g. if I'm trying to record an interactive browsing session in one tab for later replay it's extremely important to me that everything in that tab ends up being accessible, and I actively don't want anything that happens in other tabs to affect it).

@OrKoN (Contributor) commented Feb 17, 2025

CDP does offer max size control per target (local subtree of frames). My point was mostly that limiting by URL patterns or max response body size does not seem very useful: most clients want everything in a tab even if they know what URL patterns or individual response sizes they are dealing with. I think the current API very easily allows scenarios like "oh I set max size per response body to 10KB so my 10.1KB response was not recorded and I need to re-run everything" or "I have not realized that I needed bodies for these URLs". Most users would just set something like 99999MB per response body and use a match-all for URLs.

As for URL pattern matching, matching by media type sounds even more useful as a first filter. Do we have any specific clients interested in fine-grained controls beyond per-navigable total limits? If not, I would propose to simplify the proposal by adding context ids to network.SetBodyCollectorConfiguration, making it per-context or global, and changing maximumBodySize to maximumTotalBodySize (I believe most clients would just be using that, and we can reduce the amount of specification and implementation needed without blocking an extension with fine-grained filtering in the future). That would require partitioning the cache per navigable (or even per document), but it looks like we would need that for cleanup as well (if we agree on the current cleanup points).

@jgraham (Member) commented Feb 17, 2025

I agree that URL patterns could be dropped in the first pass, as long as we keep contexts and user contexts.

My concern with just having a maximum total size, and no other filtering, is the case where you have a page with a few large assets that you're not interested in, but which might cause cache eviction of the small assets you are interested in. For example on a media viewing page where you might have some pictures or videos that are hundreds of megabytes, when your test is entirely concerned with checking some data in a few kB of HTML or JSON.

Without a URL (or MIME type) filter we can't easily avoid the overhead of copying that data to the parent process (at least up to the size limit), but we can avoid requiring people to set the maximum cache size to the size of all the resources on the page rather than a per-resource limit of (say) 100kB.

@juliandescottes (Contributor Author)

Thanks for the feedback!

Trying to summarize where we are:

  • all ok with adding a maximumTotalBodySize to the configuration
  • all ok with dropping URL patterns in the first iteration
  • needs agreement about keeping maximumBodySize
  • needs agreement about the API (drop add/removeBodyCollector in favor of just having setBodyCollectorConfiguration)

My comments on this:

1/ For the URL patterns, I agree we can drop them, but from our discussion it sounds like we want some way to exclude requests instead. Would excludedURLPatterns be more useful? Or do we want to design something to exclude requests based on specific fields of the network event, e.g. mimeType, bodySize, etc. In any case it sounds like we can keep it for a next iteration.

2/ For maximumBodySize:

I imagined this should be used to set a reasonably high (few MBs) limit to individual requests to avoid having the whole storage for response bodies taken up by just a few random requests (as mentioned by @jgraham ). In Firefox DevTools we have a cap for individual responses to avoid storing unreasonably large responses (1MB by default, can be changed with a pref). I think it's worth having an explicit limit, but maybe it should have a default value. And maybe that should rather be a capability. I would like to keep a clearly defined limit and allow clients to override it if needed. On Firefox side I don't think we can handle duplicating huge responses in the parent process for BiDi, we will have to implement a cap anyway.

3/ API: only add setBodyCollectorConfiguration (or another name :) )

I imagine the behaviour would be close to setCacheBehavior. When setting for global, it overrides all previously defined context/user context configurations. When setting for a context/user context it will potentially preserve the previous body collector configurations set for other contexts/user contexts. This brings some questions:

  • If you can set a maximum total size / maximum size at the same time, then does this setting only apply to the context/user contexts provided in the command? Imagine you first set a configuration for context "12" and then set another configuration for userContext "foo" which contains context "12". Should the user context configuration override the configuration for context "12"?
  • How can a client stop collecting network bodies? If I set a configuration for context 12, which command can we use to stop it?

While it does simplify the API, it feels like a step back towards what we previously had for subscriptions. A model where we create unique collectors that can each be removed on their own feels less surprising?

@OrKoN Let me know what you think, maybe you have suggestions on how a single setBodyCollectorConfiguration could fit those scenarios?

@OrKoN (Contributor) commented Feb 18, 2025

Thanks for summarizing. I am still not sure if we have a client with a use case right now for limiting response storage based on specific attributes of the request/response. I see that Playwright's model for Firefox is also based on the total size with eviction (I could not tell if it is per navigable or global?). Therefore, I think it would be a reasonable model to say that as a client you get the last X bytes of response data stored per navigable that you enabled the collection for? Eventually, if there are users requesting fine-grained control on a per-request basis, it could be added on top of that model.

As for how the configuration command should work I would say, unlike event subscriptions, we could make it so that the last command always wins.

setBodyCollectorConfiguration(maxCacheSize) # sets maxCacheSizePerNavigable for all navigables in all user contexts, new navigables inherit from the session
setBodyCollectorConfiguration(maxCacheSize, userContexts) # sets maxCacheSizePerNavigable for all navigables in specified userContexts, new navigables inherit from the specified user context if they are created in it and from the session otherwise.
setBodyCollectorConfiguration(maxCacheSize, browsingContexts) # sets maxCacheSizePerNavigable for specified browsingContexts only

Basically, at any time the session, each user context, and each browsing context has a maxCacheSizePerNavigable value that is either the result of a configuration call or inherited from the "parent" object if a navigable/user context is newly created. So to stop collecting any responses the client could send setBodyCollectorConfiguration(maxCacheSize=0). I do not currently see that we would need the same mechanism for undoing configuration calls call-by-call as we have for event subscriptions, so indeed it would be similar to setCacheBehavior.
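
A minimal TypeScript sketch of that "last call wins + inheritance" model, assuming the hypothetical maxCacheSizePerNavigable naming (not spec text):

// Illustrative only: resolution of the effective per-navigable limit.
class BodyCollectorConfiguration {
  private sessionLimit = 0; // 0 = collect nothing
  private userContextLimits = new Map<string, number>();
  private navigableLimits = new Map<string, number>();

  set(maxCacheSize: number, opts: { userContexts?: string[]; contexts?: string[] } = {}): void {
    if (opts.contexts) {
      for (const context of opts.contexts) this.navigableLimits.set(context, maxCacheSize);
    } else if (opts.userContexts) {
      for (const userContext of opts.userContexts) this.userContextLimits.set(userContext, maxCacheSize);
    } else {
      // Global call: overrides previously set per-context/per-user-context values.
      this.sessionLimit = maxCacheSize;
      this.userContextLimits.clear();
      this.navigableLimits.clear();
    }
  }

  // Effective limit for a navigable created in a given user context.
  limitFor(navigable: string, userContext: string): number {
    return (
      this.navigableLimits.get(navigable) ??
      this.userContextLimits.get(userContext) ??
      this.sessionLimit
    );
  }
}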

So it sounds to me that maxCacheSize for the entire session would not be that useful, but maxCacheSizePerNavigable as described would be fine without fine-grained per-request controls?

On Firefox side I don't think we can handle duplicating huge responses in the parent process for BiDi, we will have to implement a cap anyway.

I wonder if you would still need to duplicate the responses if the removal happens at the points proposed in this PR (responses do not outlive the navigable)?

@juliandescottes (Contributor Author) commented Feb 18, 2025

Right, we can set the total size to 0. It feels a bit like a workaround? Maybe having a clean command to really stop collecting bodies wouldn't hurt?

About making the max size a per-navigable configuration: it makes it easier to work with multiple calls to setConfiguration.

setConfiguration(maxCacheSize=1000)
setConfiguration(userContexts=["foo"], maxCacheSize=5000)
setConfiguration(contexts=["12"], maxCacheSize=10000)

In that case, by default navigables have an allowed cache size of 1000, the ones in "foo" have 5000, and navigable 12 has 10000.

It does mean that there's no effective max cache size anymore though. Users may create new navigables and fill the browser memory. I imagine that's not a concern in practice, but it's important to note that we can't keep this under control with this approach. At least in this model we don't have to wonder whether a request takes up space in the cache configured for its navigable or globally; the cache size is always allocated per navigable, and that seems nice.

I wonder if you would still need to duplicate the responses if the removal happens at the points proposed in this PR (responses do not outlive the navigable)?

Not really, it would require too many changes to our network events monitoring, which is almost entirely handled in the parent process for devtools/bidi. Also we should keep the door open to relax those limitations in the future, it would be great if responses could only be evicted when the top level traversable navigates / or is destroyed.

Which means I would still like to keep this configurable. Worst case scenario this could be driven by a preference + NOTE that implementations might truncate long response bodies, but I would really prefer having something consistent across browsers here.

@juliandescottes (Contributor Author)

Sidenote: I notice that CDP supports maxTotalBufferSize/maxResourceBufferSize, so unless I'm mistaken you already should have support for a per resource limit on CDP side?

@OrKoN (Contributor) commented Feb 20, 2025

Sidenote: I notice that CDP supports maxTotalBufferSize/maxResourceBufferSize, so unless I'm mistaken you already should have support for a per resource limit on CDP side?

indeed, in Puppeteer we have not used it so far though. In issues where people want increased limits they usually set it as high as the total size available so I am not sure how useful it is to guess how large individual responses could be.

@juliandescottes (Contributor Author)

Before reviewing the PR in detail - I'm sure there are still syntax mistakes not worth fixing for now - let's summarize the current state and get feedback on the overall approach.

Session changes:

  • the session has navigable network collector configurations (a map), user context network collector configurations (a map) and a global network configuration, which store the various configurations clients can set for collecting bodies.
  • said configurations contain two numbers: max total size and max resource size
  • the session has a list of collected responses, each containing (navigable id, navigation, request id and response). It's a list because ordering matters for eviction.

New command:

  • setBodyCollectorConfiguration(userContexts, contexts, maxTotalBodySize, maxResourceBodySize). Similar to setCacheBehavior in the sense that you need to be careful with the order in which you call the API. Calling it globally erases individual configurations set for userContexts/contexts, calling it for userContexts erases the configuration for contexts, etc. There's no explicit way to completely stop collecting bodies; you need to set the sizes to 0.

Updates to existing events:

  • When a response is caught in network.responseCompleted, we attempt to collect the body if it is related to a navigable:
    • If a collector configuration is set, then a collected response struct will be added to the collected responses list.
    • But it will only preserve the actual response if it matches the maxTotalBodySize/maxResourceBodySize limits.
    • Then we calculate the remaining size available for the navigable based on already collected responses for this navigable, and evict the oldest ones until there is enough room available (the algorithm is really not efficient, but I was trying not to go into too many details at the spec level; implementations can and should handle this differently). A rough sketch is included after this list.
    • The collected response will contain the navigable id as well as the navigable's ongoing navigation if available
  • On navigation committed we remove the bodies of all responses linked to this navigable, unless it has the same navigation id as the one provided to navigation committed.
  • On context destroyed we also remove the bodies of all responses linked to this navigable
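
Here is a hedged TypeScript sketch of that per-navigable collection/eviction step; the names and the (deliberately naive) algorithm are illustrative, not the spec text:

// Illustrative FIFO eviction for one navigable's collected responses.
interface CollectedResponse {
  navigable: string;
  request: string;
  body: Uint8Array | null; // null once out of limits or evicted
}

function collect(
  collected: CollectedResponse[],  // ordered oldest-first
  entry: CollectedResponse,
  maxTotalBodySize: number,
  maxResourceBodySize: number
): void {
  // Keep the record but drop the body if it exceeds the per-resource limit.
  if (entry.body && entry.body.byteLength > maxResourceBodySize) {
    entry.body = null;
  }
  collected.push(entry);
  const totalFor = (navigable: string) =>
    collected
      .filter((r) => r.navigable === navigable)
      .reduce((sum, r) => sum + (r.body ? r.body.byteLength : 0), 0);
  // Evict the oldest bodies for this navigable until the total fits.
  for (const r of collected) {
    if (totalFor(entry.navigable) <= maxTotalBodySize) break;
    if (r.navigable === entry.navigable && r.body) r.body = null;
  }
}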

@juliandescottes force-pushed the pr-856 branch 3 times, most recently from a677a1c to bec5f96 on February 21, 2025 09:06
@juliandescottes (Contributor Author)

@OrKoN In the last update I tried to simplify the API to only keep one method as suggested. While this works, I'm not sure this is really a good decision at the spec level.

It's functionally very close to what we had with the previous proposal, but is less flexible and more sensitive to the order in which commands are called. With the previous approach we have something that can naturally evolve to support url patterns and more fine grained configurations.

If libraries such as puppeteer prefer to only expose a simplified API, it should still be possible. Could we reconsider?

@OrKoN (Contributor) commented Feb 21, 2025

@OrKoN In the last update I tried to simplify the API to only keep one method as suggested. While this works, I'm not sure this is really a good decision at the spec level.

It's functionally very close to what we had with the previous proposal, but is less flexible and more sensitive to the order in which commands are called. With the previous approach we have something that can naturally evolve to support url patterns and more fine grained configurations.

If libraries such as puppeteer prefer to only expose a simplified API, it should still be possible. Could we reconsider?

I think in the previous proposal there was also a configuration method for limits and additional per-URL configuration methods. Could you please clarify how the current proposal would limit the addition of the per-URL configuration methods?

@jgraham (Member) commented Feb 21, 2025

I agree with @juliandescottes here; I feel like in this proposal the obvious things that a user might want to do (enable/disable collecting response bodies for some tab or user context) are exposed as side effects of configuring low-level details (cache sizes).

I do think we need that level of configuration, but I'd prefer an API where the methods correspond to user intent, and where we can have reasonable defaults for the various tuning parameters.

@juliandescottes (Contributor Author)

@OrKoN In the last update I tried to simplify the API to only keep one method as suggested. While this works, I'm not sure this is really a good decision at the spec level.
It's functionally very close to what we had with the previous proposal, but is less flexible and more sensitive to the order in which commands are called. With the previous approach we have something that can naturally evolve to support url patterns and more fine grained configurations.
If libraries such as puppeteer prefer to only expose a simplified API, it should still be possible. Could we reconsider?

I think in the previous proposal there was also a configuration method for limits and additional per-URL configuration methods.

In the previous approach you had one method to set a global configuration (only resource max size, but can easily add total max size as well). Then add/removeBodyCollector was used to select in which contexts/userContexts user wanted to collect bodies, with an optional urlPattern (which can still be dropped in a first iteration).

Could you please clarify how the current proposal would limit the addition of the per URL configuration methods?

I find the current proposal harder to understand as is. You need to be aware that the order in which you call the command is important, and you might erase configurations unexpectedly. But it still remains relatively easy to predict how it's going to work without reading the spec.

Now if we add url patterns, there are a few things to answer. If context 12 is listening www.a.com, and I want to also listen to www.b.com, how can I do it? When we set the configuration again for this context, does it add to the existing pattern? Does it override it?

Then imagine we catch all requests globally with a cache size of 1000, and for context 12 we only cache requests for JS, but with a cache size of 2000. If there's a request in context 12 which is not JS and doesn't match, does it still get captured because we capture all requests globally? If so, which cache size should be used?

We can answer all those questions in the spec, but I'm still concerned it will make the behavior unexpected, whereas the API where you add and remove collectors is very simple to understand.

@OrKoN (Contributor) commented Feb 21, 2025

I can see a concern but I am not sure it's worse than the behavior of network.setCacheBehavior. I'd say the latest version aligns more with network.setCacheBehavior.

In the previous approach you had one method to set a global configuration (only resource max size, but can easily add total max size as well). Then add/removeBodyCollector was used to select in which contexts/userContexts user wanted to collect bodies, with an optional urlPattern (which can still be dropped in a first iteration).

should resource max size and max total size per navigable be part of the add/removeBodyCollector methods?

@OrKoN (Contributor) commented Feb 21, 2025

should resource max size and max total size per navigable be part of the add/removeBodyCollector methods?

if these settings are not part of the add/removeBodyCollector methods, then changing these limits via the global configuration is similar to this proposal in the sense that it would remove/add things from the cache otherwise handled by the body collector.

@OrKoN (Contributor) commented Feb 21, 2025

I agree with @juliandescottes here; I feel like in this proposal the obvious things that a user might want to do (enable/disable collecting response bodies for some tab or user context) are exposed as side effects of configuring low-level details (cache sizes).

would changing the current version's configuration to accept a cacheBehavior: "store" / "do-not-store" and limits being made optional address this concern?

@jgraham (Member) commented Feb 21, 2025

So, I think an API like:

network.CollectResponseBodies = {
  enabled: bool,
  type: "text", ; Eventually allow "stream"
  ? contexts: [+browsingContext.BrowsingContext],
  ? userContexts: [+browser.UserContexts],
  ? cacheSize: js-int .default 10240
  ? maximumBodySize: js-int .default 1024
}

where cacheSize is per top-level traversable is in-principle OK just from the point of view of "does it map the obvious user operations to specific methods / parameters".

However as has been discussed, the limitation of this kind of design is that the only sensible update behaviour is an overwrite; basically each top level traversable gets a single response body configuration that's inherited from its user context at creation, and subsequent commands overwrite the configuration (purging the cache as necessary to match).

For network.setCacheBehavior that's basically enforced; the network cache is a browser level thing that we don't have much control over.

In this case it's a WebDriver-layer feature, so arguably there's value in looking at how we've designed other similar WebDriver features. In particular it seems very similar to network.addIntercept. That kind of design feels like it works pretty well for everything except perhaps total cache size:

network.BodyPhase = "request" / "response"

network.AddBodyCollector = {
  phases: [+network.BodyPhase]
  type: "text",
  ? contexts: [+browsingContext.BrowsingContext],
  ? userContexts: [+browser.UserContexts],
  ? maximumBodySize: js-int .default 1024
}

network.RemoveBodyCollector = {
  collector: network.CollectorId
}

That does match the way that we've tried to vend handles to things rather than just mutate global state, so that code can only remove what it created. For example a test fixture would be able to enable/disable collecting response bodies without having to worry about whether it had also been enabled by some other component. That significantly simplifies client code.
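
As an illustration of that point, a hypothetical client-side fixture could look like this; "session.send" stands in for whatever transport the client uses, and the command names follow the sketch above:

// Hypothetical fixture: it only removes the collector it created itself.
async function withResponseBodies<T>(
  session: { send: (method: string, params: object) => Promise<any> },
  context: string,
  body: () => Promise<T>
): Promise<T> {
  const { collector } = await session.send("network.addBodyCollector", {
    phases: ["response"],
    contexts: [context],
  });
  try {
    return await body();
  } finally {
    // Collectors added by other components are left untouched.
    await session.send("network.removeBodyCollector", { collector });
  }
}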

I think there's a complex version of this where we try to only provide a body if it matched the limits of a certain intercept i.e. in network.getResponseBody you have to provide the intercept id, and it only returns the body if it's part of that configuration. But that seems like it could be overkill; even with just size the configuration can be regarded as additive: at least the bodies matching your configuration will be available, if there was other configuration other bodies may also be available. In which case we could make a cacheSize parameter work pretty simply: the cache is sized at the maximum of any configured size.

FWIW, when I started writing this I was thinking that network.CollectResponseBodies could be OK. But the more I thought about why we've moved every other configuration API (events, preload scripts, intercepts) to give out handles rather than requiring knowledge of global state, the more I was convinced that the exact same concerns apply here, and we should favour that model.

@OrKoN (Contributor) commented Feb 21, 2025

So if multiple AddBodyCollector calls are made for the same objects (contexts), which one applies? I think with events and intercepts the situation is different because no matter how many you have the result is a boolean outcome (is the event emitted, is the request intercepted). I think AddBodyCollector is fine as long as the resolution of the effective configuration happens at the AddBodyCollector call time and does not need to be re-computed per request. WDYT?

@OrKoN (Contributor) commented Feb 21, 2025

So if multiple AddBodyCollector are made to the objects (contexts) which one applies? I think with events and intercepts the situation is different because no matter how many you have the result is a boolean outcome (is event emitted, is request intercepted). I think AddBodyCollector is fine as long as the resolution of the effective configuration happens at the AddBodyCollector call time and does not need to be re-computed per request. WDYT?

Perhaps a solution could be that if there are any added body collectors that the request matches, then the body is retained but I am worried that the network overhead would grow linearly with the size of those collectors.

@juliandescottes (Contributor Author)

So if multiple AddBodyCollector are made to the objects (contexts) which one applies? I think with events and intercepts the situation is different because no matter how many you have the result is a boolean outcome (is event emitted, is request intercepted). I think AddBodyCollector is fine as long as the resolution of the effective configuration happens at the AddBodyCollector call time and does not need to be re-computed per request. WDYT?

My initial proposal was the same: a body collector would just tell you whether or not the body should be persisted. In James' proposal there is a slight change to also make the max resource size part of this configuration (but not the total size). I think the suggestion on how to handle this in case several are matching is to pick the max value from all matching collectors:

even with just size the configuration can be regarded as additive: at least the bodies matching your configuration will be available, if there was other configuration other bodies may also be available. In which case we could make a cacheSize parameter work pretty simply: the cache is sized at the maximum of any configured size.

Personally I would slightly prefer to have all those size settings as global, but that can work too.

Perhaps a solution could be that if there are any added body collectors that the request matches, then the body is retained but I am worried that the network overhead would grow linearly with the size of those collectors.

Yeah I think this matches what James was suggesting. I don't quite understand how this would lead to more overhead than another proposal though?

@OrKoN (Contributor) commented Apr 11, 2025

I think we can avoid requiring another ID to read request data (if the client knows the request ID and it's in the cache we should return it).

I think requiring a filterId to remove data from the cache would be fine but it should also be fine for the agent to release oldest (by insertion) requests even if there are active filters (I think this is the behavior of the current PR version). By the "APIs for removing items" do we mean only the explicit disownItem or also the removal of a filter? I think it would be nice if the removal of the filters didn't affect already cached data but I think we can connect removal of the filter with the disposal of related data.

@juliandescottes (Contributor Author)

By the "APIs for removing items" do we mean only the explicit disownItem or also the removal of a filter?

To be discussed. My idea was to have an explicit command to release/disown a body from a specific collector/filter, and also to remove bodies when all their related collector/filters are gone (either because they've been disowned, or because the collector/filters have been removed).

Basically the body internally keeps a list of collector/filter ids. When it is disowned for a given collector/filter id, the id is removed from this list. When a collector/filter is removed, all bodies are updated to remove the corresponding collector/filter id from their list. When the list of collector/filter ids for a body is empty, the body is deleted.
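
A minimal TypeScript sketch of that reference-counting model (names are illustrative, not spec text):

// Each collected body tracks the collector/filter ids that matched it.
class CollectedBodies {
  private entries = new Map<string, { body: Uint8Array; collectors: Set<string> }>();

  disown(request: string, collector: string): void {
    const entry = this.entries.get(request);
    if (!entry) return;
    entry.collectors.delete(collector);
    // Once no collector references the body anymore, it is deleted.
    if (entry.collectors.size === 0) this.entries.delete(request);
  }

  removeCollector(collector: string): void {
    // Removing a collector releases its reference from every body it matched.
    for (const [request, entry] of this.entries) {
      entry.collectors.delete(collector);
      if (entry.collectors.size === 0) this.entries.delete(request);
    }
  }
}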

I think it would be nice if the removal of the filters didn't affect already cached data but I think we can connect removal of the filter with the disposal of related data.

Does this mean you're fine with both options, but would prefer that removing a collector/filter never removes bodies?

If the collector/filter id is optional, what should be the behavior of the command then? Imagine a body, which matched 2 collector/filter ids (assuming the command is called disown, let's settle naming later):

  • if disown is called without any collector/filter id, the body should be removed unconditionally
  • if disown is called with a collector/filter id which doesn't match, nothing should happen/an error should be raised
  • if disown is called with a matching collector/filter id ... should it just remove the collector/filter id from the list of matching collectors for this body, or should it fully remove the body. I imagine the latter? I think this makes the argument a bit irrelevant, it just acts as a guard to avoid reading/deleting bodies which didn't match your collector, but you will still potentially be deleting bodies still relevant for other collectors

should also be fine for the agent to release oldest (by insertion) requests

Do you mean a command to remove N old bodies starting from the oldest? Clients don't really know which bodies are stored or have been evicted already, do you have a scenario in mind where this could be useful?

@OrKoN (Contributor) commented Apr 14, 2025

I meant that we need to make the filter/collector id optional when retrieving the body. Having it required when the filter is removed or a body is disowned sounds fine if we basically want to count references to the cached bodies.

Do you mean a command to remove N old bodies starting from the oldest? Clients don't really know which bodies are stored or have been evicted already, do you have a scenario in mind where this could be useful?

I meant that the spec should still say that the user agent/browser/remote end removes oldest items if the limit is exceeded even if they have matching filters/collectors.

@jgraham (Member) commented Apr 14, 2025

I meant that the spec should still say that the user agent/browser/remote end removes oldest items if the limit is exceeded even if they have matching filters/collectors.

This seems fine, but the fact that it's a question seems like maybe there's still some misunderstanding. Clearly the cache has to have some limited size, and I think we all agree that FIFO semantics are fine for that. However in addition the proposed design allows you to remove items from the cache early by calling disown on the filter ids that matched them, or by removing all the filters that matched them. In those cases eviction is (quasi) synchronous, so the question of how we handle size limits never comes up.

One scenario that isn't covered by this design is "I have a number of responses matching some filter, but after some point I want to stop collecting new responses, but still have the browser responsible for caching the data until I retrieve it". Arguably that's a bad pattern (because you can't really trust the browser cache), but I can imagine a scenario where you want to store one response of a certain kind but then not process it until the end of a test or similar. Obviously one can solve this by having an on/off switch for filters. I'm not suggesting that for the first version, but it is a scenario that isn't well handled.

@juliandescottes (Contributor Author)

While updating the PR, I wonder if we should still attempt to remove collected network bodies for a given navigable on navigation? I feel like with a single remote end cache it doesn't make sense anymore.

Imagine two tabs, one with a single page app which never navigates, the other which regularly navigates. I think the collected bodies should just reflect the order in which responses have been captured, there's no reason to remove the ones for the second tab for each and every navigation.

What about when a navigable is destroyed? Similarly, I feel like we should not remove the bodies since we have a single remote end cache.

In other words the response bodies should no longer be tied to the lifecycle of the context/navigable where they were captured. Let me know what you think @OrKoN @jgraham

@OrKoN (Contributor) commented Apr 15, 2025

I agree that we should not now tie the lifetime of the cache to the lifecycle of a navigable.

@juliandescottes (Contributor Author)

I haven't renamed collector to filter yet, please take a look at the new version @OrKoN @jgraham first.
I'm not convinced filter is a better fit than collector, but mostly I wanted to limit the amount of renaming for now :)

@OrKoN (Contributor) commented Apr 23, 2025

Update: I continue to review but need to discuss with the team so expect a review next week.

P.S. sorry for slowness.


1. Let |already collected bodies| be an empty list.

1. For each |collected body| in [=collected network bodies=] in order:
Contributor

I don't see the algorithm ever removing bodies from [=collected network bodies=]? I would expect empty records to be removed from the list?

Contributor Author

For now I'm only removing the data, so that the getNetworkBody command can throw either no such network body or unavailable network body. If I fully removed the items, an evicted network body would have to be handled in the same way as a body which was never collected or an invalid request id.

I feel like it's nice to provide different errors here? But it can be implemented differently of course.


1. Let |processBodyError| be this step: Do nothing.

1. [=Fully read=] |collected body|’s <code>cloned body</code> given |processBody| and |processBodyError|.
Contributor

To clarify: the body would be the raw network body before i.e. content decoding? or is the fetch spec doing the decoding for us?

Contributor Author

Good question.

Based on https://fetch.spec.whatwg.org/#http-network-fetch

Step 16.1.1

The incoming bytes seem to be decoded on the fly via https://fetch.spec.whatwg.org/#handle-content-codings and added to buffer which is used by the pullAlgorithm for the response body's stream. So I think fetch is doing the decoding for us here, and reading the stream should directly give access to decoded content?
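
As a quick illustration of this from the JavaScript side (not tied to the spec text): reading a fetch response body already yields content-decoded data, even when the server used Content-Encoding: gzip:

// response.text() returns the decoded (un-gzipped) body; the raw encoded
// bytes are not exposed through the body stream.
const response = await fetch("https://example.com/data.json");
const clone = response.clone();   // keep the original stream usable
const text = await clone.text();  // decoded content
console.log(response.headers.get("content-encoding"), text.length);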

Contributor

I wonder if that is what we want for the use cases such as HAR generation? Would we have a content encoding set to something like gzip and the body to actually not be gzipped anymore?

Contributor Author

I don't think this is an issue for har. Gzipped content that can be decoded as text is usually just saved as decoded text in HAR files. And other resources such as images will typically be base64 encoded.

See HAR "spec":

Before setting the text field, the HTTP response is decoded (decompressed & unchunked), than trans-coded from its original character set into UTF-8. Additionally, it can be encoded using e.g. base64. Ideally, the application should be able to unencode a base64 blob and get a byte-for-byte identical resource to what the browser operated on.

Encoding field is useful for including binary responses (e.g. images) into the HAR file.
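
For reference, a typical HAR 1.2 response.content entry looks roughly like this (illustrative values only):

// Textual resources store decoded text; binary resources use base64.
const textContent = {
  size: 1284,
  mimeType: "application/json",
  text: '{"ok":true}',        // decoded, even if the response was gzipped
};

const imageContent = {
  size: 35021,
  mimeType: "image/png",
  text: "iVBORw0KGgoAAA...",  // base64 blob
  encoding: "base64",
};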

Contributor Author

We discussed this on the side, and implementations should be allowed to store responses encoded or decoded, but we will still hook into fetch's stream to read the decoded response from a spec point of view.

index.bs Outdated
@@ -7708,6 +8246,8 @@ completed</dfn> steps given |request| and |response|:
[=request/client=]. Otherwise let |related navigables| be an empty
set.

1. [=Maybe collect network response body=] with and |request|.
Contributor

@juliandescottes could you remind me if this is per-redirect hop or is it the final response for the redirect chain?

Contributor Author

Good catch, that would actually be once per redirect hop, which means we can't use the request id as the only identifier to store and get bodies.

I guess one option would be to skip recording this for intermediary redirects, but we won't be able to do this for request bodies... So I should add redirect to network-body and update getNetworkBody to also expect the redirect.

We should maybe go back to the discussion at #722 (comment) to introduce a single id rather than having to build a composite key from request id + redirect counter + authentication counter ... For request bodies for instance, not having the authentication attempt counter might also be a blocker, as we might have several distinct request bodies but without any way to differentiate them ...
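
For context, the composite identification being discussed would look roughly like this (hypothetical, not spec text); the alternative is the single opaque id from #722:

// Hypothetical composite key for a body within a request's redirect/auth chain.
interface NetworkBodyKey {
  request: string;        // network request id
  redirectCount: number;  // which hop in the redirect chain
  authAttempt: number;    // which authentication attempt
}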

Contributor

I guess one option would be to skip recording this for intermediary redirects, but we won't be able to do this for request bodies...

Why would we not be able to do this for request bodies? I think HTTP redirects would not have request bodies?

Contributor Author

I guess one option would be to skip recording this for intermediary redirects, but we won't be able to do this for request bodies...

Why would not we be able to do this for request bodies? I think HTTP redirects would not have request bodies?

I assume we would record in beforeRequestSent, so we won't have a way to know there if the request will be a redirect or not? of course we could just not do anything and let the latest request/response body for a given request id overwrite whatever came before?

Contributor

So for requests, we probably want the initial request's body, and for the response we want the latest response body? Otherwise, we need to solve the HTTP redirect + auth identification problem.

Contributor Author

OK, so it sounds like we can overwrite for responses?

For redirects we could still have a scenario where the first request and response have been received, but the response for the redirect has not been created yet. So at that point in time, clients might be able to get the response body for the original request. But if they called it a bit later, they would get the redirect's response?

So maybe we should actively skip collecting bodies for redirect statuses? And I'm now thinking it would be great if getNetworkBody would wait for the response body to be collected before returning, as long as the response has already been cloned?

For auth we only send one responseCompleted event, so we would only collect the body once.

Member

Yes, that all makes sense. For redirects we should always know it's a redirect from the headers before we have any of the body, so as you say we should be able to skip collecting the body entirely.

Agreed that the command should return once the full body is available. If people are doing something like long-polling they'll need to wait for streaming support.

Contributor

I think it would be fine to store the original request body and the final response only and make it explicit in the spec since I am not sure how accessible redirect bodies are in our implementation. It sounds like if that becomes a problem we can add an additional parameter to identify the redirects when fetching bodies?

Contributor Author

@jgraham

Just want to make sure I understand your sentence correctly

Agreed that the command should return once the full body is available. If people are doing something like long-polling they'll need to wait for streaming support.

My current spec PR throws an error if the network body has not been collected yet when getNetworkBody is called. In my mind this was fine because it was tied to responseCompleted, and I figured clients would wait for responseCompleted before calling getNetworkBody.

But for redirects it's more tricky because there will be several responseCompleted events. So my proposal was to update getNetworkBody to explicitly wait for the body to be collected, as long as the request id & network body type are valid. This would allow clients to call getNetworkBody without waiting for responseCompleted, and also to call it at anytime during a chain of redirects.

So I think you agree we should do this and implement a mechanism to wait on the body in the getNetworkBody command?

Member

Yes, I agree.
