Return a content-encoding header for resource timing and more #1796

Merged: 25 commits merged into whatwg:main on May 28, 2025

@guohuideng2024
Contributor

guohuideng2024 commented Dec 13, 2024

The major change is to add content-encoding to the response header list. This PR also describes how content-encoding is determined (content negotiation).

The purpose is to pass this value to resource timing. Further details are available at
w3c/resource-timing#381.

Note: Per the discussion at the 12/05/2024 WebPerfWG call (https://docs.google.com/document/d/1mpFDrAWuV6IgvJ1KiL9sgIlcboC5uArtF8r_oqS1Sco/edit?tab=t.0#heading=h.af6v74wysf4m), we decided to allow arbitrary "content-encoding" values at "fetch". We only filter the value on the client side, before passing it to resource timing.

Related PR to modify resource timing specification:
w3c/resource-timing#411

(See WHATWG Working Mode: Changes for more details.)

Bug: w3c/resource-timing#381



@annevk
Member

annevk commented Jan 7, 2025

Thanks for taking the time to pick this up. However, it doesn't seem like this addresses all the issues with #1742? I recommend studying the feedback on that PR.

@guohuideng2024
Contributor Author

> Thanks for taking the time to pick this up. However, it doesn't seem like this addresses all the issues with #1742? I recommend studying the feedback on that PR.

Hi Anne! I think I should have put up some background information here.

  1. You mentioned in "Pass in Content-Encoding to resource-timing" (#1742) that the spec must define how the value is determined. This PR tries to do that. The value is the result of content negotiation (which determines the encoding to use), so I tried to add that to the existing text.
    Note that #1742 is modeled on the change that added the earlier contentType field. But contentEncoding is very different: it is not an extracted MIME type but the result of content negotiation. So this PR should be very different from #1742.

  2. We originally thought that the filtering should happen at the "fetch" stage. But in the last web perf meeting Patrick brought up that the returned contentEncoding can be a proprietary value and that service workers need that value. So the browser must keep the unfiltered value, and the filtering should happen right before the value is reported to resourceTiming.

https://docs.google.com/document/d/1mpFDrAWuV6IgvJ1KiL9sgIlcboC5uArtF8r_oqS1Sco/edit?tab=t.0#heading=h.af6v74wysf4m

Therefore, in this fetch doc I didn't mention filtering. I mentioned "filtering" in the resourceTiming spec:
w3c/resource-timing#411
And I am going to add more details about the filtering there.

Does this sound right to you? I am new to fetch and I may have missed a lot of things here. Thanks for your patience and guidance.
Guohui

chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this pull request Jan 16, 2025
This CL introduces a contentEncoding field to Performance resource timing
object. This field is behind a feature flag.

PR to resource timing specification:
w3c/resource-timing#411
PR to fetch specification:
whatwg/fetch#1796

Bug: 327941462
Change-Id: I70cad190fe658fb3dbf8b401ff8393bc1d0782f0
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/6098321
Commit-Queue: Guohui Deng <[email protected]>
Reviewed-by: Noam Rosenthal <[email protected]>
Reviewed-by: Matthew Denton <[email protected]>
Reviewed-by: Yoav Weiss (@Shopify) <[email protected]>
Cr-Commit-Position: refs/heads/main@{#1407331}
aarongable pushed a commit to chromium/chromium that referenced this pull request Jan 16, 2025
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this pull request Jan 21, 2025
…ourceTiming, a=testonly

Automatic update from web-platform-tests
Expose contentEncoding in PerformanceResourceTiming

This CL introduces a contentEncoding field to Performance resource timing
object. This field is behind a feature flag.

PR to resource timing specification:
w3c/resource-timing#411
PR to fetch specification:
whatwg/fetch#1796

Bug: 327941462
Change-Id: I70cad190fe658fb3dbf8b401ff8393bc1d0782f0
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/6098321
Commit-Queue: Guohui Deng <[email protected]>
Reviewed-by: Noam Rosenthal <[email protected]>
Reviewed-by: Matthew Denton <[email protected]>
Reviewed-by: Yoav Weiss (@Shopify) <[email protected]>
Cr-Commit-Position: refs/heads/main@{#1407331}

--

wpt-commits: 1df2c3e47bcb6379ecf3a07735bd967101d02a5b
wpt-pr: 50115
guohuideng2024 and others added 2 commits January 21, 2025 15:54
1) formatting;
2) "gzip, GZIP" is OK because they match case-insensitively;
3) fixed a mistake that said the "contentEncoding" consists of
digits;
4) no longer returns "contentEncoding" for data URLs.
That's on the client side getting the response header.
Just add the content encoding to body info.
@guohuideng2024
Contributor Author

I updated the patch: I added the content encoding to the body info struct, along with the clause that updates it.

@guohuideng2024
Contributor Author

Very sorry for so many mistakes, folks. Thanks for your patience.

@noamr
Contributor

noamr commented Jan 23, 2025

> Very sorry for so many mistakes, folks. Thanks for your patience.

No worries, we've all been there! (Or at least I have...)

i3roly pushed a commit to i3roly/firefox-dynasty that referenced this pull request Jan 24, 2025
@guohuideng2024
Contributor Author

> I think it should be specified here as that matches how we do MIME types and that reduces the chances of someone inadvertently exposing the information. In other words: the guarantee should come from Fetch, not from the caller.

The "raw" contentEncoding value can be an arbitrary proprietary compression scheme that the app uses, and it is leaked as a response header.
So it is indeed a new communication channel :(.
Meanwhile, I think moving the filtering here guarantees that the only place where the raw contentEncoding is leaked is the fetch response header. I would add text here saying that contentEncoding is filtered when accessed anywhere else.
If there are any concerns, please let me know. Thanks.

> Specifically, it needs to be explicitly filtered when assigned to the response body info struct.

Got it, Thanks! I updated the PR accordingly.

@guohuideng2024
Contributor Author

> To be clear, the header is not exposed to the website passively embedding the resource, but this getter is. I don't think I understand your suggestion, could you rephrase?

I think the website can get the arbitrary value like this: (I am new to this area so please correct me if I am wrong)

let myCoding = myHeaders.get("Content-Encoding");  //  |myCoding| can be a proprietary compression, i.e., an arbitrary value.

And the reason for that is some use cases involving service workers. See
w3c/resource-timing#381

However, the contentEncoding field in resourceTiming is filtered: only a few predetermined values are permitted.

@noamr
Contributor

noamr commented Mar 11, 2025

> > To be clear, the header is not exposed to the website passively embedding the resource, but this getter is. I don't think I understand your suggestion, could you rephrase?
>
> I think the website can get the arbitrary value like this: (I am new to this area so please correct me if I am wrong)
>
> let myCoding = myHeaders.get("Content-Encoding");  //  |myCoding| can be a proprietary compression, i.e., an arbitrary value.

You would only get access to myHeaders if this is an actual fetch or via a service worker; those channels are not always available.

> And the reason for that is some use cases involving service workers. See w3c/resource-timing#381
>
> but the contentEncoding field in resourceTiming is filtered, where only a few pre-determined values are permitted.

Yea, so filtering them when assigning to the struct wouldn't change anything observable, but any future user of that struct would get the filtered value.
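
A minimal sketch of "filter when assigning to the struct", assuming a small allowlist and a placeholder for everything else. The names `KNOWN_CODINGS`, `PLACEHOLDER`, `filterContentEncoding`, and `makeBodyInfo` are hypothetical, and the allowlist contents are illustrative; the actual permitted values are whatever the Fetch and Resource Timing specs define. The placeholder uses `@unknown`, the value this thread eventually settles on.

```javascript
// Hypothetical allowlist; the real permitted values come from the specs.
const KNOWN_CODINGS = new Set(["gzip", "deflate", "br", "zstd"]);

// Hypothetical placeholder for any value that is not on the allowlist.
const PLACEHOLDER = "@unknown";

// Map a raw Content-Encoding header value to the value that may be exposed.
// A missing or empty header maps to the empty string. Matching is
// case-insensitive, so "GZIP" and "gzip" are treated alike.
// A comma-separated list such as "gzip, br" is not on the allowlist,
// so in this sketch it also maps to the placeholder.
function filterContentEncoding(rawValue) {
  if (rawValue === null || rawValue === undefined) return "";
  const coding = rawValue.trim().toLowerCase();
  if (coding === "") return "";
  return KNOWN_CODINGS.has(coding) ? coding : PLACEHOLDER;
}

// Build a body-info-like record. Filtering happens once, at assignment,
// so every later reader of the record only ever sees the filtered value.
function makeBodyInfo(rawContentEncoding) {
  return { contentEncoding: filterContentEncoding(rawContentEncoding) };
}
```

With this shape, `makeBodyInfo("x-proprietary").contentEncoding` yields the placeholder, while script that reads the header directly through an actual fetch or a service worker still sees the raw value.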

@guohuideng2024
Contributor Author

Thank you Noam!

@annevk : Would you take one more look? (I also left a response at WebKit/standards-positions#467 )

Member

@annevk annevk left a comment

I noticed you're participating on behalf of Microsoft. That means you cannot sign the contributor's agreement as an individual. Microsoft has already signed up for the Fetch Workstream so you have to join the relevant GitHub organization (MicrosoftWHATWGContributors) and make your membership thereof public.

@annevk
Member

annevk commented Apr 9, 2025

Now with the filtering in place it's probably slightly more reasonable to leave the existing parsing issue unsolved for now, assuming there's adequate test coverage.

@guohuideng2024
Contributor Author

Thanks @annevk! I am working on MicrosoftWHATWGContributors membership right now.

@whatwg whatwg deleted a comment from guohuideng2024 Apr 16, 2025
1. remove internal reference because the external one is merged.
2. replace "unknown" with "_".
@guohuideng2024
Contributor Author

update:

  1. Since I couldn't register a new reserved "unknown" value for "content coding", per the discussion with Roy and Yoav, I am replacing "unknown" with "_". "_" cannot be used as a "content coding" value, so it won't cause a value-space problem.
  2. I removed the internal reference because the reference has been registered into "specref".

1. If codings is failure;
2. Remove duplicated reference.
@guohuideng2024
Contributor Author

In the HTTP spec I found the syntax restriction on "content coding" values, which gives us a complete list of possibilities
(see the end of this comment for more details).

@unknown looks good to me but I would like to confirm we don't want to use ?unknown nor [unknown] instead? @annevk

Hopefully nobody objects and I will update this PR with @unknown this Wednesday.

Cheers,
Guohui

+++++++++ details from http spec +++++++++++++++++++++++++++
http spec actually defines the syntax restriction of "content-coding" parameter.
Link to the http spec: https://www.rfc-editor.org/rfc/rfc2616
The "content-coding" is a "token", and the "token" is defined as the following:

token = 1*<any CHAR except CTLs or separators>
separators = "(" | ")" | "<" | ">" | "@"
           | "," | ";" | ":" | "\" | <">
           | "/" | "[" | "]" | "?" | "="
           | "{" | "}" | SP | HT
// where:
CTL = <any US-ASCII control character
      (octets 0 - 31) and DEL (127)>
SP = <US-ASCII SP, space (32)>
HT = <US-ASCII HT, horizontal-tab (9)>
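
One way to see why candidates such as @unknown, ?unknown, and [unknown] cannot collide with a real content coding is to test them against the token grammar above. The sketch below assumes the RFC 2616 definition quoted in this comment; `SEPARATORS` and `isToken` are illustrative names, not part of any spec or library.

```javascript
// The 19 separators from the grammar above: 17 punctuation characters,
// plus SP (32) and HT (9).
const SEPARATORS = new Set('()<>@,;:\\"/[]?={} \t');

// True if `value` is a token: one or more US-ASCII characters (0 - 127),
// none of which is a control character (0 - 31, 127) or a separator.
function isToken(value) {
  if (value.length === 0) return false; // "1*" requires at least one character
  for (const ch of value) {
    const code = ch.codePointAt(0);
    if (code > 127) return false; // outside US-ASCII, so not a CHAR
    if (code <= 31 || code === 127) return false; // CTL
    if (SEPARATORS.has(ch)) return false;
  }
  return true;
}
```

Under this check, `isToken("gzip")` holds, while `isToken("@unknown")`, `isToken("?unknown")`, and `isToken("[unknown]")` do not, because "@", "?", "[" and "]" are all separators.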

@guohuideng2024
Contributor Author

I updated the PR with @unknown since there were no objections after I asked for them.
The Chromium CL and the WPT tests are also updated accordingly.

Member

@annevk annevk left a comment

The change looks good, but wrapping at 100 columns doesn't seem to be done properly. I could address that, but you're welcome to it as well.

Also, #1796 (review) still needs to be resolved to satisfy the IPR bot.

@guohuideng2024
Contributor Author

guohuideng2024 commented May 22, 2025

> The change looks good, but wrapping at 100 columns doesn't seem to be done properly. I could address that, but you're welcome to it as well.
>
> Also, #1796 (review) still needs to be resolved to satisfy the IPR bot.

[Edit] I fixed the 100 column width violation. :)

I joined the MicrosoftWHATWGContributors organization, but I still fail the agreement check. I think the check fails because I submitted an agreement as an individual in the past, and it looks like I don't have permission to withdraw that individual agreement.

The error says:

@guohuideng2024 has signed up to participate as an individual, but has not yet been verified. 

And my name is still in the "individuals" list:
https://github.com/whatwg/participant-data/blob/main/individuals.json#L5565

Would you know how I can fix the agreement check failure? Thanks.

Column width of lines should be 100 or fewer unless
there is a good reason.
Member

@annevk annevk left a comment

Great, this looks good. When do you expect the tests to be ready? We'll want to land those and this change around the same time.

@guohuideng2024
Contributor Author

> Great, this looks good. When do you expect the tests to be ready? We'll want to land those and this change around the same time.

I am going to get the changes to the tests landed next week. I will update here as soon as it's done.

@guohuideng2024
Contributor Author

guohuideng2024 commented May 27, 2025

@annevk The tests are updated :)

web-platform-tests/wpt#52807

I greatly appreciate the patience and guidance I received here!

@annevk annevk merged commit fd9beaa into whatwg:main May 28, 2025
2 checks passed