Cross-Origin Read Blocking (CORB) #681
cc @cdumez @youennf @travisleithead @evilpie @ckerschb @whatwg/security (please tell me or whatwg/meta if you want to be added to this team; it's basically for notification purposes of issues that need security input) |
Having read through https://chromium.googlesource.com/chromium/src/+/master/services/network/cross_origin_read_blocking_explainer.md in more detail I wonder why it doesn't call out |
Maybe @anforowicz or Nick Carter can chime in more, but in terms of persisting no-cors to disk, Chrome's implementation is still able to write the response to disk without giving it to the renderer process that made the request. (I think we're using DetachableResourceHandler for that, FWIW.) That may be worth mentioning in the explainer, since I think it matters for preload and ServiceWorkers as well. Were there other cases you were concerned about? |
@csreis storing the response from a (I'm not entirely sure where we should put the canonical description of the class of attacks, standards-wise. Either here or in HTML I suppose.) |
#144 (comment) - see "Attack 4". It looks like CORB will handle this attack for particular mime types, but I think it still makes sense to apply the extra blocking I proposed, since it'll cover all mime types. Let me know if that's wrong. |
I am a bit concerned about the "Mislabeled image (nosniff)" case. Do you have any data on how common text/html is for images, with nosniff? At least for JavaScript this number was quite high, and even higher in the HTTP Archive report. This number however doesn't take into account no-sniff. Do you have that data or maybe should we ask the HTTP Archive people again? |
@evilpie, we have some data in the "Quantifying CORB impact on existing websites" section of the explainer. After excluding responses that had an explicit "Content-Length: 0" response header, we see that 0.115% of all CORB-eligible responses might have been observably blocked due to a nosniff header or range request. The real question here is: how many of these 0.115% contained images (and were undesirably disrupted by CORB) vs. non-images (and were non-decodable with and without CORB). At this point we only have anecdotal data: we were only able to repro one such case in the wild, and it turned out to be a tracking pixel that returned an HTML doc as a response. |
I'm currently looking to enable range requests to pass through a service worker safely, and later I'll specify how various web APIs should make range requests and validate responses. Although CORB is involved in the same area, the goals are different, but we should be aware of overlap 😄. Here's a summary of the similarities and differences, as I understand them:
CORB's goal is to prevent bringing data into the content process, whereas I'm aiming to prevent exposing data to script.
CORB is best-effort, with compatibility in mind, whereas I need to strictly avoid exposing opaque data to script.
CORB will filter opaque partial responses if they match particular content types. This prevents an audio/video element being used to bring data that's potentially sensitive into the content process.
#560 prevents Attack 4, where a CORB recommends against multipart range requests. Currently range requests aren't specced from that API's point of view, but I'm trying to define it. I don't plan to use multiple ranges in a single response, and once specced, browsers shouldn't make kinds of range requests that aren't explicitly allowed.
I intend to make media elements reject responses that would result in a mix of opaque and visible data being treated as the same media resource. This prevents Attack 1.
I intend to make media elements reject responses that would result in opaque data from multiple URLs being treated as the same media resource. This prevents Attack 2.
I intend to make range supporting APIs fail if the partial response starts at an offset other than the requested range. This prevents Attack 3.
I intend to make downloads fail/restart if content identifying headers change between requests. Such as total length in |
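The Attack 3 defense above (failing when a partial response starts at an offset other than the one requested) can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the spec text or any browser's code: the helper names `parse_content_range` and `accepts_partial_response` are hypothetical, and only the single-range `bytes first-last/total` form of `Content-Range` is handled.

```python
import re
from typing import Optional, Tuple

# Single-range form only: "bytes <first>-<last>/<total or *>".
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def parse_content_range(value: str) -> Optional[Tuple[int, int, Optional[int]]]:
    """Return (first, last, total) or None if the header is malformed."""
    match = _CONTENT_RANGE.match(value.strip())
    if match is None:
        return None
    first, last = int(match.group(1)), int(match.group(2))
    total = None if match.group(3) == "*" else int(match.group(3))
    # Reject inverted ranges and ranges that run past the stated total.
    if last < first or (total is not None and last >= total):
        return None
    return first, last, total


def accepts_partial_response(requested_start: int,
                             content_range: Optional[str]) -> bool:
    """Hypothetical check for a 206 response: accept it only if it
    starts exactly at the offset that was requested."""
    if content_range is None:
        return False
    parsed = parse_content_range(content_range)
    if parsed is None:
        return False
    return parsed[0] == requested_start
```

A caller would run this check on every 206 response before handing bytes to the consuming API, and treat a mismatch as a network error.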
Why does CORB blocking filter the response? Wouldn't it be more robust to replace the response with a generic empty response? Although they're less sensitive, CORS safelisted headers and status codes also leak data. |
I don't think the cache API is part of this. If responses are filtered/blocked as part of fetch, then only correctly filtered/blocked responses will go into the cache. |
Doesn't this depend on how Edit: Oh, you mean at the spec level. Nevermind. |
Regarding range requests: Thanks for covering the overlap here, @jakearchibald! I agree that CORB overlaps with the attack 4 defense, but only for certain content types, so your original plans still seem relevant.
Correct.
Correct.
Correct.
Just to clarify, are you saying that multipart range requests wouldn't generate responses with multipart/byteranges content types after your changes? In general or just for service worker? We would love it if the content type reflected what was in the response, since we weren't eager about parsing the multipart response in the browser process to determine what was in it. For now, we're just recommending against supporting it for sensitive data.
Sounds good to me (and orthogonal to CORB). |
Regarding service worker and cache API:
I don't think that's right. CORB shouldn't depend on what the destination of the request is, nor what service worker is going to do with the response. The intention is to not expose the data of the opaque response to the service worker in the renderer process at all. My understanding from discussions with @mattto, @anforowicz, and @nick-chromium was that service worker could still handle opaque responses without exposing that data to the renderer process, though it's worth clarifying the details on things like cache.add.
It might be possible for you to do the check in the renderer process, but that defeats some of the benefit of CORB. Hopefully we can make that unnecessary.
Again, I think the intention is to keep them out of process entirely, rather than for as long as possible. Otherwise an attacker could use the cache API to pull whatever they want into their process, correct? Maybe we can talk with @anforowicz, @nick-chromium, and @mattto about how we're handling it in Chrome.
To be clear, I think we're ok with having blocked responses end up in the cache in general, as long as they're not in the renderer process. (Sorry if I'm misunderstanding here.) |
I'm saying that the browser shouldn't ever ask for multiple ranges in a single request. Is there anywhere we do this today? Or am I misunderstanding what you mean by multipart request?
I think it's easiest if the body is replaced with an empty body long before it ends up in the cache API. I'm happy with these empty-body responses ending up in the cache. |
I met with @jakearchibald and @mattto this week to discuss the cache API and we agreed that CORB won't disrupt it, since the cache API is origin-specific. It's important to note that CORB doesn't take anything about the request into account, which means that if CORB blocks a response for a given origin, then it would be blocked no matter how that origin asked for it (even when retrieving it later from the cache API). Thus, it's fine for a ServiceWorker (or a page) to put an empty value for an opaque response into the cache API, since that response will always be opaque for that origin. (This is different from preload and the network cache, where we do want CORB-blocked responses to end up on disk, so that they're fast after navigating to a cross-origin page. That doesn't require sending the data to the renderer process, though.)
Given the above, hopefully the renderer process check is not necessary? |
Quick status update:
|
I thought that I'd also share a link to the middle of an I/O '18 session where CORB was discussed: https://youtu.be/yIaYQGPuZbM?t=2614 |
CORB is an additional filter for responses of cross-origin "no-cors" fetches. It aims to provide defense-in-depth protection for JSON, HTML, XML (though not image/svg+xml), and (sometimes) text/plain resources against cross-process CPU exploits. It also makes it harder to use incorrectly labeled resources as scripts, images, fonts, etc. Discussion and further work is tracked by #681 and #721. Tests are in web-platform-tests's fetch/corb directory.
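As a rough illustration of the type categories named above (JSON, HTML, XML except `image/svg+xml`, and sometimes `text/plain`), the classification step might look like the sketch below. The function name `corb_protected_category` and the exact MIME lists are assumptions made for this example; the normative lists live in the spec and the explainer, and the additional sniffing that decides the `text/plain` case is not modeled here.

```python
from typing import Optional


def corb_protected_category(content_type: str) -> Optional[str]:
    """Map a Content-Type header value to a CORB-protected category,
    or None if the type is not one CORB tries to protect.
    The MIME lists below are illustrative assumptions."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    essence = content_type.split(";", 1)[0].strip().lower()

    # SVG is XML-based but is a legitimate cross-origin image type.
    if essence == "image/svg+xml":
        return None
    if essence == "text/html":
        return "html"
    if essence in ("application/json", "text/json") or essence.endswith("+json"):
        return "json"
    if essence in ("application/xml", "text/xml") or essence.endswith("+xml"):
        return "xml"
    if essence == "text/plain":
        # Only "sometimes" protected: a real implementation would
        # sniff the body before deciding.
        return "plain"
    return None
```

Returning a category rather than a boolean leaves room for a later confirmation step that inspects the body before blocking.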
@csreis @anforowicz what's the timeline for defining the remainder of CORB? Other browsers would like to implement it as well so it'd help if it was fully defined. |
@annevk, I think that the only part of CORB that still requires an official description is the sniffing algorithm that CORB uses to say with high confidence that the response really contains an HTML / XML / JSON document. This sniffing differs slightly from the sniffing algorithms in the mimesniff spec, because of the need to avoid accidentally sniffing JavaScript (allowed in cross-origin responses) as HTML (blocked by CORB in cross-origin responses). Q: Is the description of the sniffing algorithm the main/only blocker for implementing CORB in other browsers? AFAIR, I've tried to argue that differences in sniffing implementations would not be (*) observable by web contents (assuming that the sniffing classifies a response as HTML/XML/JSON only if the response really is HTML/XML/JSON and not one of the cross-origin-allowed types like JavaScript or CSS). This led me to further argue that sniffing shouldn't be described in a normative part of a spec (but possibly still described in a non-normative spec section or in another document). So I think describing Chromium's CORB sniffing algorithm in the CORB explainer might be a good first step here. WDYT? Q: Where should the sniffing algorithm's description go (in the short term and in the long term)? (*) OTOH, maybe the presence of the wpt/fetch/corb/script-html-js-polyglot.sub.html test is a counter-example here: incorrect sniffing can lead to observable/incorrect behavior that this test is supposed to catch. |
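To make the idea of confirmation sniffing concrete, here is a deliberately simplified sketch. It is not Chromium's algorithm and not the mimesniff algorithm; the tag list, the comment-skipping rule, and the function names `looks_like_html` and `looks_like_json` are all assumptions chosen for illustration. The one property it tries to show is the one discussed above: leading HTML-style comments are skipped rather than treated as evidence of HTML, because JavaScript can start with them too.

```python
import re

# Illustrative, non-exhaustive list of prefixes treated as HTML evidence.
_HTML_STARTS = ("<!doctype html", "<html", "<head", "<body", "<script", "<iframe")

# A JSON object start: "{", then a quoted key, then ":".
_JSON_OBJECT_START = re.compile(r'\s*\{\s*"(?:[^"\\]|\\.)*"\s*:')


def _skip_html_comments(text: str) -> str:
    """Strip leading whitespace and <!-- ... --> comments."""
    text = text.lstrip()
    while text.startswith("<!--"):
        end = text.find("-->")
        if end == -1:
            return ""  # unterminated comment: no evidence either way
        text = text[end + 3:].lstrip()
    return text


def looks_like_html(body: str) -> bool:
    """True only if what follows any leading comments starts with a known tag."""
    rest = _skip_html_comments(body).lower()
    return rest.startswith(_HTML_STARTS)


def looks_like_json(body: str) -> bool:
    """True if the body begins like a JSON object with a quoted key."""
    return _JSON_OBJECT_START.match(body) is not None
```

In this sketch a response would be blocked only when both its declared type and the sniffed result agree that it is a protected document, so a script mislabeled as `text/html` would still pass.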
My understanding from Firefox is that a complete description of CORB would help, for implementation, for analysis, and for looking at potential further expansion. Having a non-normative description first would be a good first step. I wonder if https://mimesniff.spec.whatwg.org/ might be a good long term place. I see some potential for sharing there. E.g., if we detect a PDF, ZIP, or RAR resource we could also deny access straight away. And unless we expect CPU architecture to fix Spectre within the next five-ten years, I think we need a normative definition as well, as it defines the effective security boundary and it's good to be as clear and accurate about that as possible. |
I just ran into an issue with CORB using the Fetch API. I understand the security implications of blocking third-party extensions/JavaScript from reading sensitive MIME types when coming in to the client's browser. My questions are:
Would appreciate any help in shining light on these |
I am not sure if I understand the scenario above (e.g. I don't understand what is meant by "shift the request server side" and "[have the server] send the response back to the client via the same origin"). Do you mean a request initiated from https://bar.com to https://foo.com/secret.json, with the foo.com server redirecting to https://bar.com/secret.json? At any rate, CORB is a client-side security feature and it can't protect against information disclosure problems on the server side.
CORB only blocks responses that cannot possibly be included in <img> or <script> or similar legacy tags, so CORB would not block JavaScript (unless it is served with a wrong Content-Type). CORB would block JSON and/or HTML, but these only make sense in responses to fetch/XHR (and so would also be blocked by CORS). If ad scripts depend on cross-origin responses then either the responses are allowed by CORS (and so are not blocked by CORB) or the ad scripts are already broken (because the responses are blocked by CORS even before CORB looks at them). |
To the first point, if a browser running on https://foo.com is making a cross-origin request to https://bar.com, (and So I guess ad scripts/tags have set the Access-Control-Allow-Origin header on their side and therefore are not blocked by CORB or CORS? |
I don't know what "foo.com proxies that request to https://bar.com" means. If the foo.com server trusts bar.com then it can share its data with bar.com (via ftp / http-or-rest / phone calls / etc.). However, because of CORB the browser won't share foo.com's data with bar.com.
If an ad script wants to read cross-origin data from foo.com, then foo.com (not the ad script) has to agree to giving the data to the ad (by sending back appropriate CORS headers in the http response). |
Sorry if I'm not explaining it correctly. Let's say you load My question is, what is the best practice to show the html returned from Or is there another best practice to achieve this? |
One way to embed a secure form from https://secure.com in a document from https://visa.com is by using iframes. |
As @anforowicz mentions, CORS and CORB apply to a document's subresource requests, but not to iframes. In your example, visa.com can load https://secure.com's text/html response in an iframe without being blocked by either CORS or CORB. It cannot use fetch or XHR to get https://secure.com's text/html response without an Access-Control-Allow-Origin header. Also, if foo.com tried to request the URL via an img or script tag, CORB would filter the response (though it wouldn't have been usable in those contexts anyway). As for the proxying question, foo.com could indeed proxy data from bar.com, but this isn't a security risk to the user because the request to bar.com won't have the user's cookies or other credentials if it's being made from foo.com's server. There's no need to use this proxying for iframes. (Most ads load in iframes, giving them access to whatever data they need from their own origin.) Hope that clarifies things. |
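The fetch/XHR part of this answer rests on the CORS check, which can be sketched roughly as follows. This is a simplified rendering of the check as I understand it from the Fetch Standard, not the normative algorithm; the function name `cors_check` and its parameters are assumptions, and preflights, redirects, and header parsing details are left out.

```python
from typing import Optional


def cors_check(request_origin: str,
               allow_origin: Optional[str],
               with_credentials: bool = False,
               allow_credentials: Optional[str] = None) -> bool:
    """Return True if script on request_origin may read the response.

    allow_origin is the Access-Control-Allow-Origin value (None if absent);
    allow_credentials is the Access-Control-Allow-Credentials value.
    """
    if allow_origin is None:
        return False  # no opt-in from the server at all
    if not with_credentials and allow_origin == "*":
        return True   # wildcard is fine for credential-less requests
    if allow_origin != request_origin:
        return False  # opt-in names a different origin
    if not with_credentials:
        return True
    # Credentialed requests also need an explicit "true" opt-in.
    return allow_credentials == "true"
```

With the example origins from this thread, `cors_check("https://visa.com", None)` fails, which is why visa.com cannot read secure.com's HTML via fetch unless secure.com opts in, while loading the same URL in an iframe never reaches this check.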
https://annevankesteren.nl/2015/02/same-origin-policy might help here. In particular, note that secure.com might only be available on the user's local network, so you couldn't proxy the request. |
I'm just an end-user but I wanted to inform you that your system has blocked a very simple link on a small-town news site, a link to police reports that would help end a debate that is fanning a fire surrounding a protest at a local police station. It may have nothing to do with you and it may seem like one case has nothing to do with the other, but in small towns like ours, a shooting and an elderly woman in jail ...well all roads (and all fingers) point back to the police station. https://www.koamnewsnow.com/additional-reports-detail-investigation-into-well-being-of-man-months-before-he-was-found-in-dead/ |
@mingcatsandra the issue you're experiencing is not related to CORB. And it also isn't a link, it's a simple image, which is nonexistent. Use the website's contact information to inform the owners if you want them to address an issue. |
Actually it's a non-existent image NOW. Before the previous update of Google Chrome it was a very long, VERY detailed series of police reports that took me and my fiance an hour and a half to get through. My fiance is an ex-law enforcement officer for that city so he was able to help me understand the lingo. Sorry it took so long to get back to you but I was very thoroughly hacked directly after making my first comment here and have just regained access to my account. |
I suggest we close this when #1441 lands. |
Historically, browsers had rather lax Content-Type checking. We’ve been able to introduce stricter checks in some cases (e.g. blocking mislabeled scripts and stylesheets in presence of the nosniff header [1]) and unfortunately failed in some other cases (e.g. Firefox’s attempt to block mislabeled images in presence of the nosniff header [2, 3]).
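For readers unfamiliar with the existing nosniff check for scripts and stylesheets referenced in [1], a rough sketch of its shape follows. It is a simplification written from my reading of the Fetch Standard, not the normative text: the function name `blocked_due_to_nosniff` is hypothetical, the JavaScript MIME list is only a subset, and the header is compared as a single value rather than parsed as a list.

```python
from typing import Mapping, Optional

# Destinations treated as script-like (per my reading of the Fetch Standard).
SCRIPT_LIKE = {"audioworklet", "paintworklet", "script",
               "serviceworker", "sharedworker", "worker"}

# Subset of JavaScript MIME type essences, for illustration only.
JS_ESSENCES = {"application/javascript", "text/javascript",
               "application/ecmascript", "text/ecmascript",
               "application/x-javascript", "text/x-javascript"}


def _essence(content_type: Optional[str]) -> Optional[str]:
    """Return the lowercased type/subtype without parameters, or None."""
    if not content_type:
        return None
    essence = content_type.split(";", 1)[0].strip().lower()
    return essence or None


def blocked_due_to_nosniff(destination: str,
                           headers: Mapping[str, str]) -> bool:
    """True if a nosniff response must not be used for this destination."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if lowered.get("x-content-type-options", "").strip().lower() != "nosniff":
        return False
    mime = _essence(lowered.get("content-type"))
    if destination in SCRIPT_LIKE:
        return mime not in JS_ESSENCES
    if destination == "style":
        return mime != "text/css"
    # Other destinations (e.g. images) are not covered by this check.
    return False
```

Note that an `image` destination is never blocked here, which matches the history described above: the stricter image check did not ship.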
Given Spectre, lax handling of mislabeled cross-origin responses carries new, significant security risks. We've developed a proposal, which we're calling Cross-Origin Read Blocking (CORB), which increases the strictness of cross-origin fetching semantics while trying to still stay web-compatible. CORB reduces the risk of leaking sensitive data by keeping it further from cross-origin web pages. In most browsers, it keeps such data out of untrusted script execution contexts. In browsers with Site Isolation, it can keep such data out of untrusted renderer processes entirely, helping even against speculative side channel attacks.
We're looking to collaborate with everyone on an interoperable set of changes to the web platform, so that blocking of cross-origin responses can be done consistently across all the browsers. Please take a look at the proposal and its compatibility impact in the CORB explainer and provide feedback in this thread on the algorithm itself, as well as on the next steps for trying to encode CORB into the relevant specs for web standards.
We believe that CORB has a reasonably low risk of breaking existing websites (see the “CORB and web compatibility” section in the explainer). We’ve spent a considerable amount of time trying to tweak CORB to minimize compatibility risk (e.g. introducing confirmation sniffing and skipping sniffing for HTML comments since JS can have them too) and are continuing to consider additional tweaks to minimize the risk further (e.g. we are trying to gather data that might inform how to handle text/plain and range requests). The remaining risk is mostly for nosniff responses labeled with a wrong MIME type - as pointed out above, stricter handling of such responses has always been desirable, but the Spectre threat makes this more urgent.
[1] https://fetch.spec.whatwg.org/#should-response-to-request-be-blocked-due-to-nosniff?
[2] #395
[3] https://bugzilla.mozilla.org/show_bug.cgi?id=1302539