Identifiers #629

Closed
annevk opened this issue May 24, 2018 · 17 comments

@annevk

annevk commented May 24, 2018

@mikewest suggested using structured headers for Cross-Origin-Resource-Policy in whatwg/fetch#733 which makes sense, but requiring origins to be embedded in strings seems rather awkward and doesn't have precedent with the existing headers.

@reschke
Contributor

reschke commented May 24, 2018

In fact, several existing header fields use angle brackets as delimiters for URI (references) (Link, Dav, If). SH tries to get away with one delimiter syntax, so quoted-string it is.

@mikewest
Member

Given how common URLs and origins seem to be as primitive components of headers, perhaps splitting them out into a type distinct from "string" might make sense. The initial serialisation might even be the same, but perhaps an eventual binary format could do more interesting tricks given knowledge of the format.

@annevk
Author

annevk commented May 29, 2018

An alternative way of stating the OP would be to ask why identifiers are as restricted as they are in the proposal.

(I'm not entirely convinced either origin or URL is common enough to warrant its own type, FWIW.)

@reschke
Contributor

reschke commented May 29, 2018

origin is a URI (reference), no?

It's quite common that certain delimiters are not allowed in identifiers. I assume we're discussing characters such as ":" and "/"?

@annevk
Author

annevk commented May 29, 2018

If you parse an origin as a URL and then serialize it, it would not be equal to the original input (it would have a trailing slash). And an opaque origin serializes as null which can't parse as a URL.

As for what's under discussion: "/" is already allowed. So ":" and ".", mainly, I think, but I'd have to double check if that's all.

@reschke
Contributor

reschke commented May 29, 2018

> If you parse an origin as a URL and then serialize it

Depends on what you parse and serialize it with. FWIW, how does that matter in this context?

> And an opaque origin serializes as null which can't parse as a URL.

It parses as URI reference.

> As for what's under discussion: "/" is already allowed. So ":" and ".", mainly, I think, but I'd have to double check if that's all.

Hmm. https://greenbytes.de/tech/webdav/draft-ietf-httpbis-header-structure-04.html#identifier:

> Identifiers are short (up to 256 characters) textual identifiers; their abstract model is identical to their expression in the textual HTTP serialisation.
>
> identifier = lcalpha *255( lcalpha / DIGIT / "_" / "-" / "*" / "/" )
> lcalpha    = %x61-7A ; a-z
>
> Note that identifiers can only contain lowercase letters.

So I don't see ":" and "." here.
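
To make the restriction concrete, here is a minimal Python sketch that transcribes the draft-04 ABNF quoted above into a regular expression. The names `IDENTIFIER_RE` and `is_identifier` are made up for illustration and are not part of any draft or library.

```python
import re

# Transcription of the draft-04 ABNF quoted above:
#   identifier = lcalpha *255( lcalpha / DIGIT / "_" / "-" / "*" / "/" )
# IDENTIFIER_RE and is_identifier are hypothetical names for this sketch.
IDENTIFIER_RE = re.compile(r"[a-z][a-z0-9_\-*/]{0,255}")


def is_identifier(value: str) -> bool:
    """Return True if value matches the draft-04 identifier grammar."""
    return IDENTIFIER_RE.fullmatch(value) is not None


for candidate in ["same-origin", "same-site", "https://example.com", "null"]:
    print(candidate, is_identifier(candidate))
```

Under this grammar `same-origin`, `same-site` and `null` are accepted, while `https://example.com` is rejected because of ":" and ".".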

@annevk
Author

annevk commented May 29, 2018

Right, that's the issue.

@reschke
Contributor

reschke commented May 29, 2018

Well, you'd need more characters to be allowed so that an origin always is an identifier, no? (for instance: ",")

@annevk
Author

annevk commented May 29, 2018

Maybe, see whatwg/url#390. I agree that if we need to continue to allow "," in hosts we'd better stick to strings.

Even then I'm still interested in hearing the rationale behind the characters allowed in identifiers.

@mnot
Member

mnot commented May 31, 2018

Identifiers aren't intended for payload; note that in the latest draft, they're only used for parameters and dictionaries. Everything else that's string-y needs to be quoted strings.

If we added a URL type, it would almost certainly be delimited with angle brackets, since DQUOTE is already taken, and it needs to be both signalled and delimited in some fashion.

It sounds like the underlying requirement here is "it has to look exactly like previous headers", which is a difficult bar to meet. What's the actual problem with using a string?

@annevk
Author

annevk commented May 31, 2018

I guess that answers one question I had for Cross-Origin-Resource-Policy. Namely that using same-origin / same-site as byte-for-byte compared values would not be forward compatible with structured headers...

Which is also the problem. Using identifiers requires a very simple parser. And when we use origins we don't want to parse them as origins, but we want to compare them byte-for-byte. So all we'd need is split on "," and trim HTTP whitespace. With strings it's a little more involved and they would be harder to adopt now without all the structured header infrastructure in place. (And unfortunately as these things go everyone wants to ship stuff now.)
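
For illustration, the "split on `,` and trim HTTP whitespace" approach could look roughly like the Python sketch below. It assumes the Fetch Standard's definition of HTTP whitespace (LF, CR, tab, space); `split_header_values` and `header_contains` are hypothetical helper names, not anything specified for Cross-Origin-Resource-Policy.

```python
from typing import List

# HTTP whitespace as defined by the Fetch Standard: LF, CR, tab, space.
HTTP_WHITESPACE = "\n\r\t "


def split_header_values(header_value: str) -> List[str]:
    """Split on "," and trim HTTP whitespace from each piece."""
    return [piece.strip(HTTP_WHITESPACE) for piece in header_value.split(",")]


def header_contains(header_value: str, expected: str) -> bool:
    """Exact string comparison; no origin or URL parsing is involved."""
    return expected in split_header_values(header_value)
```

Because the comparison is exact, `https://example.com/` would not match `https://example.com`. This simple approach also only works as long as "," cannot appear inside a value.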

@mnot
Member

mnot commented May 31, 2018

Shipping code that doesn't implement SH but specifying it as SH seems pretty bad, both for your spec and for SH interop.

Re-adding identifiers (or something like them) to the possible types in SH is certainly possible, though.

@annevk
Author

annevk commented May 31, 2018

That's why I said forward compatible. I'm not specifying it as SH as that's still somewhat in flux, but ideally if we extend Cross-Origin-Resource-Policy to accept more than two values going forward, we can adopt SH for it.

And I was also thinking that if identifiers were broad enough, we could use them for a variety of CORS headers too, as pretty much all have a very basic syntax where they just take one or more identifier-likes.

@mnot
Member

mnot commented Jun 5, 2018

OK. Happy to re-add them if there's desire (although I might distinguish between those we use for dicts and params and those that are allowed in item; will see) -- just ask.

@annevk
Author

annevk commented Jun 5, 2018

I think if we could make all of https://fetch.spec.whatwg.org/#http-new-header-syntax structured headers that'd be great.

@mnot changed the title from "Identifiers that can express all origins" to "Identifiers" on Jun 28, 2018
@mnot
Member

mnot commented Jun 28, 2018

@annevk we took identifiers out of payload because of #505. Would be very keen to hear your thoughts there (or here).

@annevk
Author

annevk commented Jul 21, 2018

The way I solved that in https://mimesniff.spec.whatwg.org/#serializing-a-mime-type is by serializing as identifier if there are no problematic code points and as a string otherwise. The parser would always produce a string, too.

I think that's what I'd recommend doing, provided there's agreement to uplift some existing (and recently defined) headers to this scheme.
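
A rough Python sketch of that serialize/parse pairing follows. It assumes that "no problematic code points" means "matches the draft-04 identifier grammar" and that quoted strings escape `"` and `\` with a backslash; both are assumptions made for illustration, and `serialize` and `parse` are hypothetical names rather than anything defined by mimesniff or the SH draft.

```python
import re

# Assumption: a value is "unproblematic" if it matches the draft-04
# identifier grammar. The real rule would be whatever the spec settles on.
IDENTIFIER_RE = re.compile(r"[a-z][a-z0-9_\-*/]{0,255}")


def serialize(value: str) -> str:
    """Emit a bare identifier when possible, otherwise a quoted string."""
    if IDENTIFIER_RE.fullmatch(value):
        return value
    # Escape backslashes first so the escapes added for quotes stay intact.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def parse(text: str) -> str:
    """Always return a plain string, whichever form was on the wire."""
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        # Drop the surrounding quotes and undo the backslash escapes.
        return re.sub(r"\\(.)", r"\1", text[1:-1])
    return text
```

With this sketch `same-origin` stays bare on the wire, `https://example.com` is emitted as a quoted string, and `parse` hands the consumer the same plain string in both cases.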

@mnot closed this as completed in 35c3e25 on Oct 8, 2018
mikewest added a commit to w3c/webappsec-fetch-metadata that referenced this issue on Oct 16, 2018