Add $self for self-identifying documents #4389

Open
wants to merge 10 commits into
base: v3.2-dev

Member

@handrews handrews commented Feb 28, 2025

See the proposal for background information.

This is a rather minimal approach, as @karenetheridge and I plan to work together on a more thorough revamp of the document parsing / reference resolution sections.

This adds $self as a way for a document to define its own URI for use in reference targets, and as the base URI for relative URI references in the document.

This does not impact the resolution of relative API URLs. [NOTE: I'm not entirely sure about this, but it seems more useful this way, as it allows multiple deployed locations of an OAD to correspond to multiple deployments of the API.]

Tick one of the following options:

  • schema changes are included in this pull request
  • schema changes are needed for this pull request but not done yet
  • no schema changes are needed for this pull request

@handrews handrews added enhancement re-use: ref/id resolution how $ref, operationId, or anything else is resolved labels Feb 28, 2025
@handrews handrews added this to the v3.2.0 milestone Feb 28, 2025
@handrews handrews requested review from a team as code owners February 28, 2025 15:07
Contributor

@lornajane lornajane left a comment

This looks pretty solid, but could we introduce use of $self in one or two examples so people realise it's there?

@ralfhandl
Contributor

Looks good. An example would be nice.

@handrews
Member Author

handrews commented Mar 6, 2025

@lornajane @ralfhandl I thought I did include one but apparently I did that on an earlier attempt on a different branch. I'll port it over in the morning.

Member

@karenetheridge karenetheridge left a comment

A few comments/questions throughout.

src/oas.md Outdated
In practice, this is usually the retrieval URI of the document, which MAY be determined based on either its current actual location or a user-supplied expected location.
The document's base URI MUST also be used to resolve a relative `$self` URI reference.
Member

add "...which is then used to resolve relative URI references in other Objects, as above" ?

An example to use lower down could be: the document has "$self": "/api" in it, but at runtime the base URI is provided as "https://dev.example.com" -- therefore the effective value of $self for this document (the URI to use as the base for all future relative resolutions) is "https://dev.example.com/api".
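
For illustration, the resolution described in this comment can be reproduced with standard RFC 3986 reference resolution. This is a minimal sketch in Python, assuming `urllib.parse.urljoin` as the resolver; the variable names are hypothetical and the two values are the ones quoted above.

```python
from urllib.parse import urljoin

# Values taken from the example in the comment above.
runtime_base = "https://dev.example.com"  # base URI supplied at runtime
self_value = "/api"                        # relative "$self" in the document

# Resolve the relative "$self" against the externally supplied base URI.
effective_self = urljoin(runtime_base, self_value)
print(effective_self)
```

The resulting `effective_self` is what later relative references in the document would be resolved against.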

Contributor

I'm confused about this addition since in the description of the pull request there's a mention that the $self value does NOT affect how relative URLs are resolved. They should still be relative to wherever-you-found-this, as I understood it.

Member

"wherever-you-found-this" is the URI in $self, is it not?

Member Author

@lornajane I think there might be some terminology muddiness here: URIs (meaning OpenAPI Description URIs) and URLs (meaning API URLs) are resolved differently, and have been since OAS v3.1.0.

@karenetheridge no, "wherever-you-found-this" is the retrieval URI per RFC3986 §5.1.3. $self is the URI defined in content per RFC3986 §5.1.1. Since $self is a URI, it may not be a sort of URI that indicates any sort of "where", while the retrieval URI is more-or-less by definition a URL (the "more-or-less" is because there are some URI schemes that do not inherently indicate a location, but have a well-defined process to map to a location that might change over time, e.g. the doi: scheme).

Member

the $self value does NOT affect how relative URLs are resolved. They should still be relative to wherever-you-found-this, as I understood it.

As per the added examples (thank you!):

  • the retrieval URI is the "wherever you found it" URI (or some other URI that is provided to the implementation out-of-band): e.g. "here is an OAS document that you shall load, and I shall call it Squishy"
  • relative $self URIs are also resolved using the retrieval URI
  • relative $refs within the document (OAS sections and JSON Schemas) and relative $ids within schemas within the document are resolved using the closest identifier (which is either an $id in a schema, or the resolved form of the document's $self URI)
  • server urls are resolved using the retrieval URI and NOT the $self URI.
  • when matching HTTP requests against entries in /paths, we find the closest servers url (operation, path-item, global), and prepend the resolved form of it to the path template (e.g. for path-item /paths/foo/{foo_id}, the template is /foo/{foo_id}).
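
The first four bullets above can be sketched as follows. This is an illustrative Python sketch only, assuming `urllib.parse.urljoin` for resolution; the function name `resolve_uris` and all example URIs are hypothetical, and schema `$id` handling and path matching are left out.

```python
from urllib.parse import urljoin

def resolve_uris(retrieval_uri, self_value, ref, server_url):
    """Return (effective $self, resolved $ref, resolved server URL)."""
    # A relative $self is resolved against the retrieval (or other external) URI.
    effective_self = urljoin(retrieval_uri, self_value) if self_value else retrieval_uri
    # A $ref outside any schema $id uses the effective $self as its base.
    resolved_ref = urljoin(effective_self, ref)
    # Server URLs ignore $self and use the retrieval URI as their base.
    resolved_server = urljoin(retrieval_uri, server_url)
    return effective_self, resolved_ref, resolved_server

# Hypothetical inputs: a document retrieved from a device, with a relative $self.
effective_self, resolved_ref, resolved_server = resolve_uris(
    "https://device1.example.com/desc/openapi.yaml",
    "/descriptions/device",
    "shared#/components/schemas/Id",
    "test",
)
```

With these inputs the `$ref` lands under `/descriptions/` (the `$self` base) while the server URL lands under `/desc/` (the retrieval base), which is the distinction the bullets draw.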

Member Author

@karenetheridge If you substitute "externally determined URI per RFC3986 §5.1.2–5.1.4" in place of "retrieval URI" then this is correct. It just happens that the retrieval URI (RFC3986 §5.1.3) is by far the most likely of these, and the one worth discussing. But you'll notice I mention the other possibilities as well.

For example, if you are sent an application/openapi+yaml document as one part in a multipart/related archive per RFC2557, then the relevant URI would not be the retrieval URI (which probably doesn't meaningfully exist) but would be the Content-Location header of the part containing the application/openapi+yaml document, per RFC3986 §5.1.2 (base URI from encapsulating entity).

But going through the details of all of this is not desirable because a.) it's covered by RFC3986 which we normatively cite, and b.) most people will never care.

@handrews
Member Author

@lornajane @ralfhandl @karenetheridge I ended up reworking this a bit as I added examples.

Contributor

@ralfhandl ralfhandl left a comment

+1, after correcting the resolved URLs in the example.

ralfhandl
ralfhandl previously approved these changes Mar 21, 2025

Member

@karenetheridge karenetheridge left a comment

I think that's all my comments for this pass.

They will read a bit confusingly though because the order in which they appear in the document is nearly the reverse of that in which I wrote them -- as I had to get down to your examples before I really understood what you're going for, and then I had to go back and drop in some questions higher up where the concepts are first mentioned! (sorry!)

Actually never mind, it looks like github has changed how it does things since I last had a similar complaint -- and the comments are added here in chronological order, not in the order in which they appear in the file! yay github.

src/oas.md Outdated
@@ -342,6 +416,7 @@ This is the root object of the [OpenAPI Description](#openapi-description).
| Field Name | Type | Description |
| ---- | :----: | ---- |
| <a name="oas-version"></a>openapi | `string` | **REQUIRED**. This string MUST be the [version number](#versions) of the OpenAPI Specification that the OpenAPI Document uses. The `openapi` field SHOULD be used by tooling to interpret the OpenAPI Document. This is _not_ related to the API [`info.version`](#info-version) string. |
| <a name="oas-self"></a>$self | `URI-reference` (without a fragment) | Sets the URI of this document, which also serves as its base URI in accordance with [RFC 3986 §5.1.1](https://www.rfc-editor.org/rfc/rfc3986#section-5.1.1); the value MUST NOT be the empty string and MUST NOT contain a fragment (even if the fragment is empty). Implementations MUST support referencing a document by the resolved URI defined by this field. |
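
The two value constraints in this row can be checked mechanically. A minimal sketch in Python, assuming a hypothetical helper named `check_self`; it covers only the empty-string and fragment rules, not full URI-reference syntax validation.

```python
def check_self(value):
    """Return a list of problems found in a "$self" value (empty list if none)."""
    problems = []
    if value == "":
        problems.append("$self MUST NOT be the empty string")
    # Test for "#" directly so that an empty fragment ("...#") is also caught.
    if "#" in value:
        problems.append("$self MUST NOT contain a fragment, even an empty one")
    return problems
```

Checking for the `#` character rather than a parsed fragment is a deliberate choice here, since some URI parsers drop an empty fragment silently.
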
Member

I'm not sure of the exact wording, but can we somehow make it clear that "referencing" here means "via a $ref keyword", not "Locatable through an HTTP request at this location"?

Member

Addressing this will probably want to change a few other locations in the document too (probably in the URI-vs-URL section, and when we expect to find something on the network vs just access something in memory that is known to have a particular URI), and that can be done in a review pass before 3.2.
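
To make the "in memory, known by a particular URI" reading concrete, here is a hedged Python sketch. The class `DocumentRegistry` and its methods are hypothetical and not taken from any existing tool; it indexes each loaded document by its resolved `$self` and resolves a `$ref` to a document without any network access.

```python
from urllib.parse import urljoin, urldefrag

class DocumentRegistry:
    """Hypothetical in-memory index of documents keyed by resolved $self."""

    def __init__(self):
        self._docs = {}

    def add(self, document, retrieval_uri):
        # Resolve $self against the retrieval URI; fall back to the retrieval URI.
        self_value = document.get("$self")
        uri = urljoin(retrieval_uri, self_value) if self_value else retrieval_uri
        self._docs[uri] = document
        return uri

    def lookup(self, ref, base):
        # Resolve the reference, drop its fragment, and look the document up.
        target, _fragment = urldefrag(urljoin(base, ref))
        return self._docs.get(target)

registry = DocumentRegistry()
shared = {"$self": "https://example.com/shared", "components": {}}
# The copy was loaded from a mirror, but it is identified by its $self.
shared_uri = registry.add(shared, "https://cdn.example.net/copies/shared.yaml")
found = registry.lookup("shared#/components/schemas/Id", "https://example.com/api")
```

This sketch indexes by `$self` only; an implementation could also choose to index by the retrieval URI, which is a design decision outside what this row requires.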

Member Author

@karenetheridge I believe my latest commit addressed this here in this fixed field row.

Member

which commit is that? the latest commits all seem to be stylistic changes only.

Member Author

@karenetheridge this commit, on this line. It does not specifically talk about individual fields because it applies to all API Description URIs. But the section to which it links talks about which fields are API Description URIs.

@ralfhandl ralfhandl requested a review from a team March 28, 2025 08:51
karenetheridge
karenetheridge previously approved these changes Apr 1, 2025
Member

@karenetheridge karenetheridge left a comment

(modulo Ralf's formatting edits) I am happy to sign off on this!

ralfhandl
ralfhandl previously approved these changes Apr 3, 2025
@lornajane
Contributor

General consensus from the technical meeting today is that this pull request is close, if anyone can help with final reviews, we'd appreciate it!

src/oas.md Outdated
In practice, this is usually the retrieval URI of the document, which MAY be determined based on either its current actual location or a user-supplied expected location.
Relative URI references are resolved using the appropriate base URI, which MUST be determined in accordance with [[RFC3986]] [Section 5.1.1 – 5.1.4](https://tools.ietf.org/html/rfc3986#section-5.1.1).
RFC3986 Section 5.1.1 requires determining the base URI from within a resource's contents, which for the OAS means the `$self` field of the [OpenAPI Object](#openapi-object) for an [OpenAPI Document](#openapi-document), or the `$id` JSON Schema keyword in [Schema Objects](#schema-object).
Within an OpenAPI Document, a Schema Object that does not have its base URI set by `$id` uses `$self` the same as any other Object, treating the OpenAPI Document as the "encapsulating entity" in accordance with RFC3986 Section 5.1.2.
Contributor

This line reads to me as to suggest that the schema uses a $self keyword.
Also the qualifier "that does not have its base URI set by $id" seems not quite right - it still uses the OAD's $self as base URI, e.g. to resolve its $id if that is relative. Unless that's meant to refer to a parent schema's $id?

Suggested change
Within an OpenAPI Document, a Schema Object that does not have its base URI set by `$id` uses `$self` the same as any other Object, treating the OpenAPI Document as the "encapsulating entity" in accordance with RFC3986 Section 5.1.2.
Within an OpenAPI Document, a Schema Object has a base URI that comes from the OAD's `$self`, initially, treating the OpenAPI Document as the "encapsulating entity" in accordance with RFC3986 Section 5.1.2. Schema resources change this base URI with their `$id` for their subschemas.

Best I came up with but probably could be better ... not sure "initially" is the right word, I mean "as you descend".
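
The "as you descend" behaviour can be illustrated with a small tree walk. This is a simplified Python sketch, assuming `urllib.parse.urljoin` for resolution; the function `collect_ref_targets` and the example URIs are hypothetical, and the walk naively treats every nested object as a schema.

```python
from urllib.parse import urljoin

def collect_ref_targets(node, base, out):
    """Walk a schema, tracking the base URI and resolving each $ref."""
    if isinstance(node, dict):
        if isinstance(node.get("$id"), str):
            # $id is itself resolved against the inherited base, then replaces it.
            base = urljoin(base, node["$id"])
        if isinstance(node.get("$ref"), str):
            out.append(urljoin(base, node["$ref"]))
        for value in node.values():
            collect_ref_targets(value, base, out)
    elif isinstance(node, list):
        for item in node:
            collect_ref_targets(item, base, out)
    return out

effective_self = "https://example.com/api/openapi"  # hypothetical resolved $self
schema = {
    "properties": {
        "a": {"$ref": "shared#/$defs/a"},
        "b": {"$id": "schemas/b", "properties": {"c": {"$ref": "c"}}},
    }
}
targets = collect_ref_targets(schema, effective_self, [])
```

Here the reference under `a` is resolved against the document's `$self`, while the reference under `b` is resolved against the base established by the `$id` of `b`.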

Member Author

@notEthan I see your point here, thanks for the suggestion. Let me think a bit more on this wording.

Member Author

@notEthan I tried some new wording here (in this commit)

openapi: 3.2.0
$self: /openapi
info:
  version: 1.0
Contributor

info requires a title (here and in several other examples)

Member Author

UGH. Why is the Info Object like it is?

* Implementations that do not support direct retrieval, which requires additional dependencies and security considerations
* Network configurations or conditions that prevent direct retrieval
* Test configurations that need to simulate the document being hosted in a production location
* Documents that exist only in-memory and have no readily identifiable location (although for a single document without references to other documents, an application-specific default base URI in accordance with RFC3986 Section 5.1.4 would also be suitable)
Contributor

I think this goes somewhat into the weeds of circumstances when an implementation might need to be told the retrieval URI. But it seems to me that if there is an intended retrieval URI, well, the implementation always needs to be told that. If the implementation is itself retrieving the document from that URI, it must be given that URI for actual retrieval. If the implementation isn't doing the retrieval or is retrieving from a different URI, whatever the reason, it must be given the nominal retrieval URI to use for resolution. Network configuration, test environment, etc are relevant considerations for retrieval, but regardless how those are set up, the base URI has to be given.

Member Author

@notEthan this is there because I have a decade of experience of people not understanding the need for such things. It took us numerous revisions of JSON Schema to get the idea across (even though the actual point never changed). This part is staying.

Contributor

I don't mean to suggest it be removed, but that the listed cases kind of eclipse what seems to me the broader, more general point that an implementation always needs to know a base URI if $self/other URIs need to be resolved.
I'm good if this is considered fine as is, not trying to push further for a change, just to clarify my meaning.

Member Author

@notEthan if people want to understand base URIs better, they can read RFC3986. They've got a nice ASCII art precedence diagram :-)

Member Author

@notEthan perhaps more seriously, so much of the OAS needs a full rewrite for flow and clarity. It's extremely hard to do little changes here and there and get the big-picture flow you want, but we don't have the time/budget to do that in 3.x (we discussed it for data modeling in a recent TDC call and opted against it). This sort of thing should be presented more clearly in Moonwalk, but I just don't have the time to do the level of rewriting that would be needed to truly unravel this.

Member Author

@notEthan I started trying to rework things for better flow, and (as with the data modeling) it would be a couple of weeks of work to do it properly. So I think the best thing to do here is make sure that this wording is clear enough, and we can keep working at it in the future.

description: The test API on this device
```

For API URLs, the `$self` field, which identifies the OpenAPI Document, is ignored, and the retrieval URL is used instead. This produces a normalized production URL of `https://device1.example.com`, and a normalized test URL of `https://device1.example.com/test`.
Contributor

This doesn't make much sense to me.

The OAD can have a retrieval URI, and it can have a $self URI. The base URI for objects in the OAD is $self, if present. But servers doesn't use this base URI, it uses the base URI from outside the OAD. Is the OAD's base URI expected to be "closer" to the API's base URL than the $self URI?

I don't have experience with the deployment problems that $self solves, maybe if I did this would make more sense, I suspect. But the reasoning for using one base URI over the other in servers vs elsewhere is not at all apparent to me.

Member Author

API description URIs and API endpoint (or other) URLs are just different things. $self makes sense for API description URIs. I honestly don't know what makes the most sense for API endpoint URLs, but I do know that tying them to $self is problematic because an API description has a single most-relevant URI ($self if it is present, then RFC3986 §5.1.2-5.1.4 in order if not), while many API instances, each deployed at a different location, can exist while being described by the same OAD.

So it does not make sense to combine something that has a singular identity with something that can have arbitrarily many instances. This PR does not change anything about how servers are handled, so if you want to see something different with servers, that should be a separate discussion.
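
A small sketch of the "one identity, many instances" point, in Python. All URIs are hypothetical, and `urllib.parse.urljoin` stands in for RFC 3986 resolution: the same document is retrieved from two devices, its absolute `$self` yields a single identity, and its relative server URL yields one API URL per deployment.

```python
from urllib.parse import urljoin

SELF = "https://apis.example.com/device-mgmt"  # hypothetical absolute $self
SERVER_URL = "/test"                            # hypothetical relative server url

# The same document, retrieved from two different devices.
retrieval_uris = [
    "https://device1.example.com/openapi.yaml",
    "https://device2.example.com/openapi.yaml",
]

# An absolute $self is unaffected by where the document was retrieved.
identities = {urljoin(uri, SELF) for uri in retrieval_uris}
# Server URLs follow the retrieval URI, so each deployment gets its own.
servers = [urljoin(uri, SERVER_URL) for uri in retrieval_uris]
```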

Member Author

@notEthan I do agree that this is confusing, and I'm open to ways to explain it better. But fundamentally there are two different things here.

Or we could staple them together and say that the Server Object's url uses $self and the rest of RFC3986, but that would technically, in the most obscure way possible, be a breaking change, as the Server Object has never mentioned RFC3986 for its base URI determination, unlike API Description URIs, which state that they rely on RFC3986 in OAS 3.1.0.

The obscure way it could "break" is that if somehow someone was using a document that, per RFC3986 §5.1.2 would get its base URI from an encapsulating entity, but was using the retrieval URI (RFC3986 §5.1.3) with the Server Object because it explicitly says "document location", then shifting the Server Object over to RFC3986 (where $self is based on §5.1.1) would be a breaking change. I have no idea how that would even be set up since usually having an encapsulating entity means no direct retrieval URI, but in theory it's possible.

Contributor

I certainly agree/understand that the URIs of the API description and objects within are quite different than (or unrelated to) API server/endpoint URLs. My hesitation is on the idea that the retrieval URI for the OAD is likely to be more closely related to the server URL than $self (I've generally seen the OAD served from one place accompanying documentation, not with each deployment, and aren't server variables the better mechanism for deployment-specific URLs?).

Or rather, my hesitation as an implementer of tooling is on whether that difference is worth implementing - I have one base URI for objects in the OAD (including servers), which will be changing to inherit from $self, and no place to put another base URI. It's not difficult to implement but it adds some complexity in a place I have doubts on the real utility of it.

This PR does not change anything about how servers are handled, so if you want to see something different with servers, that should be a separate discussion.

Well, it changes the base URI of everything except servers, I think discussing that exception belongs here.

Having said that / been heard on the above, I put my skepticism aside and accept this is worth having servers be an exception and use the document retrieval URI. In the case of each deployment serving the OAD, although I haven't encountered that, it is more correct.

Member Author

I've generally seen the OAD served from one place accompanying documentation, not with each deployment, and aren't server variables the better mechanism for deployment-specific URLs?

There are other use cases, such as APIs for device management that are deployed on each device, often with the OAD because the larger internet might not be accessible from the network containing the device.

I find that a lot of people think in terms of consumer-facing APIs and not enterprise APIs that are more likely to be deployed in complex network configurations. But complex network configurations are real use cases, too.

Member Author

@notEthan BTW I would be perfectly happy to continue the discussion of how to handle server URLs for a potential follow-on change. I just opted to do the least amount of change there, in part because it's not an area in which I feel exceptionally confident. I do want to support enterprise multi-deployment use cases, but I don't want to do that at the expense of tons of commentary.

Anyway, I just wanted to encourage further discussion- I know I pushed back on several of your comments here a bit hard, but that's more about me being distracted by other Life Stuff™ than anything else, and I do apologize for not having the bandwidth to reply with more consideration.


Relative `$self` values are often used for APIs deployed in multiple locations, such as a device management API that is hosted on each device.

In the next example, the retrieval URI is irrelevant, because `$self` is already a full URI:
Contributor

/full URI/absolute URI/ ?

Member Author

@notEthan it is an absolute URI which is a subset of full URIs, but the important part is that it is full and therefore does not need resolving. The absolute-ness is incidental in this particular case, and I wanted to emphasize the part that is most relevant.

Contributor

I thought I was suggesting a more precise term, but it sounds like 'full URI' is a precise term, but one I am not familiar with. I am looking but not finding quite what a full URI is or how it relates to absolute, could you point me to that information?

Member Author

The precise term is "URI", which is not very useful as most people think "URI" means "URI-reference." There is no more precise term than that.

Member Author

And in particular, the OAS uses "URI" to mean "URI-reference" so we're left without proper terminology.
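
For readers following the terminology, RFC 3986 defines `URI-reference` as either a `URI` or a `relative-ref`, and `absolute-URI` as a URI with a scheme and no fragment. A rough Python sketch of that distinction, assuming a hypothetical helper `classify` built on `urllib.parse.urlsplit`; it is a coarse classifier, not a grammar validator.

```python
from urllib.parse import urlsplit

def classify(reference):
    """Coarsely classify a URI-reference using RFC 3986 terms."""
    if not urlsplit(reference).scheme:
        # No scheme: needs a base URI before it identifies anything.
        return "relative reference"
    if "#" in reference:
        # Has a scheme and a fragment: a URI, but not an absolute-URI.
        return "URI with fragment"
    return "absolute-URI"
```

Under this classification, a `$self` value that needs no resolving is always an `absolute-URI`, because `$self` is not allowed to carry a fragment.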

Member Author

@notEthan I have kept thinking about this. I've long been frustrated at the lack of anything more precise than "URI" for what we want. It plagued us with the JSON Schema spec as well, and that's where "full URI" came from. Although at least with JSON Schema we used "URI-reference" where appropriate. OAS further overloading "URI" makes this harder.

idk, maybe absolute URI would be better as it is also accurate and not overloaded? Meh.

This adds `$self` as a way for a document to define its own URI
for use in reference targets, and as the base URI for relative URI
references in the document.

This does not impact the resolution of relative API URLs.
ralfhandl
ralfhandl previously approved these changes Apr 4, 2025
handrews and others added 9 commits April 4, 2025 08:03
Co-authored-by: Ralf Handl <[email protected]>
This ties the `$self` behavior more directly to the sections
on API Description URI usage, including examples, and also
expands on the use cases for manually providing a retrieval URI.
Also tweak the wording a bit as a result, and ran format-markdown.
Co-authored-by: Ralf Handl <[email protected]>
@handrews
Member Author

handrews commented Apr 4, 2025

The latest force-push is just a rebase that resolves conflicts with recent merged PRs.

Successfully merging this pull request may close these issues.

OAS3 Tag/template to define external-components URL
5 participants