-
Notifications
You must be signed in to change notification settings - Fork 212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restore pagination information #496
Conversation
Notably "breaking" changes here are that the registry may impose a limit on the number of tags it returns. This was restored from b4e9833, where it applied to Catalog, which was removed, but given that many registries have `adopted this behavior already, we should document it correctly. This also restores language that clients SHOULD always use the Link header when present. This was restored from c90b0f1. Signed-off-by: Jon Johnson <[email protected]>
Does this text now allow a registry to return more than |
I don't think so. The relevant text is:
This would only allow for <= n. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a first pass, it may still need some tweaking later.
The big issue is with the deletion of:
The response to such a request MAY return fewer than
<int>
results, but only when the total number of tags attached to the repository is less than<int>
.
Otherwise, the response MUST include<int>
results.
and addition of:
Paginated results can be retrieved by adding an
n
parameter to the request URL, declaring that the response SHOULD be limited ton
results.
This reads as if registries are allowed to return more than n
results.
I think we should specify that registries can return at most n
results. If a registry returns fewer than n
results, either it MUST be the end of the list, or there MUST be a pagination Link
header. That would be the least breaking of the changes.
I think it's also worth calling out that a registry that returns a Link
header on some tag listing requests MUST do so for all tag listing requests that do not return the end of the list. Otherwise it is not safe for clients to assume:
If the header is not present, the client can assume that all results have been received.
The entries in the response start _after_ the term specified by `last`, up to `n` entries. | ||
|
||
The client can then issue the request with the above value from the `Link` header, receiving the values _v3_ and _v4_. | ||
Note that `n` may change on the second to last response or be fully omitted, depending on the server implementation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why "second to last response"? Why does omitting the parameter by the client depend on the server implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the second to last response, the n parameter could be adjusted by the server to match the remaining number of tags to be delivered in the last response. I would, however, say that this is more difficult for a registry to implement than just using the name of the last tag in the current result set.
``` | ||
GET /v2/<name>/tags/list?n=<n from the request>&last=<last tag value from previous response> | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This GET request isn't using the Link
header as the above paragraph suggests. It might be easier to include this as an option for starting the request in the middle of the list, around line 658 and then follow with two sections, one for the Link header (clients should prefer this if servers implement it), or manually generating requests with the last
parameter (for clients with 1.0 server compatibility).
@@ -639,6 +644,57 @@ If the [referrers API](#listing-referrers) returns a 404, the client MUST fallba | |||
The response SHOULD be an image index with the same content that would be expected from the referrers API. | |||
If the response to the [referrers API](#listing-referrers) is a 404, and the tag schema does not return a valid image index, the client SHOULD assume there are no referrers to the manifest. | |||
|
|||
##### Pagination | |||
|
|||
Paginated results can be retrieved by adding an `n` parameter to the request URL, declaring that the response SHOULD be limited to `n` results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pagination can also occur without the n
parameter if servers limit the number of entries in a response.
Also, guidelines on what the expected behavior is if the tag list changes in between, due to new images being pushed or deleted for that matter? |
If there are indeed more results, the URL for the next block is encoded in an [RFC5988](https://tools.ietf.org/html/rfc5988) `Link` header, as a "next" relation. | ||
If the header is not present, the client can assume that all results have been received. | ||
|
||
> __NOTE:__ In the request template above, note that the brackets are required. For example, if the url is `http://example.com/v2/hello-world/tags/list?n=20&last=b`, the value of the header would be `<http://example.com/v2/hello-world/tags/list?n=20&last=b>; rel="next"`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe include also an example with a relative reference, such as the server returning just </v2/hello-world/tags/list?n=20&last=b>; rel="next"
. Which I would expect to be more common, as the reply may be routed a few times until getting to the client and the ip/hostname/port the server is aware of may be different from the one the client should make the request to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I like this change. For a public registry service, the wording of the current spec is essentially impossible to implement without opening an easy to exploit DoS vector:
The response to such a request MAY return fewer than results, but only when the total number of tags attached to the repository is less than . Otherwise, the response MUST include results.
So the proposed changes are alleviating that. The one thing I am missing here is details around how a server can/should tell the client that it requested too many tags and in which circumstances.
In line 533 it says:
The registry implementation MAY impose a maximum limit and return a partial set with pagination links.
See Pagination for information on how to handle pagination links.
But there is no detail provided in that section about the return code. I would suggest to include the verbiage from @sudo-bmitch's earlier proposal: https://github.com/opencontainers/distribution-spec/pull/470/files#diff-bc6661da34ecae62fbe724bb93fd69b91a7f81143f2683a81163231de7e3b545R548
Allow to return a HTTP 400 response and make it so that clients rely on a reasonable minimum amount of tags to never trigger this response, like 20.
The entries in the response start _after_ the term specified by `last`, up to `n` entries. | ||
|
||
The client can then issue the request with the above value from the `Link` header, receiving the values _v3_ and _v4_. | ||
Note that `n` may change on the second to last response or be fully omitted, depending on the server implementation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the second to last response, the n parameter could be adjusted by the server to match the remaining number of tags to be delivered in the last response. I would, however, say that this is more difficult for a registry to implement than just using the name of the last tag in the current result set.
I don't think any of this is ambiguous and we should just merge it. If someone else wants to take a stab at it, I'll approve anything. |
@jonjohnsonjr Was this PR accidentally closed, or did you have something else in mind? |
I'm interpreting between the lines a little bit (so massive apologies to Jon if I'm inaccurate here), but I think it was probably closed because this isn't actually a change to the spec, but restoring content that was in the spec and got removed during refactoring (see #443; so he probably didn't want to spend too many cycles discussing/nit picking the previous behavior/design choices that are already implemented in many registries, and documenting that interop is really why this spec exists 😅). |
Got it. Has this revert happened yet? Because the spec in its current form is currently really problematic with regards to pagination: https://github.com/opencontainers/distribution-spec/blob/main/spec.md?plain=1#L532-L533
This essentially opens up a DoS vector if the client can expect to put arbitrarily high values in |
Can we re-address this with a fresh PR and then afterwards cut a 1.1.1? |
Notably "breaking" changes here are that the registry may impose a limit on the number of tags it returns.
This was restored from b4e9833, where it applied to Catalog, which was removed, but given that many registries have `adopted this behavior already, we should document it correctly.
This also restores language that clients SHOULD always use the Link header when present.
This was restored from c90b0f1.