clarify that contentSchema holds a subschema and when/how it applies #1564
specs/jsonschema-validation.md
Outdated
Since `contentMediaType` is required to provide instruction on how to interpret
string content, the annotation schema produced by this keyword has no meaning if
`contentMediaType` is not present.
I think I would prefer that no annotation is produced at all if contentMediaType
is missing -- in order to discourage structuring schemas in this way.
Does that then also mean that identifiers are not to be processed in that case?
> Does that then also mean that identifiers are not to be processed in that case?
IMO, it should always be treated as a normal schema location and therefore always respect identifiers. But, I agree that it shouldn't produce an annotation if it isn't valid.
Are we happy with it saying that an annotation SHOULD not be produced (etc.)?
I think it should be a "MUST".
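A minimal sketch of the behaviour discussed in this thread, assuming a toy annotation collector rather than any real implementation. The helper names `content_annotations` and `collect_ids` are hypothetical. The point is that the `contentSchema` annotation is withheld when `contentMediaType` is absent, while identifiers inside the subschema are still discovered.

```python
def content_annotations(schema: dict) -> dict:
    """Collect content-related annotations from one schema object (hypothetical helper)."""
    annotations = {}
    for keyword in ("contentEncoding", "contentMediaType"):
        if keyword in schema:
            annotations[keyword] = schema[keyword]
    # Rule proposed in the thread: contentSchema yields an annotation
    # only when contentMediaType is also present.
    if "contentMediaType" in schema and "contentSchema" in schema:
        annotations["contentSchema"] = schema["contentSchema"]
    return annotations


def collect_ids(node, found=None) -> list:
    """Naively gather every string "$id" in a schema document.

    Simplification: a real scanner would only descend into known
    subschema locations, not into arbitrary values.
    """
    found = [] if found is None else found
    if isinstance(node, dict):
        if isinstance(node.get("$id"), str):
            found.append(node["$id"])
        for value in node.values():
            collect_ids(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_ids(item, found)
    return found


# contentMediaType is missing, so no contentSchema annotation is produced,
# but the subschema's $id is still registered.
schema = {
    "type": "string",
    "contentSchema": {"$id": "https://example.com/inner", "type": "object"},
}
print(content_annotations(schema))
print(collect_ids(schema))
```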
specs/jsonschema-validation.md
Outdated
Accessing the schema through the schema location IRI included as part of the
annotation will ensure that it is correctly processed as a subschema. Using the
extracted annotation value directly is only safe if the subschema is an embedded
"safe" and "correctly processed as a subschema" are vague; can we say something else? I think this is trying to say that the evaluation behaviour won't be reproducible because the schema is evaluated in isolation, rather than in the context of the surrounding dialect and location identifier (from the containing schema's `$schema` and `$id` keywords). So how about instead saying something like:
Because this subschema is intended to be processed in isolation, outside of the context of its containing schema, usage of both the `$schema` and `$id` keywords is recommended to ensure predictable and reproducible results.
This text was already present. I just put it in a new paragraph. I did think it was a bit convoluted. Happy to update it.
I do think it could be okay to use the subschema in its original context, so that should still be addressed. This is what the original discussion was about.
specs/jsonschema-validation.md
Outdated
Accessing the schema through the schema location IRI included as part of the
annotation will ensure that it is correctly processed as a subschema. Using the
extracted annotation value directly is only safe if the subschema is an embedded
Maybe we should make this a SHOULD?
I'm not sure what you're suggesting. This is informative, not a requirement.
So from what I'm reading here, there are edge cases in `contentSchema` if the schema doesn't have `$id` and `$schema`. If that's the case, shouldn't we highlight it in more of a "SHOULD" manner, which, as I understand it, is something you should do rather than a MUST (something you HAVE to do)? Or maybe "RECOMMENDED" is the right one, similar to this case: https://json-schema.org/draft/2020-12/json-schema-validation#section-7.2.2-5?
It's not an edge case. It just means that if you intend to use the `contentSchema` subschema solely in its own context, as you would if you received it as an annotation, then any relative `$ref`s will only be resolvable if the subschema has both `$schema` and `$id`.
This goes back to my comment here where I show a `contentSchema` subschema attempting to reference a definition in its parent schema. If you extract the subschema (again, because you've received it as an annotation), then that reference fails.
There's not a best practice here. Both approaches have valid use cases, and schema authors are free to do what makes sense for them. This is merely a caution to schema authors to understand the implications of their approach.
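The failure mode described above can be illustrated with a rough sketch. It assumes a hand-rolled JSON Pointer resolver (`resolve_pointer` is a hypothetical helper, not part of any library) and an invented parent schema. The same `$ref` resolves against the parent document but fails once the subschema is extracted and used on its own.

```python
def resolve_pointer(document, ref: str):
    """Resolve a same-document reference such as '#/$defs/name'.

    Simplified: handles only '#' and '#/...' fragments and does not
    percent-decode the fragment.
    """
    if ref == "#":
        return document
    if not ref.startswith("#/"):
        raise ValueError(f"unsupported reference: {ref}")
    node = document
    for token in ref[2:].split("/"):
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


# Hypothetical parent schema whose contentSchema points back at the parent's $defs.
parent = {
    "$defs": {"name": {"type": "string"}},
    "type": "string",
    "contentMediaType": "application/json",
    "contentSchema": {"$ref": "#/$defs/name"},
}

# In place: the reference resolves against the parent document.
in_place = resolve_pointer(parent, parent["contentSchema"]["$ref"])

# Extracted (as when only the annotation value is kept): "$defs" is gone.
extracted = parent["contentSchema"]
try:
    resolve_pointer(extracted, extracted["$ref"])
    extracted_ok = True
except KeyError:
    extracted_ok = False

print(in_place, extracted_ok)
```

Giving the subschema its own `$schema` and `$id`, and keeping its references self-contained, avoids this problem when it is used independently.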
It's obvious to me from @karenetheridge's and your comments that this paragraph isn't clear, so I'll reword it.
@karenetheridge / @jviotti I've rewritten the last paragraph note and added another "editor" footnote that points back to the subject issue linked above. Let me know what you think.
Reads much better now!
specs/jsonschema-validation.md
Outdated
Since `contentMediaType` is required to provide instruction on how to interpret
string content, the annotation schema produced by this keyword has no meaning if
`contentMediaType` is not present.
> Does that then also mean that identifiers are not to be processed in that case?
IMO, it should always be treated as a normal schema location and therefore always respect identifiers. But, I agree that it shouldn't produce an annotation if it isn't valid.
@karenetheridge @jdesrosiers @jviotti I believe I've addressed your concerns here.
Looks good to me
Closing and reopening to rerun build
Given the 👍 above, I'm assuming approval.
Co-authored-by: Jason Desrosiers <[email protected]>
015414b to f6276dc
Sorry, it seems this one slipped through the cracks for me a while ago.
Note that evaluating the `contentSchema` subschema in-place (i.e. as part of its
parent schema) will ensure that it is correctly processed. Independent use of
My initial reaction was that this doesn't make sense. The annotation is just the subschema. It's no longer in-place. It doesn't include the context of where it came from. So, how can it be evaluated in-place? Then it occurred to me that an annotation includes not just its value, but also the schema location it came from, and that location can be used to evaluate the `contentSchema` in-place. I don't think most readers are going to be knowledgeable enough to make that leap. This could use some clarification.
Does the footnote (`[^7]`) that follows not provide that clarity?
No, that's not the thing I'm saying needs to be clarified. We say that the annotation is the subschema. We say that the subschema shouldn't be evaluated out of context from where it appeared in the schema and we explain why in footnote 7. What we don't explain is given a subschema without its parent context, how is it even possible to evaluate it in context. The value of the annotation is just the subschema, not the context. We can't evaluate the subschema in context because we don't know the context in which it needs to be evaluated. I hope that makes sense this time.
Of course the solution is that the location of the annotation keyword in the schema is how you know the context, but that's not intuitive. This is the only annotation where the location of the keyword in the schema is useful or necessary to know. Usually, we only care about the value of the annotation. In this case, we need to know the value and the schema location of the annotation. Actually, when used correctly (in context), the value of the annotation is useless and it's the schema location that the user actually uses.
> This is the only annotation where the location of the keyword in the schema is useful or necessary to know. Usually, we only care about the value of the annotation.
This is incorrect. Annotation location has always been useful, especially in cases where you receive annotations from the same keyword in different locations, e.g. from `title`. The location allows the consumer to decide which (or both/all) it wants to use. This is Core, where annotations are defined.
> We can't evaluate the subschema in context because we don't know the context in which it needs to be evaluated.
How would we not know the context? It's conveyed by the annotation location, which has always been defined to be a part of an annotation.
It's still not clear why you think that the existing text (including the lines following these) is insufficient. It's saying, "Don't just evaluate this annotation value as a schema because it may rely on things that exist externally to it. You probably need to evaluate it where it came from."
It's actually saying all of that, and then the footnote expands on that warning using an example.
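To make the "evaluate it where it came from" point concrete, here is a small sketch under stated assumptions: the annotation shape (`keyword`, `schemaLocation`, `value`), the `registry` mapping, the example IRI, and the `locate` helper are all hypothetical and chosen only for illustration. It uses the annotation's schema location, rather than its value, to find the subschema inside its parent document.

```python
from urllib.parse import urldefrag


def locate(registry: dict, schema_location: str):
    """Return (root document, subschema) for a schema location IRI.

    Simplified: assumes the fragment is a JSON Pointer and that the
    base IRI is a key in the registry.
    """
    base, fragment = urldefrag(schema_location)
    root = registry[base]
    node = root
    tokens = fragment.split("/")[1:] if fragment else []
    for token in tokens:
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return root, node


parent = {
    "$id": "https://example.com/parent",
    "$defs": {"name": {"type": "string"}},
    "contentMediaType": "application/json",
    "contentSchema": {"$ref": "#/$defs/name"},
}
registry = {"https://example.com/parent": parent}

# Hypothetical annotation: the value alone lacks context, the location does not.
annotation = {
    "keyword": "contentSchema",
    "schemaLocation": "https://example.com/parent#/contentSchema",
    "value": parent["contentSchema"],
}

root, subschema = locate(registry, annotation["schemaLocation"])
print(subschema is parent["contentSchema"], "$defs" in root)
```

Because the lookup returns the root document alongside the subschema, references such as `#/$defs/name` can still be resolved against the parent.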
> > We can't evaluate the subschema in context because we don't know the context in which it needs to be evaluated.
> How would we not know the context? It's conveyed by the annotation location, which has always been defined to be a part of an annotation.
Yes, you're right. I acknowledged that in next paragraph. I was walking you through my thought process when I first read it and what I believe the vast majority of readers will be thinking when they read this section. If it took me a minute to make that connection, most readers won't make it at all. Yes, the concept is unambiguously documented elsewhere, but most readers won't have every detail of JSON Schema memorized and I think this is a pretty esoteric detail.
Co-authored-by: Jason Desrosiers <[email protected]>
I'm not going to fight with you about this. The text is correct even if I think readers will find it confusing. I'll approve the PR with or without the clarification I'm asking for.
What kind of change does this PR introduce?
clarification
Issue & Discussion References
Summary
Updates the text for `contentSchema` to indicate that its value is indeed a subschema (and therefore should be treated as such when scanning for identifiers). Also cleans up language around its dependency on `contentMediaType`.
I didn't include any explicit text about it containing identifiers. Instead I declare that the value is a subschema and removed the "SHOULD ignore" text discussed in the issue.
Also pertinent to the issue discussion are the final couple of sentences (already present), which I rewrote to make it more apparent that they are a note of guidance rather than a requirement.
Does this PR introduce a breaking change?
No.