clarify that contentSchema holds a subschema and when/how it applies #1564

gregsdennis · 2024-11-21T21:30:22Z

What kind of change does this PR introduce?

clarification

Issue & Discussion References

Closes #1381 ("contentSchema has implementation-defined referencing behavior when contentMediaType is not present")
Related to #1288 ("Clarify the handling of "contentSchema"")

Summary

Updates the text for contentSchema to indicate that its value is indeed a subschema (and therefore should be treated as such when scanning for identifiers). Also cleans up language around its dependency on contentMediaType.

I didn't include any explicit text about it containing identifiers. Instead, I declared that the value is a subschema and removed the "SHOULD ignore" text discussed in the issue.

Also pertinent to the issue discussion are the final couple of sentences (already present), which I ~~broke out into a new paragraph~~ rewrote to make it more apparent that they are a note of guidance rather than a requirement.

> Accessing the schema through the schema location IRI included as part of the
> annotation will ensure that it is correctly processed as a subschema. Using the
> extracted annotation value directly is only safe if the subschema is an embedded
> resource with both $schema and an absolute IRI $id.
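
To illustrate the condition in the last sentence, here is a minimal Python sketch that checks whether an extracted annotation value carries both `$schema` and an absolute-IRI `$id`. The function name `is_self_contained` and the example IRIs are hypothetical, and "absolute IRI" is only approximated here as "has a scheme and no fragment".

```python
from urllib.parse import urlsplit


def is_self_contained(subschema) -> bool:
    """Rough check: does the subschema declare $schema and an absolute-IRI $id?

    Hypothetical helper for illustration only. Boolean schemas and anything
    that is not an object are conservatively reported as not self-contained.
    """
    if not isinstance(subschema, dict):
        return False
    schema_id = subschema.get("$id")
    if "$schema" not in subschema or not isinstance(schema_id, str):
        return False
    parts = urlsplit(schema_id)
    # Approximation of "absolute IRI": a scheme is present and there is no fragment.
    return bool(parts.scheme) and not parts.fragment


# An embedded resource: it names its own dialect and its own base IRI.
embedded = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://example.com/schemas/payload",
    "type": "object",
}

# Relies on the parent for its dialect and for resolving the relative $id.
relative = {"$id": "payload", "type": "object"}

print(is_self_contained(embedded))
print(is_self_contained(relative))
```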

Does this PR introduce a breaking change?

No.

karenetheridge · 2024-11-21T21:58:50Z

specs/jsonschema-validation.md

+Since `contentMediaType` is required to provide instruction on how to interpret
+string content, the annotation schema produced by this keyword has no meaning if
+`contentMediaType` is not present.


I think I would prefer that no annotation is produced at all if contentMediaType is missing -- in order to discourage structuring schemas in this way.

Does that then also mean that identifiers are not to be processed in that case?

> Does that then also mean that identifiers are not to be processed in that case?

IMO, it should always be treated as a normal schema location and therefore always respect identifiers. But, I agree that it shouldn't produce an annotation if it isn't valid.

Are we happy with it saying that an annotation SHOULD NOT be produced (etc.)?

I think it should be a "MUST".
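
The position reached in this thread (identifiers are always processed, but no annotation is produced without `contentMediaType`) could be sketched roughly as follows. This is not how any particular implementation works. `collect_ids` and `content_schema_annotation` are hypothetical names, and the naive walk visits every nested object rather than only real subschema locations.

```python
def collect_ids(node, found=None):
    """Collect every string-valued $id found while walking the schema.

    Simplification: a real implementation would only descend into known
    subschema locations, not into arbitrary values such as `const` or `enum`.
    """
    found = [] if found is None else found
    if isinstance(node, dict):
        if isinstance(node.get("$id"), str):
            found.append(node["$id"])
        for value in node.values():
            collect_ids(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_ids(item, found)
    return found


def content_schema_annotation(schema):
    """Return the contentSchema subschema as the annotation value, or None.

    No annotation is produced unless contentMediaType is also present.
    """
    if "contentSchema" in schema and "contentMediaType" in schema:
        return schema["contentSchema"]
    return None


# contentSchema without contentMediaType: still scanned for identifiers,
# but it yields no annotation.
schema = {
    "type": "string",
    "contentSchema": {"$id": "https://example.com/inner", "type": "object"},
}

print(collect_ids(schema))
print(content_schema_annotation(schema))
```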

karenetheridge · 2024-11-21T22:02:35Z

specs/jsonschema-validation.md

+
+Accessing the schema through the schema location IRI included as part of the
+annotation will ensure that it is correctly processed as a subschema. Using the
+extracted annotation value directly is only safe if the subschema is an embedded


"safe" and "correctly processed as a subschema" are vague -- can we say something else? I think this is trying to say that the evaluation behaviour won't be reproducible because the schema is evaluated in isolation, rather than in the context of the surrounding dialect and location identifier (from the containing schema's $schema and $id keywords). So how about instead saying something like:

> Because this subschema is intended to be processed in isolation, outside of the context of its containing schema, usage of both the $schema and $id keywords is recommended to ensure predictable and reproducible results.

This text was already present. I just put it in a new paragraph. I did think it was a bit convoluted. Happy to update it.

I do think it could be okay to use the subschema in its original context, so that should still be addressed. This is what the original discussion was about.

jviotti · 2024-11-22T11:26:59Z

specs/jsonschema-validation.md

+
+Accessing the schema through the schema location IRI included as part of the
+annotation will ensure that it is correctly processed as a subschema. Using the
+extracted annotation value directly is only safe if the subschema is an embedded


Maybe we should make this a SHOULD?

I'm not sure what you're suggesting. This is informative, not a requirement.

So from what I'm reading here, there are edge cases in contentSchema if the schema doesn't have $id and $schema. If that's the case, shouldn't we highlight it in more of a "SHOULD" manner, which from what I understand is something you should do rather than a MUST (something you HAVE to do)? Or maybe "RECOMMENDED" is the right one, similar to this case: https://json-schema.org/draft/2020-12/json-schema-validation#section-7.2.2-5?

It's not an edge case. It just means that if you intend to use the contentSchema subschema solely in its own context, as you would if you received it as an annotation, then any relative $refs will only be resolvable if the subschema has both $schema and $id.

This goes back to my comment here where I show a contentSchema subschema attempting to reference a definition in its parent schema. If you extract the subschema (again, because you've received it as an annotation), then that reference fails.

There's not a best practice here. Both approaches have valid use cases, and schema authors are free to do what makes sense for them. This is merely a caution to schema authors to understand the implications of their approach.

It's obvious to me from @karenetheridge's and your comments that this paragraph isn't clear, so I'll reword it.
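
For readers following along, the failure mode described above can be reproduced with a small Python sketch. The schema, the IRIs, and the `resolve_pointer` helper are made up for illustration, and only same-document `#/...` references are handled.

```python
def resolve_pointer(document, ref):
    """Resolve a same-document reference such as '#/$defs/payload'.

    Hypothetical helper: raises KeyError when a segment is missing.
    """
    if not ref.startswith("#"):
        raise ValueError("only same-document references are handled in this sketch")
    node = document
    for token in ref[1:].split("/")[1:]:
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


parent = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://example.com/schemas/message",
    "$defs": {"payload": {"type": "object", "required": ["id"]}},
    "type": "string",
    "contentMediaType": "application/json",
    # The subschema points at a definition that lives in its parent.
    "contentSchema": {"$ref": "#/$defs/payload"},
}

# What a consumer holds if it keeps only the annotation value.
extracted = parent["contentSchema"]

# Evaluated in place, the reference resolves against the parent document.
in_place = resolve_pointer(parent, extracted["$ref"])

# Evaluated in isolation, the extracted value is the whole "document",
# so there is no $defs to find.
try:
    resolve_pointer(extracted, extracted["$ref"])
    resolved_alone = True
except KeyError:
    resolved_alone = False

print(in_place)
print(resolved_alone)
```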

gregsdennis · 2024-11-23T22:50:54Z

@karenetheridge / @jviotti I've rewritten the last paragraph note and added another "editor" footnote that points back to the subject issue linked above. Let me know what you think.

jviotti · 2024-11-25T13:39:33Z

Reads much better now!

jdesrosiers · 2024-11-27T21:02:57Z

specs/jsonschema-validation.md

+Since `contentMediaType` is required to provide instruction on how to interpret
+string content, the annotation schema produced by this keyword has no meaning if
+`contentMediaType` is not present.


> Does that then also mean that identifiers are not to be processed in that case?

IMO, it should always be treated as a normal schema location and therefore always respect identifiers. But, I agree that it shouldn't produce an annotation if it isn't valid.

gregsdennis · 2024-11-28T08:19:05Z

@karenetheridge @jdesrosiers @jviotti I believe I've addressed your concerns here.

jviotti

Looks good to me

gregsdennis · 2025-01-17T11:17:48Z

Closing and reopening to rerun build

Given the 👍 above, I'm assuming approval.

Co-authored-by: Jason Desrosiers <[email protected]>

jdesrosiers

Sorry, it seems this one slipped through the cracks for me a while ago.

jdesrosiers · 2025-01-20T20:46:00Z

specs/jsonschema-validation.md

+Note that evaluating the `contentSchema` subschema in-place (i.e. as part of its
+parent schema) will ensure that it is correctly processed. Independent use of


My initial reaction was that this doesn't make sense. The annotation is just the subschema. It's no longer in-place. It doesn't include the context of where it came from. So, how can it be evaluated in-place? Then it occurred to me that an annotation includes not just its value, but also the schema location it came from, and that location can be used to evaluate the contentSchema in-place. I don't think most readers are going to be knowledgeable enough to make that leap. This could use some clarification.

Does the footnote ([^7]) that follows not provide that clarity?

No, that's not the thing I'm saying needs to be clarified. We say that the annotation is the subschema. We say that the subschema shouldn't be evaluated out of context from where it appeared in the schema and we explain why in footnote 7. What we don't explain is given a subschema without its parent context, how is it even possible to evaluate it in context. The value of the annotation is just the subschema, not the context. We can't evaluate the subschema in context because we don't know the context in which it needs to be evaluated. I hope that makes sense this time.

Of course the solution is that the location of the annotation keyword in the schema is how you know the context, but that's not intuitive. This is the only annotation where the location of the keyword in the schema is useful or necessary to know. Usually, we only care about the value of the annotation. In this case, we need to know the value and the schema location of the annotation. Actually, when used correctly (in context), the value of the annotation is useless and it's the schema location that the user actually uses.

> This is the only annotation where the location of the keyword in the schema is useful or necessary to know. Usually, we only care about the value of the annotation.

This is incorrect. Annotation location has always been useful, especially in cases where you receive annotations from the same keyword in different locations, e.g. from title. The location allows the consumer to decide which (or both/all) it wants to use. This is Core, where annotations are defined.

> We can't evaluate the subschema in context because we don't know the context in which it needs to be evaluated.

How would we not know the context? It's conveyed by the annotation location, which has always been defined to be a part of an annotation.

It's still not clear why you think that the existing text (including the lines following these) is insufficient. It's saying, "Don't just evaluate this annotation value as a schema because it may rely on things that exist externally to it. You probably need to evaluate it where it came from."

It's actually saying all of that, and then the footnote expands on that warning using an example.

> > We can't evaluate the subschema in context because we don't know the context in which it needs to be evaluated.
>
> How would we not know the context? It's conveyed by the annotation location, which has always been defined to be a part of an annotation.

Yes, you're right. I acknowledged that in the next paragraph. I was walking you through my thought process when I first read it and what I believe the vast majority of readers will be thinking when they read this section. If it took me a minute to make that connection, most readers won't make it at all. Yes, the concept is unambiguously documented elsewhere, but most readers won't have every detail of JSON Schema memorized and I think this is a pretty esoteric detail.
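
The connection discussed in this thread (the annotation's schema location is what lets a consumer get back to the subschema in place) might look like this in a rough Python sketch. The annotation shape (`schema_location`, `value`), the `registry` mapping, and the helper names are all assumptions made for illustration. They are not taken from the spec text or from any implementation.

```python
from urllib.parse import unquote, urldefrag


def walk_pointer(document, pointer):
    """Follow a JSON Pointer such as '/contentSchema' into a document."""
    node = document
    for token in pointer.split("/")[1:]:
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


def locate_in_place(annotation, registry):
    """Return (parent document, subschema) for the annotation's schema location."""
    base, fragment = urldefrag(annotation["schema_location"])
    document = registry[base]
    return document, walk_pointer(document, unquote(fragment))


parent = {
    "$id": "https://example.com/schemas/message",
    "$defs": {"payload": {"type": "object"}},
    "type": "string",
    "contentMediaType": "application/json",
    "contentSchema": {"$ref": "#/$defs/payload"},
}

# Hypothetical store of known schema resources, keyed by base IRI.
registry = {parent["$id"]: parent}

# Hypothetical annotation: the value plus the location it came from.
annotation = {
    "schema_location": "https://example.com/schemas/message#/contentSchema",
    "value": parent["contentSchema"],
}

document, subschema = locate_in_place(annotation, registry)

# With the parent document in hand, the subschema's reference can be resolved.
target = walk_pointer(document, subschema["$ref"][1:])
print(target)
```

The point of the sketch is that the consumer never evaluates `annotation["value"]` on its own. It uses the location to find the same subschema inside its parent, where `$defs` is reachable.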

Co-authored-by: Jason Desrosiers <[email protected]>

jdesrosiers

I'm not going to fight with you about this. The text is correct even if I think readers will find it confusing. I'll approve the PR with or without the clarification I'm asking for.

jdesrosiers · 2025-01-28T23:43:30Z

specs/jsonschema-validation.md

+Note that evaluating the `contentSchema` subschema in-place (i.e. as part of its
+parent schema) will ensure that it is correctly processed. Independent use of


> > We can't evaluate the subschema in context because we don't know the context in which it needs to be evaluated.
>
> How would we not know the context? It's conveyed by the annotation location, which has always been defined to be a part of an annotation.

Yes, you're right. I acknowledged that in the next paragraph. I was walking you through my thought process when I first read it and what I believe the vast majority of readers will be thinking when they read this section. If it took me a minute to make that connection, most readers won't make it at all. Yes, the concept is unambiguously documented elsewhere, but most readers won't have every detail of JSON Schema memorized and I think this is a pretty esoteric detail.

gregsdennis requested a review from a team November 21, 2024 21:30

gregsdennis self-assigned this Nov 21, 2024

gregsdennis added the core label Nov 21, 2024

gregsdennis added this to the stable-release milestone Nov 21, 2024

karenetheridge previously requested changes Nov 21, 2024

jviotti reviewed Nov 22, 2024

jdesrosiers reviewed Nov 27, 2024

jviotti approved these changes Nov 28, 2024

gregsdennis closed this Jan 17, 2025

gregsdennis reopened this Jan 17, 2025

gregsdennis requested review from jdesrosiers and karenetheridge January 17, 2025 11:22

gregsdennis and others added 5 commits January 18, 2025 09:44

clarify that contentSchema holds a subschema and when/how it applies

11216df

apply text wrap

8e0466f

update note about processing contentSchema subschema in context

6342ec3

Update specs/jsonschema-validation.md

a90049e

Co-authored-by: Jason Desrosiers <[email protected]>

contentSchema should not produce an annotation

f6276dc

gregsdennis force-pushed the gregsdennis/contentSchema branch from 015414b to f6276dc Compare January 17, 2025 20:45

jdesrosiers requested changes Jan 20, 2025

Update specs/jsonschema-validation.md

87f6201

Co-authored-by: Jason Desrosiers <[email protected]>

jdesrosiers approved these changes Jan 28, 2025

gregsdennis merged commit 0e3ef5f into main Feb 14, 2025
6 checks passed

gregsdennis deleted the gregsdennis/contentSchema branch February 14, 2025 07:19
