First pass at a media types registry. #4517

Open
wants to merge 5 commits into
base: gh-pages
5 changes: 5 additions & 0 deletions _config.yml
Original file line number Diff line number Diff line change
@@ -18,6 +18,11 @@ collections:
    name: Format Registry
    output: true
    permalink: /registry/:collection/:title
  media-type:
    slug: media-type
    name: Media Type Registry
    output: true
    permalink: /registry/:collection/:title
  extension:
    slug: extension
    name: Specification Extension Registry
22 changes: 22 additions & 0 deletions _includes/media-type-entry.md
@@ -0,0 +1,22 @@
# <a href=".">{{ page.collection | replace: 'media-type', 'Media Type' | replace: '-', ' ' }}</a>

## {{ page.description }}

**[Media Type](https://spec.openapis.org/oas/latest.html#media-types):** `{{ page.media_type }}` {% if page.media_type_unregistered %}_unregistered_ {% endif %} ([{{ page.specification.name }}]({{ page.specification.url }}))

**OAS Reference:** [{{ page.reference.section }}](https://spec.openapis.org/oas/latest.html#{{ page.reference.anchor }})

{{ include.summary }}

{% if page.issue %}
### GitHub Issue

* [#{{ page.issue }}](https://github.com/OAI/OpenAPI-Specification/issues/{{ page.issue }})
{% endif %}

{% if page.remarks %}
### Remarks

{{ page.remarks }}
{% endif %}

20 changes: 20 additions & 0 deletions registries/_media-type/application_json_seq.md
@@ -0,0 +1,20 @@
---
owner: handrews
issue:
description: JSON Text Sequences
specification:
  name: RFC7464
  url: https://www.rfc-editor.org/rfc/rfc7464.html
media_type: application/json-seq
reference:
  section: Sequential JSON
  anchor: sequential-json
versions: "3.2+"
layout: default
---

{% capture summary %}
JSON Text Sequences uses the same approach as all sequential JSON media types.
{% endcapture %}

{% include media-type-entry.md summary=summary remarks=remarks %}
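
For context, a minimal sketch of what reading an RFC 7464 stream can look like: each JSON text is preceded by an ASCII Record Separator (0x1E) and followed by a line feed. The helper name `parse_json_seq` and the sample data are hypothetical, and handling of truncated texts is left out.

```python
import json

RS = "\x1e"  # ASCII Record Separator, which starts every record


def parse_json_seq(text: str) -> list:
    """Split a JSON text sequence into its individual JSON values."""
    items = []
    for chunk in text.split(RS):
        chunk = chunk.strip()
        if chunk:  # skip the empty piece before the first RS
            items.append(json.loads(chunk))
    return items


# Hypothetical two-record stream.
sample = '\x1e{"id": 1}\n\x1e{"id": 2}\n'
records = parse_json_seq(sample)
```

Each element of `records` is one item of the sequence, which is the unit that a per-item schema would be applied to.
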
21 changes: 21 additions & 0 deletions registries/_media-type/application_jsonl.md
@@ -0,0 +1,21 @@
---
owner: handrews
issue:
description: JSON Lines
specification:
  name: JSONL
  url: https://jsonlines.org/
media_type: application/jsonl
media_type_unregistered: true
reference:
  section: Sequential JSON
  anchor: sequential-json
versions: "3.2+"
layout: default
---

{% capture summary %}
JSON Lines uses the same approach as all sequential JSON media types.
{% endcapture %}

{% include media-type-entry.md summary=summary remarks=remarks %}
19 changes: 19 additions & 0 deletions registries/_media-type/application_octet-stream.md
@@ -0,0 +1,19 @@
---
owner: handrews
issue:
description: Binary or Unknown
media_type: application/octet-stream
specification:
  name: RFC2046 §4.5.1
  url: https://www.rfc-editor.org/rfc/rfc2046.html#section-4.5.1
reference:
  section: Working with Binary Data
  anchor: working-with-binary-data
layout: default
---

{% capture summary %}
Binary data (also including `image/*`, `video/*`, `audio/*` and other binary media types) is modeled using an empty Schema Object, in accordance with JSON Schema's guidance regarding [non-JSON instances](https://www.ietf.org/archive/id/draft-bhutton-json-schema-01.html#name-non-json-instances). Note that as specified in the linked reference section ("Working with Binary Data"), modeling binary data that has been encoded into a string is handled differently from raw binary data, with two variations: One when an Encoding Object is involved, and one when no Encoding Object is involved.
{% endcapture %}

{% include media-type-entry.md summary=summary remarks=remarks %}
21 changes: 21 additions & 0 deletions registries/_media-type/application_x-ndjson.md
@@ -0,0 +1,21 @@
---
owner: handrews
issue:
description: Newline Delimited JSON
specification:
  name: NDJSON
  url: https://github.com/ndjson/ndjson-spec/blob/master/README.md
media_type: application/x-ndjson
media_type_unregistered: true
reference:
  section: Sequential JSON
  anchor: sequential-json
versions: "3.2+"
layout: default
---

{% capture summary %}
Newline Delimited JSON uses the same approach as all sequential JSON media types.
{% endcapture %}

{% include media-type-entry.md summary=summary remarks=remarks %}
19 changes: 19 additions & 0 deletions registries/_media-type/application_x-www-form-urlencoded.md
@@ -0,0 +1,19 @@
---
owner: handrews
issue:
description: URL-Encoded Forms
media_type: application/x-www-form-urlencoded
specification:
  name: WHATWG URL
  url: https://url.spec.whatwg.org/#application/x-www-form-urlencoded
reference:
  section: Support for x-www-form-urlencoded Request Bodies
  anchor: support-for-x-www-form-urlencoded-request-bodies
layout: default
---

{% capture summary %}
URL-Encoded forms use the Encoding Object to control how the JSON-like structure defined by the Schema Object maps to the URL query string-like format. Note that this is separate from how URL query parameters are managed, which is done with the Parameter Object.
{% endcapture %}

{% include media-type-entry.md summary=summary remarks=remarks %}
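
As a rough illustration of the mapping described in this entry, the sketch below serializes a JSON-like structure into the URL-encoded form format with Python's standard library. The sample data is hypothetical, and treating repeated keys as the array serialization is an assumption about the Encoding Object settings in effect.

```python
from urllib.parse import parse_qsl, urlencode

# Hypothetical in-memory instance that a Schema Object might describe.
data = {"name": "Ada Lovelace", "tags": ["a", "b"]}

# doseq=True writes each array element as its own key=value pair.
body = urlencode(data, doseq=True)

# Reading the body back yields a flat list of pairs, not the nested structure,
# which is why a mapping back to the data model is needed.
pairs = parse_qsl(body)
```
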
19 changes: 19 additions & 0 deletions registries/_media-type/application_xml.md
@@ -0,0 +1,19 @@
---
owner: handrews
issue:
description: XML
media_type: application/xml
specification:
  name: RFC7303
  url: https://www.rfc-editor.org/rfc/rfc7303.html
reference:
  section: XML Object
  anchor: xml-object
layout: default
---

{% capture summary %}
XML is modeled using the OAS's `xml` extension keyword for JSON Schema, which has an XML Object as its value. This allows fine-grained control over how each part of the JSON Schema description maps to XML elements or attributes.
{% endcapture %}

{% include media-type-entry.md summary=summary remarks=remarks %}
19 changes: 19 additions & 0 deletions registries/_media-type/multipart_form-data.md
@@ -0,0 +1,19 @@
---
owner: handrews
issue:
description: Multipart Form Data
media_type: multipart/form-data
specification:
  name: RFC7578
  url: https://www.rfc-editor.org/rfc/rfc7578.html
reference:
  section: Encoding multipart Media Types
  anchor: encoding-multipart-media-types
layout: default
---

{% capture summary %}
Multipart forms use the Encoding Object to control how the JSON-like structure defined by the Schema Object maps to each part. Multipart media types that do not use named parts cannot be handled with this technique. It may be possible to use `Content-Disposition: form-data` with a `name` parameter with such media types, but as no specification recommends this, support is unlikely to be dependable.
{% endcapture %}

{% include media-type-entry.md summary=summary remarks=remarks %}
21 changes: 21 additions & 0 deletions registries/_media-type/text_event-stream.md
@@ -0,0 +1,21 @@
---
owner: handrews
issue:
description: Server-Sent Events
specification:
  name: WHATWG HTML
  url: https://html.spec.whatwg.org/multipage/iana.html#text/event-stream
media_type: text/event-stream
media_type_unregistered: true
reference:
  section: Server-Sent Event Streams
  anchor: server-sent-event-streams
versions: "3.2+"
layout: default
---

{% capture summary %}
Event streams build on the sequential media type support used by sequential JSON media types by further defining a mapping of the individual event format into the Schema Object's data model.
{% endcapture %}

{% include media-type-entry.md summary=summary remarks=remarks %}
19 changes: 19 additions & 0 deletions registries/_media-type/text_plain.md
@@ -0,0 +1,19 @@
---
owner: handrews
issue:
description: Plain Text
media_type: text/plain
specification:
  name: RFC2046
  url: https://www.rfc-editor.org/rfc/rfc2046.html
reference:
  section: Encoding Object
  anchor: encoding-object
layout: default
---

{% capture summary %}
Plain text is modeled as a single string. Note that unlike JSON strings, the contents of the string representing the plain text are not quoted when serializing to a document. While a Schema Object of `{type: string, const: foo}` for JSON validates the JSON value `"foo"`, for plain text it validates `foo`, without quotes.
{% endcapture %}

{% include media-type-entry.md summary=summary remarks=remarks %}
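
The quoting difference in this entry can be sketched in a few lines. This is only an illustration using Python's standard `json` module; the variable names are hypothetical.

```python
import json

value = "foo"  # the in-memory string that the schema validates

# Serialized as JSON, the string is wrapped in double quotes.
as_json = json.dumps(value)

# Serialized as text/plain, the characters are written as-is.
as_plain = value
```

Both documents map to the same in-memory string, so the same `const: foo` schema accepts either one once the media type has been taken into account.
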
46 changes: 46 additions & 0 deletions registry/media-type.md
@@ -0,0 +1,46 @@
---
title: Media Type Registry
layout: default
permalink: /registry/media-type/index.html
parent: Registry
---

# Media Type Registry

This registry defines how to use the Schema Object, Media Type Object, and in some cases other Objects to model media types other than `application/json` or media types using a `+json` suffix.
Contributor

❤️

Contributor

Suggested change
This registry defines how to use the Schema Object, Media Type Object, and in some cases other Objects to model media types other than `application/json` or media types using a `+json` suffix.
This registry defines how to use the Schema Object, Media Type Object, and in some cases other Objects to model media types other than `application/json` including media types using a `+json` suffix.

Member Author

@duncanbeevers I don't think +suffix media types are considered subsets of the larger type? @darrelmiller might know for certain. But I think the semantics (indicated by the part before the +) are considered more important than the syntax (after the +) which is just a "here's how you can do partial processing" bit of info.

Contributor

@philsturgeon philsturgeon Apr 6, 2025

I think it's been pretty common for folks to see application/something+json as though the subtype is more like metadata (application/vnd.google+json would be type/metadata+subtype), but in reality the subtype is more important, and the suffix is one of a short list of defined ways to process it, one of which is +json, meaning it's really type/subtype+waytoprocessit.

If you believe the former then "including" makes sense, but if you believe the latter then "or" makes more sense.

Some relevant RFC over here: https://datatracker.ietf.org/doc/html/rfc6838#section-4.2.8

4.2.8. Structured Syntax Name Suffixes

XML in MIME [RFC3023] defined the first such augmentation to the
media type definition to additionally specify the underlying
structure of that media type. To quote:

  This document also standardizes a convention (using the suffix
  '+xml') for naming media types ... when those media types
  represent XML MIME (Multipurpose Internet Mail Extensions)
  entities.

That is, it specified a suffix (in that case, "+xml") to be appended
to the base subtype name.

Since this was published, the de facto practice has arisen for using
this suffix convention for other well-known structuring syntaxes. In
particular, media types have been registered with suffixes such as
"+der", "+fastinfoset", and "+json". This specification formalizes
this practice and sets up a registry for structured type name
suffixes.

Conveniently summarized on Wikipedia (and nowhere else really):

Suffix is an augmentation to the media type definition to additionally specify the underlying structure of that media type, allowing for generic processing based on that structure and independent of the exact type's particular semantics. Media types that make use of a named structured syntax should use the appropriate IANA registered "+"suffix for that structured syntax when they are registered. Unregistered suffixes should not be used (since January 2013). Structured syntax suffix registration procedures are defined in RFC 6838.[15]

The +xml suffix has been defined since January 2001 (RFC 3023[17]), and was formally included in the initial contents of the Structured Syntax Suffix Registry along with +json, +ber, +der, +fastinfoset, +wbxml, and +zip in January 2013 (RFC 6839). Subsequent additions include +gzip, +cbor, +json-seq, and +cbor-seq.[18]

IMO this means the or is fine.


## Data Modeling vs Mapping

JSON Schema operates on an in-memory [data model](https://www.ietf.org/archive/id/draft-bhutton-json-schema-01.html#name-instance-data-model) based on the [JSON RFC](https://www.rfc-editor.org/rfc/rfc8259.html#section-3), which is different from the set of types used by JSON Schema's `type` keyword.

JSON Schema's data model description includes guidance on how to _map_ JSON documents into the data model, such as noting that whitespace and different lexical representations of numbers (such as `1` vs `1.0`) are **not** significant within the data model.
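
As a rough illustration of that mapping, the sketch below uses Python's standard `json` module as a stand-in for an implementation's parser (an assumption; the registry does not prescribe any parser). Documents that differ only in whitespace map to the same in-memory value, and `1` and `1.0` compare equal as numbers.

```python
import json

# Two JSON documents that differ only in insignificant whitespace.
compact = json.loads('{"a":[1,2]}')
spaced = json.loads('{ "a" : [ 1,\n 2 ] }')

# Two lexical representations of the same number.
one_int = json.loads("1")
one_float = json.loads("1.0")

same_structure = compact == spaced
same_number = one_int == one_float
```

Python still tracks `int` and `float` as separate types internally; the point here is only that the values compare equal, which matches the data model's view.
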

All in-memory data described by the OpenAPI Specification (OAS) uses the same in-memory data model, as described under the "Data Types" section.
However, the OAS defines _mappings_ for several additional media types, where JSON Schema is used on a JSON-like in-memory representation which may have a significantly different structure from the media type's representation in [HTTP content](https://www.rfc-editor.org/rfc/rfc9110.html#name-content).
This registry documents those mappings, and in the future may document additional mappings not explicitly mentioned in the OAS.

### Setting the Media Type

JSON Schema draft 2020-12 offers [keywords for modeling embedded media types](https://www.ietf.org/archive/id/draft-bhutton-json-schema-validation-01.html#name-a-vocabulary-for-the-conten): `contentMediaType`, `contentEncoding`, and `contentSchema`, which can be used to set a media type, encoding, or schema for [certain types of data](https://spec.openapis.org/oas/latest.html#working-with-binary-data). These keywords, most notably `contentMediaType`, can contradict media types set in the parent key of a Media Type Object, or by an Encoding Object (including by the default Encoding Object when an Encoding Object is relevant but not present). In such cases, the Media Type Object key or the Encoding Object ***always*** takes precedence over the JSON Schema keywords.
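
The precedence rule can be sketched as a small helper. This is a minimal illustration under stated assumptions: the function name and its `oas_media_type` parameter are hypothetical, with `oas_media_type` standing for whatever the applicable Media Type Object key or Encoding Object (explicit or default) supplies.

```python
from typing import Optional


def effective_media_type(oas_media_type: Optional[str], schema: dict) -> Optional[str]:
    """Return the media type to use for a value (hypothetical helper)."""
    # The Media Type Object key or Encoding Object wins whenever it is set.
    if oas_media_type is not None:
        return oas_media_type
    # Only fall back to the JSON Schema keyword when nothing else applies.
    return schema.get("contentMediaType")


# A schema whose contentMediaType contradicts the Media Type Object key.
conflicting_schema = {"type": "string", "contentMediaType": "image/jpeg"}
chosen = effective_media_type("image/png", conflicting_schema)
```
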

## Specification Versions

This registry is being created for the OpenAPI Specification (OAS) version 3.2, and requirements regarding its support will be included in that specification's text.

Implementations MAY support these data modeling techniques in other OAS versions or other specifications such as Arazzo, as long as the necessary Objects and fields are supported in those specification versions.

## Contributing

While most OpenAPI Initiative registries invite community contributions, this registry is somewhat experimental.
Please open a [Discussion](https://github.com/OAI/OpenAPI-Specification/discussions) explaining your use cases for any media type(s) you would like to see added, rather than proposing a solution.
Solution proposals will be invited _after_ use cases are accepted.

## Media Types

**Note:** For any media type with structured suffix usage (e.g. `application/openapi+json` uses the structured suffix associated with `application/json`), the registered techniques for the media type also apply to media types using the related structured suffix.
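
One way a tool might apply this note is to fall back from a suffixed media type to the entry for the media type associated with its suffix. The sketch below is an assumption-laden illustration: the `REGISTERED` and `SUFFIX_BASE` tables are hypothetical, hand-written subsets, not data published by this registry.

```python
from typing import Optional

# Hypothetical subset of registry entries: media type -> OAS reference section.
REGISTERED = {
    "application/xml": "XML Object",
    "application/json-seq": "Sequential JSON",
}

# Assumed association between structured suffixes and media types.
SUFFIX_BASE = {
    "xml": "application/xml",
    "json-seq": "application/json-seq",
}


def find_technique(media_type: str) -> Optional[str]:
    """Look up an entry directly, then by structured suffix."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    essence = media_type.split(";", 1)[0].strip().lower()
    if essence in REGISTERED:
        return REGISTERED[essence]
    # The structured suffix is whatever follows the last "+".
    _, plus, suffix = essence.rpartition("+")
    if plus and suffix in SUFFIX_BASE:
        return REGISTERED.get(SUFFIX_BASE[suffix])
    return None
```

With these tables, `application/atom+xml` resolves to the XML entry, while a media type with no entry and no known suffix resolves to nothing.
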

|Media Type|Name|Specification|OAS Reference|OAS Versions|
|---|---|---|---|---|
{% for value in site.media-type %}| <tt><a href="{{ value.slug }}">{{ value.media_type }}</a></tt> {% if value.media_type_unregistered %}_(unregistered)_ {% endif %} | {{ value.description }} | <a href="{{ value.specification.url }}">{{ value.specification.name }}</a> | <a href="https://spec.openapis.org/oas/latest.html#{{ value.reference.anchor }}">{{ value.reference.section }}</a> | {{ value.versions }} |
{% endfor %}