Add zstd codec #256
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,109 @@ | ||
========================== | ||
Zstd codec (version 1.0) | ||
========================== | ||
|
||
**Editor's draft 26 July 2019** | ||
|
||
Specification URI: | ||
https://zarr-specs.readthedocs.io/en/latest/v3/codecs/zstd/v1.0.html | ||
Corresponding ZEP: | ||
`ZEP 1 — Zarr specification version 3 <https://zarr.dev/zeps/draft/ZEP0001.html>`_ | ||
Issue tracking: | ||
`GitHub issues <https://github.com/zarr-developers/zarr-specs/labels/codec>`_ | ||
Suggest an edit for this spec: | ||
`GitHub editor <https://github.com/zarr-developers/zarr-specs/blob/main/docs/v3/codecs/zstd/v1.0.rst>`_ | ||
|
||
Copyright 2020 `Zarr core development team | ||
<https://github.com/orgs/zarr-developers/teams/core-devs>`_. This work | ||
is licensed under a `Creative Commons Attribution 3.0 Unported License | ||
<https://creativecommons.org/licenses/by/3.0/>`_. | ||
|
||
---- | ||
|
||
|
||
Abstract | ||
======== | ||
|
||
Defines a ``bytes -> bytes`` codec that applies zstd compression. | ||
|
||
|
||
Status of this document | ||
======================= | ||
|
||
.. warning:: | ||
This document is a draft for review and subject to changes. | ||
It will become final when the `Zarr Enhancement Proposal (ZEP) 1 <https://zarr.dev/zeps/draft/ZEP0001.html>`_ | ||
is approved via the `ZEP process <https://zarr.dev/zeps/active/ZEP0000.html>`_. | ||
|
||
|
||
Document conventions | ||
==================== | ||
|
||
Conformance requirements are expressed with a combination of | ||
descriptive assertions and [RFC2119]_ terminology. The key words | ||
"MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", | ||
"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in the normative | ||
parts of this document are to be interpreted as described in | ||
[RFC2119]_. However, for readability, these words do not appear in all | ||
uppercase letters in this specification. | ||
|
||
All of the text of this specification is normative except sections | ||
explicitly marked as non-normative, examples, and notes. Examples in | ||
this specification are introduced with the words "for example". | ||
|
||
|
||
Codec name | ||
========== | ||
|
||
The value of the ``name`` member in the codec object must be ``zstd``. | ||
|
||
|
||
Configuration parameters | ||
======================== | ||
|
||
level: | ||
An integer from -131072 to 22 that controls the trade-off between | ||
compression speed and compression ratio (it has no effect on decoding). | ||
A value of 0 selects the default compression level. Otherwise, a higher | ||
level is expected to achieve a higher compression ratio at the cost of lower speed. | ||
|
||
checksum: | ||
A boolean that indicates whether to store a checksum when writing that will | ||
Trying to implement that from scratch, it wasn't immediately obvious to me what that checksum was, whether this was some CRC manually appended or something really belonging to libzstd. |
||
be verified when reading. | ||
|
||
For example, the array metadata below specifies that the compressor is the Zstd | ||
codec configured with a compression level of 1 and with the checksum stored:: | ||
|
||
{ | ||
"codecs": [{ | ||
"name": "zstd", | ||
"configuration": { | ||
"level": 1, | ||
"checksum": true | ||
} | ||
}] | ||
} | ||
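As a rough illustration of how a reader might consume this configuration, the sketch below parses the example metadata and checks the two parameters against the constraints stated in this draft. The function name ``parse_zstd_codec`` is hypothetical, and treating both ``level`` and ``checksum`` as required members is an assumption, since the draft does not say whether either may be omitted.

```python
import json

# Bounds copied from the draft text; they are draft values, not a libzstd guarantee.
MIN_LEVEL = -131072
MAX_LEVEL = 22


def parse_zstd_codec(codec: dict) -> tuple[int, bool]:
    """Validate one codec object and return (level, checksum).

    Hypothetical helper: assumes both configuration members are present.
    """
    if codec.get("name") != "zstd":
        raise ValueError("not a zstd codec object")
    config = codec.get("configuration", {})
    level = config.get("level")
    checksum = config.get("checksum")
    # bool is a subclass of int in Python, so reject it explicitly for level.
    if isinstance(level, bool) or not isinstance(level, int):
        raise ValueError("level must be an integer")
    if not MIN_LEVEL <= level <= MAX_LEVEL:
        raise ValueError(f"level {level} is outside [{MIN_LEVEL}, {MAX_LEVEL}]")
    if not isinstance(checksum, bool):
        raise ValueError("checksum must be a boolean")
    return level, checksum


# The example metadata from the text, written as strict JSON.
metadata = json.loads("""
{
    "codecs": [{
        "name": "zstd",
        "configuration": {
            "level": 1,
            "checksum": true
        }
    }]
}
""")

level, checksum = parse_zstd_codec(metadata["codecs"][0])
```

An implementation that prefers forward compatibility could clamp an out-of-range ``level`` instead of rejecting it; the strict check here simply mirrors the range given in the draft.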
|
||
|
||
Format and algorithm | ||
==================== | ||
|
||
This is a ``bytes -> bytes`` codec. | ||
|
||
Encoded data should conform to the Zstandard file format [RFC8878]_. | ||
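One way to relate the ``checksum`` parameter to the frame format is to look at the frame header. The sketch below assumes that ``checksum`` corresponds to the optional ``Content_Checksum`` field of a Zstandard frame as described in [RFC8878]_, which this draft does not state explicitly. It checks the magic number and then bit 2 of the ``Frame_Header_Descriptor`` byte, the ``Content_Checksum_Flag``. The helper name ``has_content_checksum`` is hypothetical.

```python
# Zstandard frame magic number 0xFD2FB528, stored little-endian (RFC 8878).
ZSTD_MAGIC = b"\x28\xb5\x2f\xfd"

# Bit 2 of the Frame_Header_Descriptor is the Content_Checksum_Flag.
CONTENT_CHECKSUM_FLAG = 0b0000_0100


def has_content_checksum(frame: bytes) -> bool:
    """Return True if the first frame header declares a content checksum.

    Only the magic number and the descriptor byte are inspected; the
    rest of the frame is neither parsed nor verified.
    """
    if len(frame) < 5 or frame[:4] != ZSTD_MAGIC:
        raise ValueError("data does not start with a Zstandard frame")
    descriptor = frame[4]
    return bool(descriptor & CONTENT_CHECKSUM_FLAG)
```

This only reports whether a checksum is declared. Verifying it means decoding the frame and comparing the stored 32-bit value against the low four bytes of the XXH64 hash of the decoded content, which a conforming decoder library would normally do.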
|
||
References | ||
========== | ||
|
||
.. [RFC2119] S. Bradner. Key words for use in RFCs to Indicate | ||
Requirement Levels. March 1997. Best Current Practice. URL: | ||
https://tools.ietf.org/html/rfc2119 | ||
|
||
.. [RFC8878] Y. Collet. Zstandard Compression and the | ||
'application/zstd' Media Type. February 2021. Informational. URL: | ||
https://tools.ietf.org/html/rfc8878 | ||
|
||
Change log | ||
========== | ||
|
||
No changes yet. |
I don't think these levels should be hard coded as they may change in future libzstd versions. libzstd will clamp out-of-range compression level values to the range it supports, so any `int` value of `level` should be accepted to improve forward compatibility.