Commit d8b842f

review docs; sure-up blosc refactor
1 parent 30ba89f commit d8b842f

20 files changed

+567
-442
lines changed

README.rst

Lines changed: 3 additions & 2 deletions
@@ -1,7 +1,8 @@
1-
numcodecs
1+
Numcodecs
22
=========
33

4-
TODO
4+
Numcodecs is a Python package providing buffer compression and transformation
5+
codecs for use in data storage and communication applications.
56

67
.. image:: https://travis-ci.org/alimanfoo/numcodecs.svg?branch=master
78
:target: https://travis-ci.org/alimanfoo/numcodecs

adhoc/blosc_memleak_check.py

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
1+
# -*- coding: utf-8 -*-
2+
from __future__ import absolute_import, print_function, division
3+
import sys
4+
5+
6+
import numcodecs as codecs
7+
from numcodecs import blosc
8+
import numpy as np
9+
from numpy.testing import assert_array_equal
10+
11+
12+
codec = codecs.Blosc()
13+
data = np.arange(int(sys.argv[1]))
14+
for i in range(int(sys.argv[2])):
15+
enc = codec.encode(data)
16+
dec = codec.decode(enc)
17+
arr = np.frombuffer(dec, dtype=data.dtype)
18+
assert_array_equal(data, arr)

docs/abc.rst

Lines changed: 4 additions & 3 deletions
@@ -1,9 +1,10 @@
1-
Codec
2-
=====
3-
.. module:: numcodecs.abc
1+
Codec API
2+
=========
3+
.. automodule:: numcodecs.abc
44

55
.. autoclass:: Codec
66

7+
.. autoattribute:: codec_id
78
.. automethod:: encode
89
.. automethod:: decode
910
.. automethod:: get_config

docs/blosc.rst

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
Blosc
22
=====
3-
.. module:: numcodecs.blosc
3+
.. automodule:: numcodecs.blosc
44

55
.. autoclass:: Blosc
66

docs/bz2.rst

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
11
BZ2
22
===
3-
.. module:: numcodecs.bz2
3+
.. automodule:: numcodecs.bz2
44

55
.. autoclass:: BZ2

docs/categorize.rst

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
11
Categorize
22
==========
3-
.. module:: numcodecs.categorize
3+
.. automodule:: numcodecs.categorize
44

55
.. autoclass:: Categorize

docs/checksum32.rst

Lines changed: 6 additions & 1 deletion
@@ -1,6 +1,11 @@
11
32-bit checksums
22
================
3-
.. module:: numcodecs.checksum32
3+
.. automodule:: numcodecs.checksum32
44

5+
CRC32
6+
-----
57
.. autoclass:: CRC32
8+
9+
Adler32
10+
-------
611
.. autoclass:: Adler32

docs/delta.rst

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
Delta
22
=====
3-
.. module:: numcodecs.delta
3+
.. automodule:: numcodecs.delta
44

55
.. autoclass:: Delta

docs/fixedscaleoffset.rst

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
FixedScaleOffset
22
================
3-
.. module:: numcodecs.fixedscaleoffset
3+
.. automodule:: numcodecs.fixedscaleoffset
44

55
.. autoclass:: FixedScaleOffset

docs/index.rst

Lines changed: 3 additions & 6 deletions
@@ -1,15 +1,11 @@
11
.. numcodecs documentation master file, created by
22
sphinx-quickstart on Mon May 2 21:40:09 2016.
33
4+
45
Numcodecs
56
=========
67

7-
This package contains compression and other codecs intended primarily for use
8-
with numerical data.
9-
10-
If you have a question, find a bug, would like to make a suggestion or
11-
contribute code, please `raise an issue on GitHub
12-
<https://github.com/alimanfoo/numcodecs/issues>`_.
8+
.. automodule:: numcodecs
139

1410
Installation
1511
------------
@@ -62,6 +58,7 @@ Contents
6258
packbits
6359
categorize
6460
checksum32
61+
registry
6562
release
6663

6764
Acknowledgments

docs/lzma.rst

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
11
LZMA
22
====
3-
.. module:: numcodecs.lzma
3+
.. automodule:: numcodecs.lzma
44

55
.. autoclass:: LZMA

docs/packbits.rst

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
11
PackBits
22
========
3-
.. module:: numcodecs.packbits
3+
.. automodule:: numcodecs.packbits
44

55
.. autoclass:: PackBits

docs/registry.rst

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
1+
Codec registry
2+
==============
3+
.. automodule:: numcodecs.registry
4+
5+
.. autofunction:: get_codec
6+
.. autofunction:: register_codec
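
The bodies of `get_codec` and `register_codec` are not part of this diff, so the following is only a minimal sketch of how such a registry might behave. It assumes the registry is keyed on `codec_id` and that the 'id' field is stripped from the configuration before `from_config` is called, as the comment in `Codec.from_config` suggests. `codec_registry` and `Identity` are hypothetical names, and the real signatures in `numcodecs.registry` may differ.

```python
codec_registry = {}  # hypothetical mapping: codec_id -> codec class


class Identity(object):
    """Hypothetical pass-through codec, used only to exercise the registry."""

    codec_id = 'identity'

    def encode(self, buf):
        return bytes(buf)

    def decode(self, buf, out=None):
        return bytes(buf)

    def get_config(self):
        return dict(id=self.codec_id)

    @classmethod
    def from_config(cls, config):
        return cls(**config)


def register_codec(cls):
    """Make a codec class discoverable via its codec_id (assumed behaviour)."""
    codec_registry[cls.codec_id] = cls


def get_codec(config):
    """Build a codec instance from a configuration dictionary."""
    config = dict(config)              # copy, so the caller's dict is untouched
    codec_id = config.pop('id', None)  # 'id' is removed before from_config
    cls = codec_registry.get(codec_id)
    if cls is None:
        raise ValueError('codec not available: %r' % codec_id)
    return cls.from_config(config)


register_codec(Identity)
codec = get_codec({'id': 'identity'})
```

Copying the configuration before popping 'id' keeps a stored configuration object reusable for later lookups.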

docs/release.rst

Lines changed: 10 additions & 1 deletion
@@ -6,4 +6,13 @@ Release notes
66
0.0.0
77
-----
88

9-
First release.
9+
First release. This version is a port of the ``codecs`` module from `Zarr
10+
<http://zarr.readthedocs.io>`_ 2.1.0. The following changes have been made from
11+
the original Zarr module:
12+
13+
* Codec classes have been re-organized into separate modules, mostly one per
14+
codec class, for ease of maintenance.
15+
* Two new codec classes have been added based on 32-bit checksums: CRC32 and
16+
Adler32.
17+
* The Blosc extension has been refactored to remove code duplication related
18+
to handling of buffer compatibility.

docs/zlib.rst

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
11
Zlib
22
====
3-
.. module:: numcodecs.zlib
3+
.. automodule:: numcodecs.zlib
44

55
.. autoclass:: Zlib

numcodecs/__init__.py

Lines changed: 18 additions & 0 deletions
@@ -1,5 +1,23 @@
11
# -*- coding: utf-8 -*-
22
# flake8: noqa
3+
"""Numcodecs is a Python package providing buffer compression and
4+
transformation codecs for use in data storage and communication
5+
applications. These include:
6+
7+
* Compression codecs, e.g., Zlib, BZ2, LZMA and Blosc.
8+
* Pre-compression filters, e.g., Delta, Quantize, FixedScaleOffset,
9+
PackBits, Categorize.
10+
* Integrity checks, e.g., CRC32, Adler32.
11+
12+
All codecs implement the same API, allowing codecs to be organized into
13+
pipelines in a variety of ways.
14+
15+
If you have a question, find a bug, would like to make a suggestion or
16+
contribute code, please `raise an issue on GitHub
17+
<https://github.com/alimanfoo/numcodecs/issues>`_.
18+
19+
"""
20+
321
from __future__ import absolute_import, print_function, division
422
import multiprocessing
523
import atexit
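
The package docstring above notes that a shared API lets codecs be organized into pipelines. The sketch below illustrates that idea using only the standard library. `ZlibStage`, `Adler32Stage`, `encode_pipeline` and `decode_pipeline` are hypothetical names, and the checksum layout (a 4-byte little-endian prefix) is an assumption rather than the format used by the package's own Adler32 codec.

```python
import struct
import zlib


class ZlibStage(object):
    """Hypothetical compression stage backed by the stdlib zlib module."""

    def encode(self, buf):
        return zlib.compress(bytes(buf))

    def decode(self, buf):
        return zlib.decompress(bytes(buf))


class Adler32Stage(object):
    """Hypothetical integrity stage: prefixes data with its Adler-32 checksum."""

    def encode(self, buf):
        data = bytes(buf)
        checksum = zlib.adler32(data) & 0xffffffff
        return struct.pack('<I', checksum) + data

    def decode(self, buf):
        data = bytes(buf)
        expected, = struct.unpack('<I', data[:4])
        payload = data[4:]
        if zlib.adler32(payload) & 0xffffffff != expected:
            raise RuntimeError('checksum mismatch')
        return payload


def encode_pipeline(codecs, buf):
    """Apply each codec's encode() in order."""
    for codec in codecs:
        buf = codec.encode(buf)
    return buf


def decode_pipeline(codecs, buf):
    """Undo the pipeline by applying decode() in reverse order."""
    for codec in reversed(codecs):
        buf = codec.decode(buf)
    return buf


pipeline = [ZlibStage(), Adler32Stage()]
raw = b'numcodecs ' * 50
encoded = encode_pipeline(pipeline, raw)
decoded = decode_pipeline(pipeline, encoded)
```

Because every stage exposes the same `encode`/`decode` pair, the pipeline functions need no knowledge of what an individual stage does.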

numcodecs/abc.py

Lines changed: 53 additions & 12 deletions
@@ -1,4 +1,34 @@
11
# -*- coding: utf-8 -*-
2+
"""This module defines the :class:`Codec` base class, a common interface for
3+
all codec classes.
4+
5+
Codec classes must implement :func:`Codec.encode` and :func:`Codec.decode`
6+
methods. Inputs to and outputs from these methods may be any Python object
7+
exporting a contiguous buffer via the new-style Python protocol
8+
or :class:`array.array` under Python 2.
9+
10+
Codec classes must implement a :func:`Codec.get_config` method,
11+
which must return a dictionary holding all configuration parameters
12+
required to enable encoding and decoding of data. The expectation is that
13+
these configuration parameters will be stored or communicated separately
14+
from encoded data, and thus the codecs do not need to store all encoding
15+
parameters within the encoded data. For broad compatibility,
16+
the configuration object must contain only JSON-serializable values. The
17+
configuration object must also contain an 'id' field storing the codec
18+
identifier (see below).
19+
20+
Codec classes must implement a :func:`Codec.from_config` class method,
21+
which will return an instance of the class initialized from a configuration
22+
object.
23+
24+
Finally, codec classes must set a `codec_id` class-level attribute. This
25+
must be a string. Two different codec classes may set the same value for the
26+
`codec_id` attribute if and only if they are fully compatible, meaning that
27+
(1) configuration parameters are the same, and (2) given the same
28+
configuration, one class could correctly decode data encoded by the
29+
other and vice versa.
30+
31+
"""
232
from __future__ import absolute_import, print_function, division
333

434

@@ -7,6 +37,7 @@ class Codec(object): # pragma: no cover
737

838
# override in sub-class
939
codec_id = None
40+
"""Codec identifier."""
1041

1142
def encode(self, buf):
1243
"""Encode data in `buf`.
@@ -15,15 +46,13 @@ def encode(self, buf):
1546
----------
1647
buf : buffer-like
1748
Data to be encoded. May be any object supporting the new-style
18-
buffer protocol or `array.array` (only supports old-style buffer
19-
protocol in PY2).
49+
buffer protocol or `array.array` under Python 2.
2050
2151
Returns
2252
-------
2353
enc : buffer-like
2454
Encoded data. May be any object supporting the new-style buffer
25-
protocol or `array.array` (only supports old-style buffer
26-
protocol in PY2).
55+
protocol or `array.array` under Python 2.
2756
2857
"""
2958
# override in sub-class
@@ -36,40 +65,48 @@ def decode(self, buf, out=None):
3665
----------
3766
buf : buffer-like
3867
Encoded data. May be any object supporting the new-style buffer
39-
protocol or `array.array` (only supports old-style buffer
40-
protocol in PY2).
68+
protocol or `array.array` under Python 2.
4169
out : buffer-like, optional
4270
Writeable buffer to store decoded data.
4371
4472
Returns
4573
-------
4674
dec : buffer-like
4775
Decoded data. May be any object supporting the new-style
48-
buffer protocol or `array.array` (only supports old-style buffer
49-
protocol in PY2).
76+
buffer protocol or `array.array` under Python 2.
5077
5178
"""
5279
# override in sub-class
5380
raise NotImplementedError
5481

5582
def get_config(self):
5683
"""Return a dictionary holding configuration parameters for this
57-
codec. Must include an 'id' field with the codec ID. All values must be
58-
compatible with JSON encoding."""
84+
codec. Must include an 'id' field with the codec identifier. All
85+
values must be compatible with JSON encoding."""
86+
5987
# override in sub-class if need special encoding of config values
88+
89+
# setup config object
6090
config = dict(id=self.codec_id)
61-
# add in all non-private members
91+
92+
# by default, assume all non-private members are configuration
93+
# parameters - override this in sub-class if not the case
6294
for k in self.__dict__:
6395
if not k.startswith('_'):
6496
config[k] = getattr(self, k)
97+
6598
return config
6699

67100
@classmethod
68101
def from_config(cls, config):
69102
"""Instantiate codec from a configuration object."""
70103
# N.B., assume at this point the 'id' field has been removed from
71104
# the config object
105+
72106
# override in sub-class if need special decoding of config values
107+
108+
# by default, assume constructor accepts configuration parameters as
109+
# keyword arguments without any special decoding
73110
return cls(**config)
74111

75112
def __eq__(self, other):
@@ -80,9 +117,13 @@ def __eq__(self, other):
80117
return False
81118

82119
def __repr__(self):
120+
83121
# override in sub-class if need special representation
122+
123+
# by default, assume all non-private members are configuration
124+
# parameters and valid keyword arguments to constructor function
125+
84126
r = '%s(' % type(self).__name__
85-
# by default, include all non-private members
86127
params = ['%s=%r' % (k, getattr(self, k))
87128
for k in sorted(self.__dict__)
88129
if not k.startswith('_')]
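
To make the contract described in the module docstring concrete, here is a minimal sketch of a class that follows it, written against the standard library so that it runs without the package installed. `ZlibSketch` and its `codec_id` value are hypothetical and this is not the package's own Zlib codec. The `get_config` and `from_config` bodies mirror the defaults shown in the diff above.

```python
import json
import zlib


class ZlibSketch(object):
    """Hypothetical codec following the encode/decode/get_config contract."""

    codec_id = 'zlib-sketch'  # assumed identifier, for illustration only

    def __init__(self, level=1):
        self.level = level  # non-private attribute, so it lands in the config

    def encode(self, buf):
        # memoryview accepts any object exporting a contiguous buffer
        return zlib.compress(bytes(memoryview(buf)), self.level)

    def decode(self, buf, out=None):
        dec = zlib.decompress(bytes(memoryview(buf)))
        if out is not None:
            # copy into the caller's writeable buffer, viewed as raw bytes
            memoryview(out).cast('B')[:len(dec)] = dec
            return out
        return dec

    def get_config(self):
        config = dict(id=self.codec_id)
        for k in self.__dict__:
            if not k.startswith('_'):
                config[k] = getattr(self, k)
        return config

    @classmethod
    def from_config(cls, config):
        # assumes the 'id' field has already been removed
        return cls(**config)


codec = ZlibSketch(level=3)
data = b'abc' * 100
enc = codec.encode(data)

# the configuration survives a JSON round trip and rebuilds an equivalent codec
config = json.loads(json.dumps(codec.get_config()))
config.pop('id')
clone = ZlibSketch.from_config(config)
```

Keeping the compression level out of the encoded bytes and in the configuration matches the docstring's expectation that parameters are stored or communicated separately from the data.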
