Add specification for the __binsparse__ protocol #912

Draft: wants to merge 1 commit into base: main
4 changes: 4 additions & 0 deletions .gitignore
@@ -35,3 +35,7 @@ tmp/
*.egg
dist/
.DS_STORE

# pixi environments
.pixi
*.egg-info
33 changes: 28 additions & 5 deletions spec/draft/design_topics/data_interchange.rst
@@ -85,17 +85,40 @@ page gives a high-level specification for data exchange in Python using DLPack.
below. They are not required to return an array object from ``from_dlpack``
which conforms to this standard.

binsparse: Extending to sparse arrays
-------------------------------------

Sparse arrays can be represented in memory by a collection of 1-dimensional and 2-dimensional
dense arrays, alongside some metadata on how to interpret these arrays. This allows us to re-use
the DLPack protocol to exchange the constituent arrays. The work of specifying the
accompanying metadata has already been performed by the
`binsparse specification <https://graphblas.org/binsparse-specification/>`_.

While initially intended to target file formats, binsparse places relatively few requirements on
back-ends:

1. The ability to represent and parse JSON.
2. The ability to store a key-value mapping of 1-dimensional (and optionally 2-dimensional)
   arrays.

To our knowledge, it is the only specification for sparse representations with such minimal
requirements. We can satisfy both: the former with the ``json`` built-in Python module or a
Python ``dict``, and the latter with the DLPack protocol.
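As a rough illustration of the two requirements, the sketch below pairs a JSON descriptor with a key-value store of 1-dimensional arrays for a small COO matrix. The field names (``format``, ``shape``, ``number_of_stored_values``, ``data_types``, ``indices_0``, ``indices_1``, ``values``) reflect one reading of the binsparse specification and should be checked against it. The matrix itself is made up, and plain Python lists stand in for arrays that a real implementation would exchange via DLPack.

```python
import json

# Hypothetical 3x4 matrix with three stored values, in COO form:
#   (0, 1) -> 1.5, (1, 3) -> 2.5, (2, 0) -> 3.5
# Field names are an assumption based on the binsparse specification.
descriptor_json = """
{
  "binsparse": {
    "version": "0.1",
    "format": "COO",
    "shape": [3, 4],
    "number_of_stored_values": 3,
    "data_types": {
      "indices_0": "uint64",
      "indices_1": "uint64",
      "values": "float64"
    }
  }
}
"""

# Requirement 1: the descriptor is ordinary JSON, so the built-in module parses it
# into a plain dict.
descriptor = json.loads(descriptor_json)

# Requirement 2: a key-value store of 1-dimensional arrays. Lists are used here
# only to keep the sketch dependency-free.
arrays = {
    "indices_0": [0, 1, 2],        # row index of each stored value
    "indices_1": [1, 3, 0],        # column index of each stored value
    "values": [1.5, 2.5, 3.5],     # the stored values themselves
}

# Every array named under "data_types" should be present in the store.
assert set(descriptor["binsparse"]["data_types"]) == set(arrays)
```

The descriptor carries only metadata, so it can travel as a ``dict`` while the bulk data stays in the constituent arrays.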

.. note::
   See the `RFC to adopt binsparse <https://github.com/data-apis/array-api/issues/840>`_
   for discussion that preceded the adoption of the binsparse protocol.

See :ref:`sparse_interchange` for the Python specification of this protocol.


Non-supported use cases
-----------------------

Use of DLPack requires that the data can be represented by a strided, in-memory
layout on a single device. This covers usage by a large range of, but not all,
known and possible array libraries. Use cases that are not supported by DLPack
include distributed arrays, i.e., the data residing on multiple nodes or devices.

There may be other reasons why it is not possible or desirable for an
implementation to materialize the array as strided data in memory. In such
1 change: 1 addition & 0 deletions spec/draft/extensions/index.rst
@@ -32,3 +32,4 @@ the array API standard. See :ref:`api-specification`.

fourier_transform_functions
linear_algebra_functions
sparse_interchange
39 changes: 39 additions & 0 deletions spec/draft/extensions/sparse_interchange.rst
@@ -0,0 +1,39 @@
.. _sparse_interchange:

Sparse interchange
==================

Array API specification for sparse interchange functions.

Extension name and usage
------------------------

If implemented, this extension must be retrievable via::

    >>> if hasattr(x, '__binsparse__'):
    ...     # Use the extension
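A minimal sketch of how a consumer might detect and use the extension follows. ``ToyCOO`` and ``supports_binsparse`` are hypothetical names introduced only for illustration. The descriptor fields follow one reading of the binsparse specification, and plain lists stand in for DLPack-capable arrays. The two dunder names are called as methods, matching the stubs in this PR.

```python
class ToyCOO:
    """Hypothetical minimal producer of the protocol; not part of any real library."""

    def __init__(self, shape, rows, cols, values):
        self.shape = tuple(shape)
        self._rows = list(rows)
        self._cols = list(cols)
        self._values = list(values)

    def __binsparse_descriptor__(self):
        # Metadata only: a dict equivalent to a parsed binsparse JSON descriptor.
        # Field names are an assumption based on the binsparse specification.
        return {
            "binsparse": {
                "version": "0.1",
                "format": "COO",
                "shape": list(self.shape),
                "number_of_stored_values": len(self._values),
                "data_types": {
                    "indices_0": "int64",
                    "indices_1": "int64",
                    "values": "float64",
                },
            }
        }

    def __binsparse__(self):
        # Key-value store of the constituent arrays (lists stand in for arrays).
        return {
            "indices_0": self._rows,
            "indices_1": self._cols,
            "values": self._values,
        }


def supports_binsparse(x):
    # Hypothetical helper: checks for both dunder methods of the extension.
    return hasattr(x, "__binsparse__") and hasattr(x, "__binsparse_descriptor__")


x = ToyCOO((3, 4), [0, 1, 2], [1, 3, 0], [1.5, 2.5, 3.5])
if supports_binsparse(x):
    descriptor = x.__binsparse_descriptor__()
    arrays = x.__binsparse__()
```

Checking for both attributes is a design choice of this sketch, made because a consumer needs both the arrays and the descriptor to reconstruct the array.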

Objects in API
--------------

A conforming implementation of this extension must provide and support the following
functions/methods.

.. currentmodule:: array_api

..
  NOTE: please keep the functions and their inverse together

.. autosummary::
   :toctree: generated
   :template: method.rst

   from_binsparse


.. autosummary::
   :toctree: generated
   :template: property.rst

   array.__binsparse__
   array.__binsparse_descriptor__
36 changes: 36 additions & 0 deletions src/array_api_stubs/_draft/array_object.py
@@ -1246,5 +1246,41 @@ def to_device(
Clarified behavior when a provided ``device`` object corresponds to the device on which an array instance resides.
"""

    def __binsparse_descriptor__(self) -> dict:
        """
        Returns a `dict` equivalent to a parsed `binsparse JSON descriptor <https://graphblas.org/binsparse-specification/>`_.

        Parameters
        ----------
        self: array
            array instance.

        Returns
        -------
        out: dict
            A ``dict`` equivalent to a parsed JSON binsparse descriptor of an array. See :ref:`sparse_interchange` for details.


        .. versionadded:: 2025.12
        """

    def __binsparse__(self) -> dict[str, array]:
        """
        Returns a key-value store of the constituent arrays of a sparse array, as specified by the `binsparse specification <https://graphblas.org/binsparse-specification/>`_.

        Parameters
        ----------
        self: array
            array instance.

        Returns
        -------
        out: dict[str, array]
            A ``dict`` mapping the name of each constituent array to the array itself. See :ref:`sparse_interchange` for details.


        .. versionadded:: 2025.12
        """


array = _array
27 changes: 27 additions & 0 deletions src/array_api_stubs/_draft/creation_functions.py
@@ -5,6 +5,7 @@
"empty_like",
"eye",
"from_dlpack",
"from_binsparse",
"full",
"full_like",
"linspace",
@@ -645,3 +646,29 @@ def zeros_like(
out: array
an array having the same shape as ``x`` and filled with zeros.
"""


def from_binsparse(arrays: dict[str, array], descriptor: dict, /) -> array:
    """
    Returns a new array constructed from the constituent arrays and binsparse descriptor of a sparse array, such as those returned by ``__binsparse__`` and ``__binsparse_descriptor__``.

    Parameters
    ----------
    arrays: dict[str, array]
        input constituent arrays, keyed by name.
    descriptor: dict
        the parsed binsparse descriptor of the array.

    Returns
    -------
    out: array
        an array containing the data in ``arrays`` with a format specified by ``descriptor``.

    .. admonition:: Note
       :class: note

       The returned array may be either a copy or a view. See :ref:`data-interchange` for details.


    .. versionadded:: 2025.12
    """
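To make the consumer side concrete, here is a toy sketch of what a ``from_binsparse``-style function could do for the COO format only. ``toy_from_binsparse`` is a hypothetical name, not the function specified above. It materializes a dense nested list purely for illustration, whereas a real library would build its own sparse array, possibly as a view. The descriptor field names are an assumption based on the binsparse specification.

```python
def toy_from_binsparse(arrays, descriptor, /):
    """Hypothetical consumer: rebuilds a dense nested list from COO constituents."""
    meta = descriptor["binsparse"]
    if meta["format"] != "COO":
        # This sketch handles a single format; the specification defines others.
        raise NotImplementedError(f"unsupported format: {meta['format']!r}")

    n_rows, n_cols = meta["shape"]
    rows = arrays["indices_0"]
    cols = arrays["indices_1"]
    values = arrays["values"]

    # The descriptor and the constituent arrays must agree on the stored count.
    nnz = meta["number_of_stored_values"]
    if not (len(rows) == len(cols) == len(values) == nnz):
        raise ValueError("constituent arrays do not match the descriptor")

    # Unstored entries are treated as zero in this sketch.
    dense = [[0.0] * n_cols for _ in range(n_rows)]
    for r, c, v in zip(rows, cols, values):
        dense[r][c] = v
    return dense


descriptor = {
    "binsparse": {
        "version": "0.1",
        "format": "COO",
        "shape": [3, 4],
        "number_of_stored_values": 3,
        "data_types": {"indices_0": "int64", "indices_1": "int64", "values": "float64"},
    }
}
arrays = {"indices_0": [0, 1, 2], "indices_1": [1, 3, 0], "values": [1.5, 2.5, 3.5]}

dense = toy_from_binsparse(arrays, descriptor)
# dense == [[0.0, 1.5, 0.0, 0.0], [0.0, 0.0, 0.0, 2.5], [3.5, 0.0, 0.0, 0.0]]
```

Both arguments are positional-only, mirroring the ``/`` in the stub signature.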