Problem

I would like to load zarr data directly onto non-CPU devices (especially GPUs). The current approach appears to rely on using cupy to load onto cupy-supported devices, e.g. https://github.com/rapidsai/kvikio/blob/branch-25.02/notebooks/zarr.ipynb.

Unfortunately, a number of devices are not supported by cupy; for example, I don't believe my Apple Metal GPU is supported. This means that I must load from zarr via the CPU if I want to use these devices, e.g. zarr on disk -> numpy -> torch (which has Metal support).

This is slower, and as far as I can tell nothing in the zarr specification itself requires it (?).
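For concreteness, a minimal sketch of the CPU round-trip described above (the store path and array key are hypothetical):

```python
import torch
import zarr

# Hypothetical store/key; illustrates the current CPU round-trip.
z = zarr.open("model_weights.zarr", mode="r")
host = z["layer0/weight"][:]                # zarr -> numpy: decompress into host memory
tensor = torch.from_numpy(host).to("mps")   # numpy -> torch -> Metal: an extra device copy
```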
Background
Multi-device support is an important requirement in the AI/ML community. I would like to use zarr (specifically the Python implementation) to run models such as LLMs across multiple devices. The faster a model can be loaded onto a device (and with lower memory overhead), the better the developer experience.
Questions
1. Is cupy the correct/only way to load directly to a GPU with zarr-python?
2. Is there, or will there be, any way of loading directly to devices such as Metal with zarr-python?
3. (Related) What is the best way to load a PyTorch neural network onto a GPU with zarr-python? Is it cupy plus something like DLPack for zero-copy exchange (see the sketch after this list)? Are there alternatives?
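For question 3, a minimal sketch of the cupy + DLPack route, following the kvikio notebook linked above (the store path is hypothetical, and GDSStore/meta_array follow zarr-python v2 conventions, which may differ across versions):

```python
import cupy as cp
import torch
import zarr
import kvikio.zarr

# meta_array tells zarr which array type to produce, so chunks are
# decompressed directly into GPU memory as CuPy arrays.
store = kvikio.zarr.GDSStore("model_weights.zarr")  # hypothetical path
z = zarr.open_array(store=store, mode="r", meta_array=cp.empty(()))
gpu_weights = z[:]                       # CuPy array resident on the GPU
tensor = torch.from_dlpack(gpu_weights)  # zero-copy handoff to PyTorch via DLPack
```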
CuPy/kvikio relies on NVIDIA's GPUDirect Storage (GDS) driver and goes through PCIe. Metal GPUs use unified memory, so a CPU-to-GPU transfer can in theory be almost zero-cost (just passing an address). If there is a way to pass ownership of an array from CPU to GPU, nothing needs to change in zarr unless there is a need for GPU-accelerated decompression.
In practice, though, torch at least implements the to("mps") method by cloning the tensor (a memcpy-ish cost), and each ML framework may do something different. Another reference point is jax, which implements (experimental) serialization to zarr using tensorstore.
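A quick way to observe the cloning behaviour described above (assuming torch's current MPS semantics; copy-on-transfer behaviour could change in future releases):

```python
import numpy as np
import torch

if torch.backends.mps.is_available():
    host = np.ones((4, 4), dtype=np.float32)
    t_cpu = torch.from_numpy(host)  # zero-copy view over the NumPy buffer
    t_mps = t_cpu.to("mps")         # new allocation + memcpy-ish transfer
    host[0, 0] = 42.0               # mutate the host buffer...
    print(t_cpu[0, 0].item())       # 42.0 -- CPU tensor shares the buffer
    print(t_mps[0, 0].item())       # 1.0  -- MPS tensor was copied earlier
```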
Related issues

#1967
#2574
cc @jhamman (as suggested by @TomNicholas)