-
-
Notifications
You must be signed in to change notification settings - Fork 304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update and document GPU buffer handling #2751
base: main
Are you sure you want to change the base?
Conversation
This updates how we handle GPU buffers. See the new docs page for a simple example. The basic idea, as discussed in ..., is to use host buffers for all metadata objects and device buffers for data. Zarr has two types of buffers: plain buffers (used for a stream of bytes) and ndbuffers (used for bytes that represent ndarrays). To make it easier for users, I've added a new config option `zarr.config.enable_gpu()` that can be used to update those both. If we need additional customizations in the future, we can add them here.
Still looking at this, but there is currently a copy to the host for running Zstd (the cc @madsbk. |
This should be ready for review now. This fixes the issue with storing metadata in host memory, rather than device memory. It adds a convenience function for setting up everything to use GPUs. Currently, that just sets the buffer prototype. In the future, I think it could also update the default codecs to use GPU codecs. https://gist.github.com/TomAugspurger/ba13bc29b27f587ae4709ac7f30d89c8 has a snippet hacking together an example of what that future might look like, but IIUC @akshaysubr is working on new APIs for compression / codecs, so IMO it makes sense to address that separately. For now, I've left a note that codecs will run on the CPU by default. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I only have a minor suggestion
z[:10, :10] = src[:10, :10] | ||
|
||
result = z[:10, :10] | ||
cp.testing.assert_array_equal(result, src[:10, :10]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cp.testing.assert_array_equal(result, src[:10, :10]) | |
assert isinstance(got, result) | |
cp.testing.assert_array_equal(result, src[:10, :10]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think assert_array_equal
tests object type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're correct, it doesn't check the type
In [2]: cp.testing.assert_array_equal(np.arange(4), cp.arange(4))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, approving!
This updates how we handle GPU buffers. See the new docs page for a simple example.
The basic idea, as discussed in #2574 is to use host buffers for all metadata objects and device buffers for data.
Zarr has two types of buffers: plain buffers (used for a stream of bytes) and ndbuffers (used for bytes that represent ndarrays). To make it easier for users, I've added a new config option
zarr.config.enable_gpu()
that can be used to update those both. If we need additional customizations in the future (like gpu-accelerated codecs), we can add them here.I've opened this as a draft for now. I want to look a bit more at exactly when data is copied between the host and device. Right now, it looks like the default Zarr v3 codec pipeline will automatically transfer bytes to the host in
ZstdCodec._encode_single
:zarr-python/src/zarr/codecs/zstd.py
Line 88 in 0c154c3
This PR ensures that you end up with GPU bytes when reading data, but all data I/O and encoding / decoding still happens on the CPU. I think in the future we'll be able to do more work on the GPU while avoiding copies back to the host.
TODO:
docs/user-guide/*.rst
changes/
Clsoes #2574