From 68a7d562fa9f63733d1178120d2c43ce0f168b26 Mon Sep 17 00:00:00 2001
From: OPMTerra
Date: Wed, 5 Mar 2025 02:44:18 +0530
Subject: [PATCH 1/6] DOC: Add data= examples and async note (fix #2809)

---
 docs/user-guide/arrays.rst | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/docs/user-guide/arrays.rst b/docs/user-guide/arrays.rst
index a62b2ea0fa..c23764d0e0 100644
--- a/docs/user-guide/arrays.rst
+++ b/docs/user-guide/arrays.rst
@@ -30,6 +30,29 @@ module documentation.
 
 .. _user-guide-array:
 
+Initializing with Data
+---------------------
+Pass existing data during array creation for better performance:
+
+.. code-block:: python
+
+    import zarr
+    import numpy as np
+
+    data = np.random.rand(1000, 1000)
+    arr = zarr.create_array("data.zarr", shape=data.shape, data=data)
+
+This avoids a separate write step and is more efficient than:
+
+.. code-block:: python
+
+    # Less efficient approach
+    arr = zarr.create_array("data.zarr", shape=(1000, 1000))
+    arr[:] = data
+
+.. note::
+    For cloud stores like S3, use ``data=`` with async writes for best performance.
+
 Reading and writing data
 ------------------------
 

From f937101209e1daa9cc76ed642e26ee7c5744906d Mon Sep 17 00:00:00 2001
From: OPMTerra
Date: Wed, 5 Mar 2025 03:24:07 +0530
Subject: [PATCH 2/6] DOC: Clarify data= performance per review feedback

---
 docs/user-guide/arrays.rst | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docs/user-guide/arrays.rst b/docs/user-guide/arrays.rst
index c23764d0e0..7e74bbe5bb 100644
--- a/docs/user-guide/arrays.rst
+++ b/docs/user-guide/arrays.rst
@@ -31,7 +31,7 @@ module documentation.
 .. _user-guide-array:
 
 Initializing with Data
----------------------
+----------------------
 Pass existing data during array creation for better performance:
 
 .. code-block:: python
@@ -42,16 +42,16 @@ Pass existing data during array creation for better performance:
     data = np.random.rand(1000, 1000)
     arr = zarr.create_array("data.zarr", shape=data.shape, data=data)
 
-This avoids a separate write step and is more efficient than:
+This provides a more concise syntax compared to separate assignment:
 
 .. code-block:: python
 
-    # Less efficient approach
+    # Equivalent but verbose
     arr = zarr.create_array("data.zarr", shape=(1000, 1000))
     arr[:] = data
 
 .. note::
-    For cloud stores like S3, use ``data=`` with async writes for best performance.
+    Both methods leverage async writes when using async-compatible stores like S3.
 
 Reading and writing data
 ------------------------

From 2d5f64a3a116f12954c48a4a633ee5cf86e5e80e Mon Sep 17 00:00:00 2001
From: OPMTerra
Date: Wed, 5 Mar 2025 21:49:49 +0530
Subject: [PATCH 3/6] DOC: Remove performance/async claims per review feedback

---
 docs/user-guide/arrays.rst | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/docs/user-guide/arrays.rst b/docs/user-guide/arrays.rst
index 7e74bbe5bb..bb4fe87aa3 100644
--- a/docs/user-guide/arrays.rst
+++ b/docs/user-guide/arrays.rst
@@ -32,7 +32,7 @@ module documentation.
 
 Initializing with Data
 ----------------------
-Pass existing data during array creation for better performance:
+Pass existing data during array creation for convenience:
 
 .. code-block:: python
 
@@ -42,17 +42,12 @@ Pass existing data during array creation for better performance:
     data = np.random.rand(1000, 1000)
     arr = zarr.create_array("data.zarr", shape=data.shape, data=data)
 
-This provides a more concise syntax compared to separate assignment:
-
+Equivalent to separate assignment:
 .. code-block:: python
 
-    # Equivalent but verbose
     arr = zarr.create_array("data.zarr", shape=(1000, 1000))
     arr[:] = data
 
-.. note::
-    Both methods leverage async writes when using async-compatible stores like S3.
-
 Reading and writing data
 ------------------------
 

From 9a869a161e86cb982c10ff7eb2b32d89729bfb25 Mon Sep 17 00:00:00 2001
From: OPMTerra
Date: Wed, 5 Mar 2025 21:57:30 +0530
Subject: [PATCH 4/6] DOC: Fix code block formatting in arrays.rst

---
 docs/user-guide/arrays.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/user-guide/arrays.rst b/docs/user-guide/arrays.rst
index bb4fe87aa3..57090c3156 100644
--- a/docs/user-guide/arrays.rst
+++ b/docs/user-guide/arrays.rst
@@ -43,6 +43,7 @@ Pass existing data during array creation for convenience:
     arr = zarr.create_array("data.zarr", shape=data.shape, data=data)
 
 Equivalent to separate assignment:
+
 .. code-block:: python
 
     arr = zarr.create_array("data.zarr", shape=(1000, 1000))

From fd7a5a6eb7b6b6a463ca1e028f10e53d705dd9f4 Mon Sep 17 00:00:00 2001
From: OPMTerra <144823952+OPMTerra@users.noreply.github.com>
Date: Tue, 18 Mar 2025 21:08:29 +0530
Subject: [PATCH 5/6] Update docs/user-guide/arrays.rst

Co-authored-by: Joe Hamman
---
 docs/user-guide/arrays.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/user-guide/arrays.rst b/docs/user-guide/arrays.rst
index 57090c3156..1809cbbcaa 100644
--- a/docs/user-guide/arrays.rst
+++ b/docs/user-guide/arrays.rst
@@ -30,7 +30,7 @@ module documentation.
 
 .. _user-guide-array:
 
-Initializing with Data
+Initializing with data
 ----------------------
 Pass existing data during array creation for convenience:
 

From ee16cf36d6e651dce6deecc03113714b58acaf6a Mon Sep 17 00:00:00 2001
From: OPMTerra <144823952+OPMTerra@users.noreply.github.com>
Date: Tue, 18 Mar 2025 21:13:17 +0530
Subject: [PATCH 6/6] Update docs/user-guide/arrays.rst

Shape inferred from data

Co-authored-by: Joe Hamman
---
 docs/user-guide/arrays.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/user-guide/arrays.rst b/docs/user-guide/arrays.rst
index 1809cbbcaa..4952665e5c 100644
--- a/docs/user-guide/arrays.rst
+++ b/docs/user-guide/arrays.rst
@@ -40,7 +40,7 @@ Pass existing data during array creation for convenience:
     import numpy as np
 
     data = np.random.rand(1000, 1000)
-    arr = zarr.create_array("data.zarr", shape=data.shape, data=data)
+    arr = zarr.create_array("data.zarr", data=data)
 
 Equivalent to separate assignment:
 