Skip to content

Concurrent data downloading and appending to zarr store #2532

Answered by guidocioni
guidocioni asked this question in Q&A
Discussion options

You must be logged in to vote

Cannot believe it was THAT easy.
Thanks to the suggestions of @rabernat I was able to create the following example which does exactly what it's supposed to do by only writing the region specific to a certain time (which is automatically detected by xarray).
The result is that I'm able to download and process about 20 years of daily data in about 7 minutes, and this could be improved by using even more workers (but then the network will probably become the bottleneck).
I'm leaving this here in case someone has the same problem.

import xarray as xr
import fsspec
import gzip
import io
import pyproj
import pandas as pd
from tqdm.contrib.concurrent import process_map
import dask.array as da
im…

Replies: 3 comments 4 replies

Comment options

You must be logged in to vote
1 reply
@guidocioni
Comment options

Comment options

You must be logged in to vote
0 replies
Answer selected by guidocioni
Comment options

You must be logged in to vote
3 replies
@rabernat
Comment options

@guidocioni
Comment options

@rabernat
Comment options

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Category
Q&A
Labels
None yet
2 participants