
Slow single chunk download using s3fs storage option. #2662

Open

@JoshCu

Description

When using s3fs with zarr, xarray, and dask, I found that if the data subset being fetched fell entirely within a single chunk, the download was single-threaded (as far as I can tell) and as a result quite slow.
For larger subsets spanning multiple chunks this isn't an issue, as the transfer quickly becomes limited by network bandwidth. The exception is the tail end of a download: as threads finish, you're stuck waiting for the last few to complete while network bandwidth sits idle.

The s3fs cat_file function does take range input parameters, but as far as I could tell they weren't being used here.
For my use case, I overrode the s3fs function to always check the size of the object being downloaded and to split the download into ranged requests if it was over a certain size. Even with this additional logic slowing down all the other requests (metadata, etc.), it cut the ~5 minute download down to about 20 s.
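For reference, the idea can be sketched roughly like this (this is a minimal illustration, not my exact subclass: the names `byte_ranges` and `parallel_cat` and the default chunk size are my own, and `fs` can be any fsspec-style filesystem such as `s3fs.S3FileSystem`, whose `cat_file(path, start=, end=)` parameters translate into HTTP Range requests):

```python
import concurrent.futures


def byte_ranges(size: int, chunk_size: int) -> list[tuple[int, int]]:
    """Split [0, size) into consecutive (start, end) pairs of at most chunk_size bytes."""
    return [(start, min(start + chunk_size, size)) for start in range(0, size, chunk_size)]


def parallel_cat(fs, path: str, chunk_size: int = 8 * 2**20, max_workers: int = 16) -> bytes:
    """Fetch one object with concurrent ranged reads instead of a single request.

    Small objects (metadata, etc.) are fetched in one call; anything larger is
    split into ranged requests that run in a thread pool and are re-joined in order.
    """
    size = fs.info(path)["size"]
    if size <= chunk_size:
        return fs.cat_file(path)
    ranges = byte_ranges(size, chunk_size)
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as pool:
        parts = pool.map(lambda r: fs.cat_file(path, start=r[0], end=r[1]), ranges)
    return b"".join(parts)
```

In my actual setup this lives inside an `S3FileSystem` subclass so that zarr/xarray pick it up transparently through the mapper, but the splitting logic is the same.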

I initialized an mfdataset like this
link

import s3fs
import xarray as xr
from dask.distributed import Client, LocalCluster


def load_zarr_datasets(forcing_vars: list[str] = None) -> xr.Dataset:
    if not forcing_vars:
        forcing_vars = ["lwdown", "precip", "psfc", "q2d", "swdown", "t2d", "u2d", "v2d"]
    # if a LocalCluster is not already running, start one
    try:
        client = Client.current()
    except ValueError:
        cluster = LocalCluster()
        client = Client(cluster)
    s3_urls = [
        f"s3://noaa-nwm-retrospective-3-0-pds/CONUS/zarr/forcing/{var}.zarr"
        for var in forcing_vars
    ]
    # default cache is readahead, which is detrimental to performance in this case
    # S3ParallelFileSystem is my S3FileSystem subclass with the ranged-download override
    fs = S3ParallelFileSystem(anon=True, default_cache_type="none")  # default_block_size
    s3_stores = [s3fs.S3Map(url, s3=fs) for url in s3_urls]
    dataset = xr.open_mfdataset(s3_stores, parallel=True, engine="zarr", cache=True)
    return dataset

and save it like this

from dask.distributed import Client, progress

client = Client.current()
future = client.compute(dataset.to_netcdf(temp_path, compute=False))
# Display progress bar
progress(future)
future.result()

This may just be a workaround for the real solution, which would be rechunking the data into smaller pieces, but I'm not in a position to do that. At the very least, I hope this saves someone else some time if they come across the same issue.

Happy to expand on this with more info, tests, and examples if it's useful.
