Issue with parallelization in sea ice notebook? #186

mnlevy1981 · 2025-02-07T18:12:07Z

Describe the bug
@dabail10 has been reporting occasional RuntimeError: NetCDF: Not a valid ID errors when running a notebook in parallel. I found a two-year-old discussion from Australia, but their fix was to run with a single thread per worker, and we already set threads_per_worker=1 when creating the LocalCluster. The client object confirms this: it reports 16 processes and 16 threads, i.e. one thread per worker:

<Client: 'tcp://127.0.0.1:39305' processes=16 threads=16, memory=120.00 GiB>
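
The client summary only shows totals, so a per-worker check may be a more direct way to rule out multi-threaded workers. The following is a minimal sketch, not CUPiD's actual cluster setup: the helper name multi_threaded_workers and the demo function are hypothetical, and n_workers=16 is only assumed from the client summary above.

```python
def multi_threaded_workers(nthreads_by_worker):
    """Return the subset of workers that report more than one thread.

    `nthreads_by_worker` maps worker address -> thread count, which is
    the shape returned by dask.distributed's Client.nthreads().
    """
    return {addr: n for addr, n in nthreads_by_worker.items() if n > 1}


def demo():
    """Hypothetical reproduction of the cluster described in this issue."""
    # Imported here so the helper above can be used without dask installed.
    from dask.distributed import Client, LocalCluster

    # Assumption: 16 single-threaded worker processes, as in the summary above.
    cluster = LocalCluster(n_workers=16, threads_per_worker=1)
    client = Client(cluster)
    try:
        offenders = multi_threaded_workers(client.nthreads())
        if offenders:
            print("workers with more than one thread:", offenders)
        else:
            print("every worker is single-threaded")
    finally:
        client.close()
        cluster.close()
```

Calling demo() on a compute node should report that every worker is single-threaded if the cluster matches the summary above; in that case the threads_per_worker explanation from the other discussions would not apply here.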

To Reproduce
On casper or derecho, run cupid-diagnostics --ice on a compute node with multiple cores; the Hemis_seaice_visual_compare_contour.ipynb notebook fails intermittently.

Expected behavior
The notebook should run in parallel without raising this error.

Additional context
There's additional discussion at pydata/xarray#7079 (I found that issue because it is mentioned in the Australian discussion), but everything there seems to say the issue is limited to threads_per_worker>1, so maybe it's not related to Dave's problem?

Running cupid-diagnostics --ice --serial will avoid the issue, but it will also run slower because it won't be parallelized.

TeaganKing added the bug and cice labels · Feb 11, 2025