Describe the bug
@dabail10 has been reporting occasional RuntimeError: NetCDF: Not a valid ID errors when running a notebook in parallel. I found a two-year-old discussion from Australia, but their fix was to run with a single thread per worker, and we already set threads_per_worker=1 when creating the LocalCluster. This is verified in the client object, where there are 16 processes and 16 threads.
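For anyone who wants to repeat the worker/thread check outside the notebook, here is a minimal sketch, not the code CUPiD actually runs. The helper names summarize_workers and check_local_cluster are hypothetical, and the 16-worker default is only taken from the numbers quoted above. It assumes dask.distributed is installed and uses Client.nthreads(), which maps each worker address to its thread count.

```python
from typing import Mapping, Tuple


def summarize_workers(nthreads: Mapping[str, int]) -> Tuple[int, int]:
    """Return (number of workers, total threads) from an address -> nthreads map."""
    return len(nthreads), sum(nthreads.values())


def check_local_cluster(n_workers: int = 16) -> Tuple[int, int]:
    """Start a LocalCluster with one thread per worker and report what the client sees.

    Hypothetical helper: the cluster settings mirror the description in this
    issue, not CUPiD's actual setup code.
    """
    # Imported here so summarize_workers stays usable without dask installed.
    from dask.distributed import Client, LocalCluster

    with LocalCluster(
        n_workers=n_workers, threads_per_worker=1, processes=True
    ) as cluster:
        with Client(cluster) as client:
            # Client.nthreads() returns {worker_address: thread_count}.
            return summarize_workers(client.nthreads())
```

Calling check_local_cluster() from a script (under an if __name__ == "__main__": guard, since the workers are separate processes) should return equal worker and thread counts if threads_per_worker=1 is being honored. Any worker reporting more than one thread would point back to the multi-threading explanation from the other discussions.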
To Reproduce
On casper or derecho, run cupid-diagnostics --ice on a compute node with multiple cores; the Hemis_seaice_visual_compare_contour.ipynb notebook sometimes fails with this error.
Expected behavior
This error should not crop up.
Additional context
There's additional discussion at pydata/xarray#7079 (I found that issue because it is mentioned in the Australian discussion), but it all seems to claim the issue is limited to threads_per_worker>1 so maybe it's not related to Dave's problem?
Running cupid-diagnostics --ice --serial will avoid the issue, but it will also run slower because it won't be parallelized.