
Replace PBC handling in feature detection with label function from dask-image#562

Draft
w-k-jones wants to merge 7 commits into tobac-project:RC_v1.6.x from w-k-jones:dask_image_label

Conversation

@w-k-jones
Member

The dask_image.ndmeasure.label function replicates the scipy.ndimage.label function, but adds a wrap_axes keyword that can be used to perform label detection across periodic boundaries (this requires identical logic to labelling regions across tiles). Replacing the current feature detection PBC code with it both simplifies our codebase and starts to integrate more dask features into the tobac pipeline, so that eventually it can be fully dask-enabled. I have set up the label function to replicate the connectivity handling of the scikit-image function it replaces identically, meaning that full connectivity is used rather than square connectivity. However, in future I think we should make this an option and change the default to square (1) connectivity to match segmentation (see #481)
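To illustrate the boundary-wrapping logic, here is a toy sketch using plain scipy (not the tobac implementation): labelling a field where one feature straddles a periodic boundary first produces two labels, which then need merging across the wrapped axis, i.e. the same relabelling step that wrap_axes performs between tiles.

```python
import numpy as np
from scipy import ndimage

# Toy binary field: the points at (0, 0) and (0, 4) form one feature
# that straddles the periodic boundary along axis 1.
mask = np.array(
    [
        [1, 0, 0, 0, 1],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 0, 0],
    ]
)

# Full (8-) connectivity, matching the scikit-image behaviour described above.
structure = ndimage.generate_binary_structure(2, 2)
labels, n_features = ndimage.label(mask, structure=structure)
# Plain labelling sees 3 features; the wrapped one is split in two.

# Merge labels that touch across the wrapped axis -- the relabelling
# that wrap_axes applies between adjacent tiles.
for a, b in zip(labels[:, 0], labels[:, -1]):
    if a and b and a != b:
        labels[labels == b] = a

n_merged = len(np.unique(labels)) - 1  # exclude background 0
print(n_features, n_merged)  # 3 2
```

With a dask-image version that includes the new keyword, `dask_image.ndmeasure.label(darr, wrap_axes=(1,))` would handle this merge directly.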

Currently in draft, as I am not sure whether to introduce this on its own or as part of a larger refactor of feature detection.

  • Have you followed our guidelines in CONTRIBUTING.md?
  • Have you self-reviewed your code and corrected any misspellings?
  • Have you written documentation that is easy to understand?
  • Have you written descriptive commit messages?
  • Have you added NumPy docstrings for newly added functions?
  • Have you formatted your code using black?
  • If you have introduced a new functionality, have you added adequate unit tests?
  • Have all tests passed in your local clone?
  • If you have introduced a new functionality, have you added an example notebook?
  • Have you kept your pull request small and limited so that it is easy to review?
  • Have the newest changes from this branch been merged?

@github-actions

github-actions bot commented Feb 5, 2026

Linting results by Pylint:

Your code has been rated at 8.36/10 (previous run: 8.36/10, +0.00)
The linting score is an indicator that reflects how well your code version follows Pylint’s coding standards and quality metrics with respect to the RC_v1.6.x branch.
A decrease usually indicates your new code does not fully meet style guidelines or has potential errors.

@w-k-jones w-k-jones changed the base branch from main to RC_v1.6.x February 6, 2026 07:03
@codecov

codecov bot commented Feb 6, 2026

Codecov Report

❌ Patch coverage is 85.71429% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.46%. Comparing base (fcb7cd3) to head (a4e6957).
⚠️ Report is 12 commits behind head on RC_v1.6.x.

Files with missing lines | Patch % | Lines
tobac/feature_detection.py | 85.71% | 2 Missing ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##           RC_v1.6.x     #562      +/-   ##
=============================================
- Coverage      64.84%   64.46%   -0.38%     
=============================================
  Files             27       27              
  Lines           3985     3923      -62     
=============================================
- Hits            2584     2529      -55     
+ Misses          1401     1394       -7     
Flag | Coverage Δ
unittests | 64.46% <85.71%> (-0.38%) ⬇️


@w-k-jones
Member Author

Testing with the low cloud tracking notebook shows a slight slowdown in performance:

Previous: 3.43 s ± 27.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
New: 3.61 s ± 36.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

I will test on a larger example, but I think this is a reasonable trade-off for simpler code.

@w-k-jones
Member Author

On further investigation with larger datasets, it seems there is a fairly large overhead to using the dask-image label function for moderately large feature detection jobs (where feature detection takes ~10 seconds to 1 minute), but a performance benefit for larger tasks (e.g. multiple minutes). I will investigate a bit more to see whether the number of time steps vs. the spatial array size produces different results. Note that this is using the default dask settings, so without increasing the number of workers etc. or chunking the data.
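For reference, a minimal sketch (with hypothetical array sizes, much smaller than the datasets above) of how the input could be chunked one timestep per chunk, so that dask can parallelise the per-timestep work instead of running on the unchunked defaults:

```python
import numpy as np
import dask.array as da

# Hypothetical (time, y, x) binary mask standing in for the detection field.
mask = np.random.default_rng(0).random((8, 64, 64)) < 0.05

# One chunk per timestep lets the scheduler run timesteps in parallel.
darr = da.from_array(mask, chunks=(1, 64, 64))
print(darr.numblocks)  # (8, 1, 1): one block per timestep
```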

@w-k-jones
Member Author

Some test results:

MCSMIP Obs (geo-ir), 1200x3600 domain, PBC hdim_2 (on HPC):
  • 24 time steps: previous 26.92 s, new 48.26 s
  • 240 time steps: previous 4 min 41.61 s, new 4 min 12.25 s
  • 672 time steps: previous 11 min 17.83 s, new 9 min 29.44 s

EUREC4A LES, 1524x1524 domain, PBC both (on laptop):
  • 13 time steps: previous 1.04 s ± 6.93 ms per loop, new 1.63 s ± 20.5 ms per loop
  • 49 time steps: previous 3.66 s ± 27.1 ms per loop, new 5.84 s ± 71.5 ms per loop
  • 720 time steps: previous 41.8 s ± 528 ms per loop, new 1 min 22 s ± 741 ms per loop

What's puzzling is that the relative performance changes with the number of timesteps, even though the labelling works independently per timestep. Dask continues to be an enigma to me; do you have any ideas on this @freemansw1?

@freemansw1
Member

@w-k-jones I'm certainly not a dask expert (and I'm looking at this relatively quickly), but I'm guessing it has to do with how the dask graph is constructed/how big the dask graph gets. Even if it's operating independently per timestep, unless compute is being run after every timestep, it only (in theory) does it at the end. I've also found that sometimes if you do run compute every timestep, it recomputes old results. I'm not sure why it does that, though.

This is very exciting, though. I wonder how performance scales by number of workers.
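A minimal sketch of the deferred-execution behaviour described above (the per-timestep function is an illustrative stand-in, not tobac code): wrapping each timestep in dask.delayed and computing once at the end builds a single graph that dask schedules together, whereas calling compute inside the loop evaluates each task eagerly.

```python
import dask

@dask.delayed
def process_timestep(t):
    # Stand-in for per-timestep feature detection.
    return t * 2

# Building the task list runs nothing yet; the graph is only evaluated
# when compute() is called, once, over all tasks together.
tasks = [process_timestep(t) for t in range(4)]
results = dask.compute(*tasks)
print(results)  # (0, 2, 4, 6)
```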

@w-k-jones
Member Author

> @w-k-jones I'm certainly not a dask expert (and I'm looking at this relatively quickly), but I'm guessing it has to do with how the dask graph is constructed/how big the dask graph gets. Even if it's operating independently per timestep, unless compute is being run after every timestep, it only (in theory) does it at the end. I've also found that sometimes if you do run compute every timestep, it recomputes old results. I'm not sure why it does that, though.
>
> This is very exciting, though. I wonder how performance scales by number of workers.

Currently I have it running compute immediately at every timestep for the labelling, to avoid having to change any of the surrounding code, and given the array isn't chunked it will likely run slower than the scipy/scikit-image labelling just due to overhead.

From a quick inspection, dask-image should have substitutes for most of the image functions we need for feature detection. I'll see if I can get lazy execution of the per-timestep feature detection working in a simple manner, and see whether running with an actual dask client and changing the number of workers affects things.

@w-k-jones
Member Author

I think the best way forward here is to keep both a "small data" approach based on the current method, for performance on smaller problems, and a "big data" alternative using dask-image that is used when a dask array is passed to feature detection.
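That dispatch could look something like the following sketch (hypothetical function name; the dask path is indicated but not exercised here):

```python
import numpy as np
from scipy import ndimage

def label_features(mask):
    """Hypothetical dispatch: route dask arrays to the lazy dask-image
    path, and everything else to the eager in-memory scipy path."""
    try:
        import dask.array as da

        if isinstance(mask, da.Array):
            from dask_image import ndmeasure

            return ndmeasure.label(mask)  # "big data" path, lazy
    except ImportError:
        pass
    return ndimage.label(mask)  # "small data" path, eager

# In-memory input takes the scipy path: three separate diagonal features
# with the default cross-shaped connectivity.
labels, n = label_features(np.eye(3, dtype=bool))
print(n)  # 3
```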

