Add ability to detect "families" by freemansw1 · Pull Request #551 · tobac-project/tobac

freemansw1 · 2026-01-19T23:33:56Z

I've talked about this before; this is (finally) the code for family detection (family tracking still to come). This code links features together in space.
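
As a rough illustration of what linking features in space can mean, here is a minimal sketch. It is not the code in this PR: `link_features_in_space` is a hypothetical helper, and it assumes a 2D segmentation mask in which 0 is background and each feature's area is itself connected. It uses `scipy.ndimage.label` to find connected regions and maps each feature ID to the region it falls in.

```python
import numpy as np
from scipy import ndimage


def link_features_in_space(segmentation_mask):
    """Group feature IDs whose segmented areas touch (hypothetical helper)."""
    # Label connected regions of all segmented (non-zero) points.
    # The default structure only joins points that share an edge.
    family_grid, _ = ndimage.label(segmentation_mask > 0)
    feature_to_family = {}
    for feature_id in np.unique(segmentation_mask):
        if feature_id == 0:
            continue  # 0 is assumed to be background
        # Assumes the feature's area is connected, so its first point
        # is enough to identify the region it belongs to.
        region = family_grid[segmentation_mask == feature_id][0]
        feature_to_family[int(feature_id)] = int(region)
    return family_grid, feature_to_family


seg = np.array(
    [
        [1, 1, 2, 0, 0],
        [0, 0, 2, 0, 3],
        [0, 0, 0, 0, 3],
    ]
)
family_grid, feature_to_family = link_features_in_space(seg)
```

Here features 1 and 2 touch and end up in the same family, while feature 3 forms its own.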

For the reviewers (I'm leaning toward two reviewers given that this is a new concept for tobac), please check the following:

Documentation makes sense
Example works and makes sense
This works for your data

Note: this is targeted at a new branch, RC_v1.7.0.

@w-k-jones and/or @JuliaKukulies, are you up for reviewing? I know this is for v1.7.0, but I'd like to keep this moving so I can implement family tracking soon.

codecov · 2026-01-19T23:36:25Z

Codecov Report

❌ Patch coverage is 90.86957% with 21 lines in your changes missing coverage. Please review.
✅ Project coverage is 65.75%. Comparing base (fcb7cd3) to head (9f12f17).

Files with missing lines	Patch %	Lines
tobac/merge_split/families/feature_family_id.py	88.88%	10 Missing ⚠️
tobac/utils/internal/label_functions.py	93.33%	6 Missing ⚠️
tobac/utils/datetime.py	75.00%	3 Missing ⚠️
tobac/utils/general.py	91.30%	2 Missing ⚠️

Additional details and impacted files

@@              Coverage Diff              @@
##           RC_v1.7.0     #551      +/-   ##
=============================================
+ Coverage      64.84%   65.75%   +0.91%     
=============================================
  Files             27       31       +4     
  Lines           3985     4135     +150     
=============================================
+ Hits            2584     2719     +135     
- Misses          1401     1416      +15

Flag	Coverage Δ
unittests	`65.75% <90.86%> (+0.91%)`	⬆️

github-actions · 2026-01-19T23:36:29Z

Linting results by Pylint:

Your code has been rated at 8.36/10 (previous run: 8.36/10, +0.00)
The linting score is an indicator that reflects how well your code version follows Pylint’s coding standards and quality metrics with respect to the RC_v1.7.0 branch.

A decrease usually indicates your new code does not fully meet style guidelines or has potential errors.

freemansw1 · 2026-01-27T04:50:47Z

Blocked by #553. Also, I need to add a test and handling for the case where there are no families.
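
A minimal sketch of what such a test could look like, under stated assumptions: `assign_families` is a hypothetical stand-in for the PR's function, and using 0 as the "no family" value is an assumption rather than the PR's confirmed convention.

```python
import pandas as pd

NO_FAMILY = 0  # assumed sentinel; the PR may use a different value


def assign_families(features, feature_to_family):
    """Add a family column, tolerating an empty mapping (hypothetical helper)."""
    out = features.copy()
    # Features missing from the mapping get the sentinel instead of raising.
    out["feature_family_id"] = [
        feature_to_family.get(feature_id, NO_FAMILY) for feature_id in out["feature"]
    ]
    return out


def test_no_families():
    features = pd.DataFrame({"feature": [1, 2, 3]})
    result = assign_families(features, {})
    # All rows are kept and none is assigned to a family.
    assert len(result) == len(features)
    assert (result["feature_family_id"] == NO_FAMILY).all()
```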

w-k-jones · 2026-01-27T10:21:59Z

@freemansw1 I'll be happy to review this after #554 is merged

w-k-jones · 2026-02-11T12:44:41Z

Some overall thoughts before I do an in depth review:

First off, really nice addition! On your particular points: the documentation is nice, but there is nothing in the user guide section (the same is true for merge/split, actually), and the docstring for identify_feature_families_from_data seems to be copied from identify_feature_families_from_segmentation rather than describing what the function actually does. The example is nice, and I'll have a go at running it with other data.

General thoughts:

We should decide on a fixed term for a collection of features at the same time step. I have used "cluster" before but am happy to switch to "family" as that avoids confusion with other clustering methods. We should change the feature_family_id column to be named family to keep the same pattern with feature, cell, track etc.
Does this belong as part of merge/split, or its own module? I'm undecided, but leaning towards merge/split being for combining tracks over time, whereas this is for combining features on individual time steps.
I recommend moving identify_feature_families from utils.general to the same module as the other family functions to avoid unnecessary coupling (unless identify_feature_families is used in other modules)
It would be nice to add a function to merge/split cells based on detected families (maybe in a future PR)
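
To illustrate the naming suggestion above, a small sketch with made-up data: renaming `feature_family_id` to `family` lets the column be used the same way as `feature` or `cell`. The DataFrame contents are purely illustrative.

```python
import pandas as pd

# Illustrative feature table; the values are invented for this example.
features = pd.DataFrame(
    {"feature": [1, 2, 3], "feature_family_id": [1, 1, 2]}
)

# Rename the column to follow the feature / cell / track naming pattern.
renamed = features.rename(columns={"feature_family_id": "family"})

# Group features by family, as one would group them by cell.
members = renamed.groupby("family")["feature"].apply(list).to_dict()
```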

I also noticed that the PBC labelling currently only uses connectivity=1 for connecting labels across borders, but I can fix that as part of the feature detection refactoring
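
To show why the connectivity setting matters at a periodic border, here is a minimal sketch. It is not tobac's PBC implementation: `linked_across_x_border` is a hypothetical helper that emulates wrapping in x by padding one halo column on each side, and it is hard-coded to the two points in this example.

```python
import numpy as np
from scipy import ndimage


def linked_across_x_border(mask, connectivity):
    """Check whether mask[0, 0] and mask[1, -1] join across the x border."""
    # One halo column per side; padded[:, 0] is a copy of the last column.
    padded = np.pad(mask, ((0, 0), (1, 1)), mode="wrap")
    structure = ndimage.generate_binary_structure(2, connectivity)
    labels, _ = ndimage.label(padded, structure=structure)
    # mask[0, 0] sits at padded[0, 1]; the halo copy of mask[1, -1] at padded[1, 0].
    return bool(labels[0, 1] != 0 and labels[0, 1] == labels[1, 0])


mask = np.zeros((3, 4), dtype=int)
mask[0, 0] = 1  # left edge
mask[1, 3] = 1  # right edge, one row down: a diagonal neighbour after wrapping

edge_only = linked_across_x_border(mask, connectivity=1)
with_diagonals = linked_across_x_border(mask, connectivity=2)
```

With connectivity=1 the two points stay separate because they only touch diagonally across the border; with connectivity=2 they are joined.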

freemansw1 · 2026-03-03T03:31:43Z

bug to investigate/test: grid output increments rather than carrying through 0 values in label with multiple times

freemansw1 · 2026-03-11T03:05:14Z

> bug to investigate/test: grid output increments rather than carrying through 0 values in label with multiple times

This has been resolved with the latest commit.
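
For readers following along, one plausible reading of this bug is sketched below. This is an interpretation, not the PR's code: when per-time-step labels are made unique across time by adding an offset, the offset must only be applied to non-zero points so that the 0 background carries through. `make_labels_unique_over_time` is a hypothetical helper.

```python
import numpy as np


def make_labels_unique_over_time(labels_per_time):
    """Offset labels per time step while keeping 0 as background (hypothetical)."""
    out = np.zeros_like(labels_per_time)
    offset = 0
    for t, labels in enumerate(labels_per_time):
        nonzero = labels > 0
        # Only shift labelled points; adding the offset everywhere would
        # turn the 0 background into a spurious label.
        out[t][nonzero] = labels[nonzero] + offset
        offset += int(labels.max())
    return out


# Two time steps, each labelled independently starting from 1.
labels = np.array(
    [
        [[1, 0], [0, 2]],
        [[0, 1], [0, 0]],
    ]
)
unique_labels = make_labels_unique_over_time(labels)
```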

w-k-jones · 2026-03-27T16:38:18Z