
Pre-trained feature extraction #24

Closed

C-Achard wants to merge 38 commits into weigertlab:main from C-Achard:cy/embed-extract

Conversation

C-Achard (Contributor) commented Feb 24, 2025

Allows using pre-trained backbones for feature generation.

Backbones:

  • SAM
  • DINOv2
  • Hiera
  • SAM2
  • micro-SAM
  • TAP

Embedding selection modes:

  • Nearest
  • Mean of patches (current implementation is a bit slow)
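
The two selection modes above can be sketched as follows. This is a minimal illustration with hypothetical names (`select_embeddings`, the `patch` parameter), not the PR's actual implementation: given a backbone feature map, "nearest" picks the single feature cell closest to each object centroid, while "mean of patches" averages a small window around it.

```python
import numpy as np

def select_embeddings(feat_map, coords, mode="nearest", patch=3):
    """Pick a per-object embedding from an (H, W, C) backbone feature map.

    feat_map : (H, W, C) features from a pretrained backbone (e.g. SAM, DINOv2)
    coords   : (N, 2) object centroids in feature-map pixel coordinates
    mode     : "nearest" takes the single closest cell; "mean" averages a
               patch x patch window around each centroid (slower but smoother)
    """
    H, W, C = feat_map.shape
    out = np.empty((len(coords), C), dtype=feat_map.dtype)
    for i, (y, x) in enumerate(np.round(coords).astype(int)):
        # Clamp centroids that fall outside the feature map.
        y, x = np.clip(y, 0, H - 1), np.clip(x, 0, W - 1)
        if mode == "nearest":
            out[i] = feat_map[y, x]
        else:  # mean of patches
            r = patch // 2
            win = feat_map[max(0, y - r): y + r + 1, max(0, x - r): x + r + 1]
            out[i] = win.reshape(-1, C).mean(axis=0)
    return out
```

The per-object Python loop is one reason a naive mean-of-patches pass is slow; a vectorized gather over all centroids would be the natural optimization.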

  • Data augmentation - Done

  • Add latest commits from main
  • Clean-up of unnecessary loose files and debug utilities

C-Achard and others added 8 commits March 21, 2025 17:46
* (WIP) Refactor feat_dim in CTCData + add pretrained_feats mode to train.py

* (WIP) Training with pretrained features

* Update data.py

* Fix training on several folders

* Fix SAM2 input range

* Add micro SAM

* Skip SAM2 under-the-hood preprocessing

* Fix dataset pickling issue with pretrained_config

* Minor fixes and tweaks for SAM2 features

* Small train fixes

* Revert reassign ops to inplace, enable per param clipping

* Update train.py

* (WIP) Update API for pred to use pretrained_feats

* Update pretrained_features.py

* Add input proj dropout

* Remove per param grad clipping, add input lin dropout

* Update train.py

* Add weight decay param

* Add regionprops features to WRPretrainedFeats

* Fix incorrect dims of WRFeat features

* Revert "Fix incorrect dims of WRFeat features"

This reverts commit a597bf4.

* Fix feat dim mismatch when using additional feats in WRPretrained Feats

* Add NaN guards earlier in training

* Update train.py

* Update train.py

* Continue debugging empty feats

* (WIP) Debug NaN loss

* Disable autocast for proj layer due to NaNs

* EXPERIMENTAL Self-attention in input

* Revert "EXPERIMENTAL Self-attention in input"

This reverts commit 05b8262.

* EXPERIMENTAL Additional linear for input features

* EXPERIMENTAL Fix forward and add input layernorm

* EXPERIMENTAL Normalize features at input

* Add param for additional layer

* Fix WRFeat dimension

* Several checks/fixes

* Remove dim check

* Remove input normalization, add batch norm

* Fix norm dim

* Disable batch norm affine

* Add max patches mode

* Update pretrained_features.py

* Add from_config and additional features from inference

* Update train.py

* Update model.py

* Add random features

* EXPERIMENTAL PCA preprocessor

* Fix max frames in PCA

* Add wrfeat demo

* Revert "Add wrfeat demo"

This reverts commit 84754ba.

* Update wrfeat.py

* Merge pull request weigertlab#27 from anwai98/patch-1

Remove verbosity argument in LRScheduler

* API for augmented SAM2 features (#3)

* WIP - API for augmented SAM2 features

* Disable PCA for now

* Fix for aug pretrained features

* WIP CTCDataAugPretrainedFeats

* Functional augmented CTCDataPretrained

* Update train.py

* Fix saving pre-trained models to pkl

* Add sampling from disk + fix val being augmented

* Fix h5 loading errors

* Try to fix wrong CTCData class being used after caching

* Fix incorrect dataset_kwargs use

* Fix dataset kwargs issue

* Update augs

* Update pretrained_features.py

* Test no aug

* Update pretrained_features.py

* Add online coords augmentation

* Fix key error

* Update pretrained_features.py

* Fix key dtype mismatch

* Fix empty frame handling

* Fix cropping missing A

* Update data.py

* Fix aug data pipeline

* Update data.py

* Fix A cropping

* Fix cropping return dtype

* Update data.py

* EXP Disable pos

* Update model.py

* Update model.py

* Stronger augmentations

* Update pretrained_features.py

* Add use coords toggle in model

* Augmentation fixes

* Add use_coords config+arg

* Replace WRAugContainer with WRFeat subclass

* Update lint Github action

* Remove numerize dep

* Fix training for CPU debugging

* Update models metadata and Readme

* Update model URLs

* Remove numpy 2 pin

* Fix model loading API

* Humanize dep

* WIP Regionprops features w/ offline aug pretrained feats

* use_coords=False uses time information

* Fix additional feats arg passing

* Fix additional feats exp

* Many args and training improvements/fixes

* Fix get_features

* New regionprops

* Fixes for AugPretrainedFeats loading

* TAP Features (#4)

* WIP TAPFeatures

* WIP TAP Features

* Fixes for TAP feats

* Functional TAP feats training

* Add CellposeSAM

* Fix save path recursion bug

* Update pretrained_features.py

* SAM2 high-res features

* try object-level additional encoding

* EXP Feature rotation based on aug coords

* Fixes for feature rotation transform

* Update pretrained_features.py

* Feats dimensions and rotation fixes

* Add new aug level

* Debug model w/ encoded labels

* Additional LayerNorm for features

* Add RandomScale augment

* Fix new LayerNorm dim

* Update deps + remove cropping debug message

* Updated pretrained features API (#6)

* WIP Update model API for easier handling of pretrained feats

* WIP Update pretrained_feats in model

* WIP Fix dims

* Final fixes for new API

* Disable skip crop + fix issue in wrong pretrained feats shape loading

* Fix mean patches

* Debug aug feats

* Fix wrong timepoint shift in aug pretrained feats

* Remove expand_dims args

* Update data.py

* Update train.py

* Disable intensity values aug for now

* Update normalization for pretrained augs

To re-introduce intensity values augmentations

* Update pretrained_features.py

* Fix feature mismatch btw val and train for mean patches

* Update pretrained_features.py

* Disable early stop + cfg + pt_reduced dim in cfg + fix disable xy coords + disable intens. augs

* Update pretrained augs caching and augs API

* Fix wrong arg name in model cfg

* More dims for feat rotation + debug mean patches

Also fix caching issues when running w/ and w/o regionprops between runs

* Add CoTracker

* Change default augs

* Fix CoTracker input size change for augs

* Fix prediction for latest pretrained feats API

* EXP test norm feat

* Prediction fixes

* Update pretrained_features.py

* Add explicit masks ref in agg_patches_exact

* Fix misaligned timepoint use in mean_patches

* Update feature normalization

* GaussianBlur + pretrained feats augmentations tweaks

* Norm after mean

* Revert previous; norm before mean

* Update from main (#5)

* Extend installation instructions

* Add docstrings;cursorrules

* Add docstrings;cursorrules

* Add docstrings;cursorrules

* merged

* Fixes + __init__ for pretrained features

* Add missing border_dist_fast

---------

Co-authored-by: Benjamin Gallusser <[email protected]>
Co-authored-by: Martin Weigert <[email protected]>

* Improve problematic h5 group deletion

* Clean up feature extractor

* Remove deprecated PCA args + best config

* Inference fixes

* Add median aggregation

* Use small epsilon in empty features agg to avoid NaNs downstream

* Add support for dask arrays

* Add several configs

* Add ckpt file path option

* h5 swmr + cfgs

* Add missing median mode

* Fix train args pt mode

* WIP No h5 offline augmentation

* Configs

* Load aug pt feats from RAM by default

* Add image shape record for aug feats

* Fix mistake in image shape use in rotate feats

* Fix GaussianBlur record

* Add parallel aug computation

* Disable norm + small tweaks

* Fix h5 image shape loading

* Improve feature extraction for parallel augs + fix Cotracker

* Fix typo in precompute_image_embeddings

* Handle missing objects

* Setting for seed in Random features

* Fix None seed error

* Handle occasional missing labels due to augs

* Update deepcell cfg

* Zarr caching for augmentations (#7)

* Change data caching to .zarr and improve parallelization

* Make image percentile norm consistent + update missing label msg

* Fix CellposeSAM norm step

* Fix feature mismatch

* Fix load from disk

* Disable parallel aug for CoTracker and TAP for now

* microSAM fixes + zarr install

* Update pretrained_features.py

---------

Co-authored-by: Benjamin Gallusser <[email protected]>
Co-authored-by: Martin Weigert <[email protected]>

@C-Achard C-Achard marked this pull request as ready for review August 6, 2025 09:20
@C-Achard C-Achard changed the base branch from update-train-mw to main August 6, 2025 09:20
@C-Achard C-Achard marked this pull request as draft August 6, 2025 09:21
C-Achard (Contributor, Author) commented:

See #39

@C-Achard C-Achard closed this Sep 23, 2025
