NF: Visualize the latent space#245

Open
levje wants to merge 32 commits into scil-vital:master from levje:levje/viz-latent-space
Conversation

@levje
Collaborator

@levje levje commented Sep 27, 2024

Description

Following #220, it was requested that we be able to visualize the latent space, in line with FINTA from Legarreta et al. (2021). As in the original paper, we project the latent space coming out of the auto-encoder into 2D using t-SNE, which maps similar streamlines to nearby points and dissimilar streamlines to distant ones.
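The projection step above can be sketched as follows. This is an illustrative example only (the array `latents` and its dimensions are made up, not the actual dwi_ml code):

```python
# Hypothetical sketch: projecting auto-encoder latent vectors to 2D with t-SNE.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(42)
latents = rng.normal(size=(100, 32))  # e.g. 100 streamlines, 32-d latent space

# t-SNE preserves local structure: similar latents end up close in 2D.
tsne = TSNE(n_components=2, perplexity=20, random_state=42)
points_2d = tsne.fit_transform(latents)  # shape (100, 2), ready to scatter-plot
```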

The class BundlesLatentSpaceVisualizer in latent_streamlines.py is the bulk of the changes and was designed to be reusable for other data that needs to be projected and plotted in 2D. Each time we reach an epoch where the loss is the lowest encountered so far, we plot the latent space of that new epoch.
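The reusable accumulate-then-plot pattern described above can be sketched like this. The class and method names here are hypothetical, not the actual BundlesLatentSpaceVisualizer interface:

```python
# Minimal sketch of a per-label latent-vector accumulator. Gathered arrays
# can then be jointly projected (e.g. t-SNE) and scatter-plotted per label.
import numpy as np

class LatentSpaceAccumulator:
    def __init__(self):
        self._data = {}

    def add(self, label, latent_batch):
        # Batches arrive per bundle/label over the epoch; stack them lazily.
        self._data.setdefault(label, []).append(np.asarray(latent_batch))

    def gather(self):
        # One (n_i, d) array per label, ready for a joint 2D projection.
        return {k: np.concatenate(v, axis=0) for k, v in self._data.items()}

acc = LatentSpaceAccumulator()
acc.add("bundle_0", np.zeros((10, 32)))
acc.add("bundle_0", np.zeros((5, 32)))
acc.add("bundle_1", np.ones((7, 32)))
gathered = acc.gather()
```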

(In my opinion, a future PR adding hooks throughout the trainer/models, in a similar fashion to Lightning or PyTorch's nn.Module, would add more flexibility to the library!)
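The hook idea above could look something like the following. This is a hedged sketch of the general callback pattern (in the spirit of PyTorch's `register_forward_hook` or Lightning callbacks); the Trainer API shown is entirely hypothetical:

```python
# Illustrative hook mechanism: external code subscribes to trainer events
# without the trainer knowing anything about visualization.
class Trainer:
    def __init__(self):
        self._epoch_end_hooks = []

    def register_epoch_end_hook(self, fn):
        self._epoch_end_hooks.append(fn)

    def _run_epoch(self, epoch):
        loss = 1.0 / (epoch + 1)  # placeholder for a real training step
        for hook in self._epoch_end_hooks:
            hook(epoch, loss)
        return loss

seen = []
trainer = Trainer()
trainer.register_epoch_end_hook(lambda e, l: seen.append((e, l)))
for epoch in range(3):
    trainer._run_epoch(epoch)
```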

Scripts:

  • ae_train_model.py : Same script as added by [NF] Auto-encoders - streamlines - FINTA #220, with an additional part that automatically plots/saves the figures at each epoch interval given by the new argument --viz_latent_space .

Testing data and script

ae_train_model.py \
    $experiments \
    $experiment_name \
    $o_hdf5 \
    target \
    -v INFO \
    --batch_size_training 1200 \
    --batch_size_units nb_streamlines \
    --nb_subjects_per_batch 5 \
    --learning_rate 0.001 \
    --weight_decay 0.13 \
    --optimizer Adam \
    --max_epochs 1000 \
    --max_batches_per_epoch_training 20 \
    --comet_workspace <comet_workpace> \
    --comet_project dwi_ml-ae-fibercup \
    --patience 100 \
    --viz_latent_space \
    --color_by 'dps_bundle_index' \
    --bundles_mapping <file with a mapping to bundles>

Have you

  • Added a description of the content of this PR above
  • Followed proper commit message formatting
  • Added data and a script to test this PR
  • Made sure that PEP8 issues are resolved
  • Tested the code yourself right before pushing
  • Added unit tests to test the code you implemented

People this PR concerns

@arnaudbore @AntoineTheb

@levje levje added the enhancement New feature or request label Sep 27, 2024
@levje levje self-assigned this Sep 27, 2024
@pep8speaks

pep8speaks commented Sep 27, 2024

Hello @levje, Thank you for updating !

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2024-11-08 16:45:44 UTC

@levje levje changed the title [WIP] Visualize the latent space from the auto encoder Visualize the latent space from the auto encoder Oct 2, 2024
@levje
Collaborator Author

levje commented Oct 2, 2024

@EmmaRenauld if you have an idea on how I could implement commit f0973ff differently, please let me know. It's a little "à bric-à-brac".

@levje levje changed the title Visualize the latent space from the auto encoder NF: Visualize the latent space Oct 6, 2024
@levje levje changed the title NF: Visualize the latent space NF: Visualize the latent space and pack data_per_streamline in the batch loader. Oct 6, 2024
@levje
Collaborator Author

levje commented Oct 6, 2024

After a few rounds of cleanup, and after our discussion this Thursday about the utility of visualizing the evolution of the latent space, I stripped the code to only plot the latent space of the best epoch. So, each time we get a new best epoch, its latent space is plotted and saved. This simplifies the code considerably and makes it more reusable.

To plot when a new best epoch is found, I figured it would be a lot cleaner to have a function that can be called within the BestEpochMonitor, which is the newest of the modifications we talked about.
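The "plot only on a new best epoch" logic amounts to a monitor that fires a callback when the loss improves. BestEpochMonitor exists in dwi_ml, but the signature and callback wiring below are illustrative only:

```python
# Sketch: track the best loss so far; invoke a callback on each new best epoch
# (in this PR, that callback would plot and save the latent space).
class BestEpochMonitor:
    def __init__(self, on_new_best=None):
        self.best_loss = float("inf")
        self.best_epoch = None
        self.on_new_best = on_new_best

    def update(self, loss, epoch):
        if loss < self.best_loss:
            self.best_loss = loss
            self.best_epoch = epoch
            if self.on_new_best is not None:
                self.on_new_best(epoch, loss)  # e.g. plot the latent space
            return True
        return False

plotted = []
monitor = BestEpochMonitor(on_new_best=lambda e, l: plotted.append(e))
for epoch, loss in enumerate([0.9, 0.7, 0.8, 0.5]):
    monitor.update(loss, epoch)
```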

It should now also work when no data_per_streamline is specified in the HDF5 file (there will only be one color), and it will also work if you don't specify the bundle_index. @arnaudbore, from what I tested, Fibercup should be working fine now; let me know otherwise.

Finally, just to make it clear: I modified the structure of the HDF5 to have a group '<subject-id>/target/data_per_streamline/bundle_index'. Each entry/dataset within the data_per_streamline group of the HDF5 is loaded into a dictionary as a numpy array and included in the returned sft.
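The HDF5 layout and loading behaviour described above can be sketched as follows. The group paths follow the description; the helper function itself is illustrative, not the actual batch-loader code:

```python
# Build a tiny HDF5 with a '<subject-id>/target/data_per_streamline' group,
# then read every dataset under it into a dict of numpy arrays.
import os
import tempfile

import h5py
import numpy as np

def load_data_per_streamline(hdf5_file, subject_id, group="target"):
    dps = {}
    grp = hdf5_file[f"{subject_id}/{group}"].get("data_per_streamline")
    if grp is not None:  # no dps in the HDF5 -> empty dict (a single color)
        for key, dset in grp.items():
            dps[key] = np.asarray(dset)
    return dps

path = os.path.join(tempfile.mkdtemp(), "demo.hdf5")
with h5py.File(path, "w") as f:
    # h5py creates the intermediate groups automatically.
    f.create_dataset("subj1/target/data_per_streamline/bundle_index",
                     data=np.array([0, 0, 1, 1]))
with h5py.File(path, "r") as f:
    dps = load_data_per_streamline(f, "subj1")
```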

Collaborator

@arnaudbore arnaudbore left a comment

Not a fan of having the color class in the latent space one, also looking at the output I feel like the colors could also be better chosen. Apart from that LGTM !

Comment thread scripts_python/ae_train_model.py Outdated
Comment thread scripts_python/ae_train_model.py Outdated
@levje levje changed the title NF: Visualize the latent space and pack data_per_streamline in the batch loader. NF: Visualize the latent space Dec 12, 2024
@stale
Copy link
Copy Markdown

stale Bot commented Apr 26, 2025

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs, but you will still be able to re-open it. Thank you for your contributions.

@stale stale Bot added the wontfix This will not be worked on label Apr 26, 2025
@stale stale Bot closed this Jul 19, 2025
@EmmaRenauld EmmaRenauld reopened this Jul 21, 2025
@stale stale Bot removed the wontfix This will not be worked on label Jul 21, 2025

Labels

enhancement New feature or request

4 participants