Backport PR #3164 on branch 1.3.x (feat: SemiSupervised Training Mixin class) (#3222)

Backport PR #3164: feat: SemiSupervised Training Mixin class

Co-authored-by: Ori Kronfeld <[email protected]>
scverse · Feb 27, 2025 · 0142c4a
1 parent d690af1
Showing 14 changed files with 558 additions and 220 deletions.
diff --git a/.github/workflows/test_linux_autotune.yml b/.github/workflows/test_linux_autotune.yml
@@ -0,0 +1,70 @@
+name: test (Autotune)
+
+on:
+  push:
+    branches: [main, "[0-9]+.[0-9]+.x"] #this is new
+  pull_request:
+    branches: [main, "[0-9]+.[0-9]+.x"]
+    types: [labeled, synchronize, opened]
+  workflow_dispatch:
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+
+jobs:
+  test:
+    # if PR has label "autotune" or "all tests" or if scheduled or manually triggered
+    if: >-
+      (
+        contains(github.event.pull_request.labels.*.name, 'autotune') ||
+        contains(github.event.pull_request.labels.*.name, 'all tests') ||
+        contains(github.event_name, 'schedule') ||
+        contains(github.event_name, 'workflow_dispatch')
+      )
+
+    runs-on: [self-hosted, Linux, X64, CUDA]
+
+    defaults:
+      run:
+        shell: bash -e {0} # -e to fail on error
+
+    container:
+      image: ghcr.io/scverse/scvi-tools:py3.12-cu12-base
+      options: --user root --gpus all --pull always
+
+    name: integration
+
+    env:
+      OS: ${{ matrix.os }}
+      PYTHON: ${{ matrix.python }}
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python }}
+          cache: "pip"
+          cache-dependency-path: "**/pyproject.toml"
+
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip wheel uv
+          python -m uv pip install --system "scvi-tools[tests] @ ."
+          python -m pip install "jax[cuda]==0.4.35"
+          python -m pip install nvidia-nccl-cu12
+
+      - name: Run pytest
+        env:
+          MPLBACKEND: agg
+          PLATFORM: ${{ matrix.os }}
+          DISPLAY: :42
+          COLUMNS: 120
+        run: |
+          coverage run -m pytest -v --color=yes --autotune-tests --accelerator cuda --devices auto
+          coverage report
+
+      - uses: codecov/codecov-action@v4
+        with:
+          token: ${{ secrets.CODECOV_TOKEN }}
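
A note on the job-level `if:` above: it passes only for pull requests labeled `autotune` or `all tests`, or for `schedule` and `workflow_dispatch` events, so a plain `push` does not satisfy it even though `push` is listed under `on:`. The Python sketch below mirrors that condition for illustration only. `should_run_autotune` is a hypothetical helper, not part of the workflow or of scvi-tools, and it approximates GitHub's case-insensitive `contains()` with a lowercase membership or substring check.

```python
def should_run_autotune(event_name: str, pr_labels: list[str]) -> bool:
    """Illustrative mirror of the workflow's job-level `if:` expression."""
    # contains(<array>, item) is an element check; GitHub ignores case.
    labels = {label.lower() for label in pr_labels}
    has_label = "autotune" in labels or "all tests" in labels

    # contains(<string>, item) is a substring check on the event name.
    event = event_name.lower()
    triggered = "schedule" in event or "workflow_dispatch" in event

    return has_label or triggered


if __name__ == "__main__":
    print(should_run_autotune("pull_request", ["autotune"]))
    print(should_run_autotune("push", []))
```

Under this reading, the job would run for a labeled pull request or a manual dispatch and be skipped for an unlabeled push.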
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -6,7 +6,7 @@ to [Semantic Versioning]. Full commit history is available in the
 
 ## Version 1.3
 
-### 1.3.0 (2025-02-XX)
+### 1.3.0 (2025-02-28)
 
 #### Added
 
@@ -18,15 +18,15 @@ to [Semantic Versioning]. Full commit history is available in the
 - Add an exception callback to {class}`scvi.train._callbacks.SaveCheckpoint` in order to save
     optimal model during training, in case of failure because of NaNs in gradients. {pr}`3159`.
 - Add {meth}`~scvi.model.SCVI.get_normalized_expression` for models: {class}`~scvi.model.PEAKVI`,
-    {class}`~scvi.external.PoissonVI`, {class}`~scvi.model.CondSCVI`, {class}`~scvi.model.AutoZI`,
-    {class}`~scvi.external.CellAssign` and {class}`~scvi.external.GimVI`. {pr}`3121`.
+    {class}`~scvi.external.POISSONVI`, {class}`~scvi.model.CondSCVI`, {class}`~scvi.model.AUTOZI`,
+    {class}`~scvi.external.CellAssign` and {class}`~scvi.external.GIMVI`. {pr}`3121`.
 - Add {class}`scvi.external.RESOLVI` for bias correction in single-cell resolved spatial
     transcriptomics {pr}`3144`.
+- Add semisupervised training mixin class {class}`scvi.model.base.SemisupervisedTrainingMixin` {pr}`3164`.
 - Add scib-metrics support for {class}`scvi.autotune.AutotuneExperiment` and
     {class}`scvi.train._callbacks.ScibCallback` for autotune for scib metrics {pr}`3168`.
 - Add support for dask arrays in AnnTorchDataset. {pr}`3193`.
-- Add a [use cases](%22https://docs.scvi-tools.org/en/latest/user_guide/index.html#common-use-cases%22)
-    section in the docs, {pr}`3200`.
+- Add a {doc}`/user_guide/use_case` section in the docs, {pr}`3200`.
 - Add {class}`scvi.external.SysVI` for cycle consistency loss and VampPrior {pr}`3195`.
 
 #### Fixed

diff --git a/docs/tutorials/index.md b/docs/tutorials/index.md
@@ -19,7 +19,7 @@ index_scbs
 index_multimodal
 index_spatial
 index_hub
-index_tuning
+index_use_cases
 index_dev
 ```
 

diff --git a/docs/tutorials/index_hub.md b/docs/tutorials/index_hub.md
@@ -6,5 +6,4 @@
 notebooks/hub/cellxgene_census_model
 notebooks/hub/scvi_hub_intro_and_download
 notebooks/hub/scvi_hub_upload_and_large_files
-notebooks/hub/minification
 ```
diff --git a/docs/tutorials/index_tuning.md b/docs/tutorials/index_tuning.md
diff --git a/docs/tutorials/index_use_cases.md b/docs/tutorials/index_use_cases.md
@@ -0,0 +1,9 @@
+# Common Modelling Use Cases
+
+```{toctree}
+:maxdepth: 1
+
+notebooks/use_cases/autotune_scvi
+notebooks/use_cases/minification
+notebooks/use_cases/interpretability
+```
diff --git a/docs/user_guide/use_case/downstream_analysis_tasks.md b/docs/user_guide/use_case/downstream_analysis_tasks.md
@@ -22,6 +22,7 @@ You can compare the expression of genes between clusters to identify which genes
 differential_expression = scvi.model.SCVI().differential_expression()
 ```
 Log-fold Change (LFC) and p-values are typically used to assess which genes have significant expression differences between groups.
+Refer to [SCVI-Hub](https://huggingface.co/scvi-tools) for use cases of DE.
 3. Cell Type Identification
 Mapping to Known Labels: After training a model with SCVI, you can use the latent space to assign cells to known or predicted cell types. You can compare how well SCVI clusters cells by their latent representations and match them to known biological annotations.
 If you have labeled data (e.g., cell types), you can assess how well the model’s clusters correspond to these labels.
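
As a rough illustration of the log-fold change mentioned in the differential-expression step of the hunk above, the arithmetic can be sketched in plain Python. This is not how scvi-tools computes its statistics (the model works from its own estimates of expression rather than raw group means). `log2_fold_change` and its `pseudocount` parameter are hypothetical names chosen for this example.

```python
import math


def log2_fold_change(group_a, group_b, pseudocount=1e-8):
    """Log2 ratio of mean expression in group_a versus group_b.

    The pseudocount (an assumption of this sketch) avoids division by zero
    when a gene is not expressed in one of the groups.
    """
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    return math.log2((mean_a + pseudocount) / (mean_b + pseudocount))


if __name__ == "__main__":
    # A gene with four times higher mean expression in the first group.
    print(log2_fold_change([4.0, 4.0], [1.0, 1.0], pseudocount=0.0))
```

A positive value means higher expression in the first group, and a negative value means higher expression in the second.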

diff --git a/pyproject.toml b/pyproject.toml
@@ -55,7 +55,6 @@ dependencies = [
     "torchmetrics>=0.11.0",
     "tqdm",
     "xarray>=2023.2.0",
-    "dask",
 ]
 
 [project.optional-dependencies]
@@ -98,10 +97,12 @@ scanpy = ["scanpy>=1.10", "scikit-misc"]
 file_sharing = ["pooch"]
 # for parallelization engine
 parallel = ["dask[array]>=2023.5.1,<2024.8.0"]
+# for supervised models interpretability
+interpretability = ["captum","shap"]
 
 
 optional = [
-    "scvi-tools[autotune,aws,hub,file_sharing,regseq,scanpy,parallel]"
+    "scvi-tools[autotune,aws,hub,file_sharing,regseq,scanpy,parallel,interpretability]"
 ]
 tutorials = [
     "cell2location",
@@ -135,6 +136,7 @@ markers = [
     "optional: mark optional tests, usually take more time",
     "private: mark tests that uses private keys, like HF",
     "multigpu: mark tests that are used to check multi GPU performance",
+    "autotune: mark tests that are used to check ray autotune capabilities",
 ]
 
 [tool.ruff]
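
The new `autotune` marker pairs with the `--autotune-tests` flag that the workflow in this commit passes to pytest. One common way to connect a marker to a command-line flag is a pair of hooks in `conftest.py`. The sketch below is an assumption about how that could look, not the repository's actual `conftest.py`, and `needs_skip` is a hypothetical helper introduced for illustration.

```python
# conftest.py (illustrative sketch, not the scvi-tools implementation)


def needs_skip(keywords, autotune_enabled: bool) -> bool:
    """Return True when an autotune-marked test should be skipped."""
    return "autotune" in keywords and not autotune_enabled


def pytest_addoption(parser):
    # Register the flag so that `pytest --autotune-tests` is accepted.
    parser.addoption(
        "--autotune-tests",
        action="store_true",
        default=False,
        help="Run tests marked with 'autotune'.",
    )


def pytest_collection_modifyitems(config, items):
    # Imported here so that needs_skip stays usable without pytest installed.
    import pytest

    enabled = config.getoption("--autotune-tests")
    skip = pytest.mark.skip(reason="need --autotune-tests option to run")
    for item in items:
        if needs_skip(item.keywords, enabled):
            item.add_marker(skip)
```

With hooks like these, a test decorated with `@pytest.mark.autotune` would be collected in every run but executed only when pytest is invoked with `--autotune-tests`.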