This repository was archived by the owner on Mar 21, 2024. It is now read-only.

Commit 6575304

ENH: Publish lung segmentation model (#808)
This PR contains the hyperparameter updates for training and running inference on the InnerEye Lung segmentation model on a Standard_NC24rs_v3 Azure cluster (4 x V100). It also includes all the documentation for the model, which will be released in a tagged version in the same fashion as the [hippocampus model](https://github.com/microsoft/InnerEye-DeepLearning/releases/tag/v0.5). I will tag and release the model after this PR is merged, to ensure that the source code in the main branch is correct at the time of release (the docs currently include a link to v0.8, which doesn't yet exist).
1 parent 12f78f3 commit 6575304

File tree: 5 files changed (+203, -91 lines)


InnerEye/ML/configs/segmentation/Lung.py

Lines changed: 29 additions & 27 deletions
@@ -25,48 +25,50 @@ class Lung(SegmentationModelBase):
     def __init__(self, **kwargs: Any) -> None:
         fg_classes = ["spinalcord", "lung_r", "lung_l", "heart", "esophagus"]
         fg_display_names = ["SpinalCord", "Lung_R", "Lung_L", "Heart", "Esophagus"]
+
+        azure_dataset_id = kwargs.pop("azure_dataset_id", LUNG_AZURE_DATASET_ID)
+
         super().__init__(
+            adam_betas=(0.9, 0.999),
             architecture="UNet3D",
-            feature_channels=[32],
-            kernel_size=3,
-            azure_dataset_id=LUNG_AZURE_DATASET_ID,
-            crop_size=(64, 224, 224),
-            test_crop_size=(128, 512, 512),
-            image_channels=["ct"],
-            ground_truth_ids=fg_classes,
-            ground_truth_ids_display_names=fg_display_names,
+            azure_dataset_id=azure_dataset_id,
+            check_exclusive=False,
+            class_weights=equally_weighted_classes(fg_classes, background_weight=0.02),
             colours=[(255, 255, 255)] * len(fg_classes),
+            crop_size=(64, 224, 224),
+            feature_channels=[32],
             fill_holes=[False] * len(fg_classes),
-            roi_interpreted_types=["ORGAN"] * len(fg_classes),
-            largest_connected_component_foreground_classes=["lung_r", "lung_l", "heart"],
-            num_dataload_workers=2,
-            norm_method=PhotometricNormalizationMethod.CtWindow,
-            level=40,
-            window=400,
-            class_weights=equally_weighted_classes(fg_classes, background_weight=0.02),
-            train_batch_size=8,
+            ground_truth_ids_display_names=fg_display_names,
+            ground_truth_ids=fg_classes,
+            image_channels=["ct"],
             inference_batch_size=1,
             inference_stride_size=(64, 256, 256),
-            num_epochs=140,
+            kernel_size=3,
+            l_rate_polynomial_gamma=0.9,
             l_rate=1e-3,
+            largest_connected_component_foreground_classes=["lung_l", "lung_r", "heart"],
+            level=-500,
+            loss_type=SegmentationLoss.SoftDice,
             min_l_rate=1e-5,
-            l_rate_polynomial_gamma=0.9,
-            optimizer_type=OptimizerType.Adam,
-            opt_eps=1e-4,
-            adam_betas=(0.9, 0.999),
             momentum=0.9,
-            weight_decay=1e-4,
+            monitoring_interval_seconds=0,
+            norm_method=PhotometricNormalizationMethod.CtWindow,
+            num_dataload_workers=2,
+            num_epochs=300,
+            opt_eps=1e-4,
+            optimizer_type=OptimizerType.Adam,
+            roi_interpreted_types=["ORGAN"] * len(fg_classes),
+            test_crop_size=(112, 512, 512),
+            train_batch_size=3,
             use_mixed_precision=True,
             use_model_parallel=True,
-            monitoring_interval_seconds=0,
-            loss_type=SegmentationLoss.SoftDice,
-            check_exclusive=False,
+            weight_decay=1e-4,
+            window=2200,
         )
         self.add_and_validate(kwargs)

     def get_model_train_test_dataset_splits(self, dataset_df: pd.DataFrame) -> DatasetSplits:
-        # The first 24 subject IDs are the designated test subjects in this dataset.
-        test = list(map(str, range(0, 24)))
+        test = list(map(str, range(0, 9)))
         train_val = list(dataset_df[~dataset_df.subject.isin(test)].subject.unique())

         val = list(map(str, numpy.random.choice(train_val, int(len(train_val) * 0.1), replace=False)))
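Beyond the retuned hyperparameters, the functional change above is that `azure_dataset_id` is now popped from `kwargs` (falling back to `LUNG_AZURE_DATASET_ID`) and any remaining keyword arguments are applied through `add_and_validate`. A rough sketch of how the config could then be pointed at a user's own dataset follows; the dataset name and the epoch override are illustrative placeholders, not part of this commit:

```python
# Hypothetical usage sketch, assuming an InnerEye development environment.
# "my_lung_dataset" is a placeholder AzureML dataset name; num_epochs=1 only
# illustrates that leftover kwargs are applied via add_and_validate.
from InnerEye.ML.configs.segmentation.Lung import Lung

config = Lung(azure_dataset_id="my_lung_dataset", num_epochs=1)
print(config.azure_dataset_id)  # "my_lung_dataset" instead of LUNG_AZURE_DATASET_ID
print(config.num_epochs)        # 1 instead of the default 300
```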

docs/source/index.rst

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ InnerEye-DeepLearning Documentation
    md/debugging_and_monitoring.md
    md/model_diagnostics.md
    md/move_model.md
-   md/hippocampus_model.md
+   rst/models

 .. toctree::
    :maxdepth: 1

docs/source/md/hippocampus_model.md

Lines changed: 6 additions & 63 deletions
@@ -1,79 +1,22 @@
-# Trained model for hippocampal segmentation
+# Hippocampus Segmentation Model

 ## Purpose

-This documentation describes how to use our pre-trained model to segment the left and right hippocampi from brain MRI scans. The model was trained on data from the [ADNI](https://adni.loni.usc.edu/) dataset (for more information see the model card below). This data is publicly available via their website, but users must sign a Data Use Agreement in order to gain access. We do not provide access to the data. The following description assumes the user has their own dataset to evaluate/ retrain the model on.
+This documentation describes our pre-trained model for segmentation of the left and right hippocampi from brain MRI scans. The model was trained on data from the [ADNI](https://adni.loni.usc.edu/) dataset (for more information see the model card below). This data is publicly available via their website, but users must sign a Data Use Agreement in order to gain access. We do not provide access to the data. The following description assumes the user has their own dataset to evaluate/ retrain the model on.

 ## Terms of use

 Please note that this model is intended for research purposes only. You are responsible for the performance, the necessary testing, and if needed any regulatory clearance for any of the models produced by this toolbox.

----
+## Download

-## Usage
+The hippocampus segmentation model can be downloaded from [this release](https://github.com/microsoft/InnerEye-DeepLearning/releases/tag/v0.5).

-The following instructions assume you have completed the preceding setup steps in the [InnerEye README](https://github.com/microsoft/InnerEye-DeepLearning/), in particular, [Setting up Azure Machine Learning](setting_up_aml.md).
-
-### Create an Azure ML Dataset
-
-To evaluate this model on your own data, you will first need to register an [Azure ML Dataset](https://docs.microsoft.com/en-us/azure/machine-learning/v1/how-to-create-register-datasets). You can follow the instructions in the for [creating datasets](creating_dataset.md) in order to do this.
-
-## Downloading the model
-
-The saved weights from the trained Hippocampus model can be downloaded along with the source code used to train it from [our GitHub releases page](https://github.com/microsoft/InnerEye-DeepLearning/releases/tag/v0.5).
-
-### Registering a model in Azure ML
-
-To evaluate the model in Azure ML, you must first [register an Azure ML Model](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.model?view=azure-ml-py#remarks). To register the Hippocampus model in your AML Workspace, unpack the source code downloaded in the previous step and follow InnerEye's [instructions to upload models to Azure ML](move_model.md).
-
-Run the following from a folder that contains both the `ENVIRONMENT/` and `MODEL/` folders (these exist inside the downloaded model files):
-
-```shell
-WORKSPACE="fill with your workspace name"
-GROUP="fill with your resource group name"
-SUBSCRIPTION="fill with your subscription ID"
-
-python InnerEye/Scripts/move_model.py \
-  --action upload \
-  --path . \
-  --workspace_name $WORKSPACE \
-  --resource_group $GROUP \
-  --subscription_id $SUBSCRIPTION \
-  --model_id Hippocampus:118
-```
-
-### Evaluating the model
-
-You can evaluate the model either in Azure ML or locally using the downloaded checkpoint files. These 2 scenarios are described in more detail, along with instructions in [testing an existing model](building_models.md#testing-an-existing-model).
-
-For example, to evaluate the model on your Dataset in Azure ML, run the following from within the directory `*/MODEL/final_ensemble_model/`
-
-```shell
-CLUSTER="fill with your cluster name"
-DATASET_ID="fill with your dataset name"
-
-python InnerEye/ML/runner.py \
-  --azure_dataset_id $DATASET_ID \
-  --model Hippocampus \
-  --model_id Hippocampus:111 \
-  --experiment_name evaluate_hippocampus_model \
-  --azureml \
-  --no-train \
-  --cluster $CLUSTER
-  --restrict_subjects=0,0,+
-```
-
-### Connected components
+## Connected components

 It is possible to apply connected components as a post-processing step, although by default this is disabled. To enable, update the property `largest_connected_component_foreground_classes` of the Hippocampus class in `InnerEye/ML/configs/segmentation/Hippocampus.py`

-### Deploy with InnerEye Gateway
-
-To deploy this model, see the instructions in the [InnerEye README](https://github.com/microsoft/InnerEye-DeepLearning/).
-
----
-
-## Hippocampal Segmentation Model Card
+## Model Card

 ### Model details


docs/source/md/lung_model.md

Lines changed: 67 additions & 0 deletions
New file contents:

# Lung Segmentation Model

## Purpose

This model is designed to perform segmentation of CT scans of human torsos. It is trained to identify 5 key structures: left lung, right lung, heart, spinalcord and esophagus.

## Download

The lung segmentation model can be downloaded from [this release](https://github.com/microsoft/InnerEye-DeepLearning/releases/tag/v0.8).

## Connected Components

It is possible to apply connected components as a post-processing step, and by default this is performed on the 3 largest structures: both lungs and the heart. To alter this behaviour, update the property `largest_connected_component_foreground_classes` of the Lung class in `InnerEye/ML/configs/segmentation/Lung.py`.

## Model Card

### Model Details

- Organisation: Biomedical Imaging Team at Microsoft Research, Cambridge UK.
- Model date: 31st October 2022.
- Model version: 1.0.
- Model type: ensemble of 3D UNet. Training details are as described in [this paper](https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2773292).
- Training details: 5-fold ensemble model. Trained on the [LCTSC 2017 dataset](https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=24284539) (described in detail below).
- License: The model is released under the MIT license as described [here](https://github.com/microsoft/InnerEye-DeepLearning/blob/main/LICENSE).
- Contact: [email protected].

### Terms of use

Please note that all models provided by InnerEye-DeepLearning are intended for research purposes only. You are responsible for the performance, the necessary testing, and if needed any regulatory clearance for any of the models produced by this toolbox.

### Limitations

The dataset used for training contains only 60 scans, 10 of which are withheld for testing. This limited amount of training data means that the model underperforms on the smaller structures (esophagus and spinalcord) and may not yet generalise well to data samples from outside the dataset.

Furthermore, the dataset description does not contain details on the population of patients used for creating the dataset. Therefore it is not possible to assess whether this model is suitable for use on a target population outside of the dataset.

### Intended Uses

This model is intended for research purposes only. It is intended to be used as a starting point for more challenging segmentation tasks or for training using more thorough and comprehensive segmentation tasks.

### Training Data

This model is trained on the [LCTSC 2017 dataset](https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=24284539). For a detailed description of this data, including the contouring guidelines, see [this page](https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=24284539#242845396723d79f9909442996e4dd0af5e56a30).

The following steps were carried out to create the dataset used for training this model:

1. Download the DICOM dataset from the above LCTSC 2017 link.
1. Use the [InnerEye-CreateDataset tool](https://github.com/microsoft/InnerEye-CreateDataset) to run the following command on the data:

   ```shell
   .\InnerEye.CreateDataset.Runner.exe dataset --datasetRootDirectory=<path_to_DICOM_data> --niftiDatasetDirectory=lung_nifti --dicomDatasetDirectory=LCTSC --geoNorm 1 1 3 --groundTruthDescendingPriority esophagus spinalcord lung_r lung_l heart
   ```

1. Upload and register the NIFTI dataset to Azure by following the [dataset creation](creating_dataset.md) guide.

### Metrics

Metrics for the withheld test data (first 10 scans in the dataset) can be seen in the following table:

| Structure  | count | DiceNumeric_mean | DiceNumeric_std | DiceNumeric_min | DiceNumeric_max | HausdorffDistance_mm_mean | HausdorffDistance_mm_std | HausdorffDistance_mm_min | HausdorffDistance_mm_max | MeanDistance_mm_mean | MeanDistance_mm_std | MeanDistance_mm_min | MeanDistance_mm_max |
|------------|-------|------------------|-----------------|-----------------|-----------------|---------------------------|--------------------------|--------------------------|--------------------------|----------------------|---------------------|---------------------|---------------------|
| lung_l     | 10    | 0.984 | 0.009 | 0.958 | 0.990 | 11.642 | 4.868  | 6.558  | 19.221 | 0.344 | 0.266 | 0.167 | 1.027  |
| lung_r     | 10    | 0.983 | 0.009 | 0.960 | 0.991 | 10.764 | 3.307  | 6.325  | 16.156 | 0.345 | 0.200 | 0.160 | 0.797  |
| spinalcord | 10    | 0.860 | 0.050 | 0.756 | 0.912 | 27.213 | 22.015 | 12.000 | 81.398 | 1.750 | 2.167 | 0.552 | 7.209  |
| heart      | 10    | 0.935 | 0.015 | 0.908 | 0.953 | 17.550 | 14.796 | 9.000  | 17.550 | 2.022 | 0.661 | 1.456 | 3.299  |
| esophagus  | 10    | 0.728 | 0.128 | 0.509 | 0.891 | 23.503 | 25.679 | 6.173  | 72.008 | 3.207 | 4.333 | 0.409 | 13.991 |
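The connected-components note above says that post-processing defaults to the two lungs and the heart, controlled by `largest_connected_component_foreground_classes`. A minimal sketch of narrowing that behaviour on a config instance rather than editing `Lung.py` directly; the restricted class list is illustrative:

```python
# Illustrative sketch, assuming an InnerEye development environment:
# keep connected-component post-processing for the lungs only.
from InnerEye.ML.configs.segmentation.Lung import Lung

config = Lung()
config.largest_connected_component_foreground_classes = ["lung_l", "lung_r"]
```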

docs/source/rst/models.rst

Lines changed: 100 additions & 0 deletions
New file contents:

Pre-Trained Models
==================

InnerEye-DeepLearning currently has two pre-trained models available for use
in segmentation tasks. This page describes how to set up and use these models.
For specific information on the models, please refer to the relevant model card:

.. toctree::
   :maxdepth: 1

   ../md/hippocampus_model.md
   ../md/lung_model.md


Terms of use
------------

Please note that all models provided by InnerEye-DeepLearning are intended for
research purposes only. You are responsible for the performance, the necessary testing,
and if needed any regulatory clearance for any of the models produced by this toolbox.

Usage
-----

The following instructions assume you have completed the preceding setup
steps in the `InnerEye
README <https://github.com/microsoft/InnerEye-DeepLearning/>`__, in
particular, `Setting up Azure Machine Learning <setting_up_aml.md>`__.

Create an AzureML Dataset
~~~~~~~~~~~~~~~~~~~~~~~~~

To evaluate pre-trained models on your own data, you will first need to register
an `Azure ML
Dataset <https://docs.microsoft.com/en-us/azure/machine-learning/v1/how-to-create-register-datasets>`__.
You can follow the instructions for `creating
datasets <creating_dataset.md>`__ in order to do this.

Downloading the models
~~~~~~~~~~~~~~~~~~~~~~

The saved weights for each model can be found in their respective :ref:`model cards<Pre-Trained Models>`.
You will need to download the weights and source code for the model that you wish to use.

Registering a model in Azure ML
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To evaluate the model in Azure ML, you must first `register an Azure ML
Model <https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.model?view=azure-ml-py#remarks>`__.
To register the pre-trained model in your AML Workspace, unpack the
source code downloaded in the previous step and follow InnerEye's
`instructions to upload models to Azure ML <move_model.md>`__.

Run the following from a folder that contains both the ``ENVIRONMENT/``
and ``MODEL/`` folders (these exist inside the downloaded model files):

.. code:: shell

   WORKSPACE="fill with your workspace name"
   GROUP="fill with your resource group name"
   SUBSCRIPTION="fill with your subscription ID"

   python InnerEye/Scripts/move_model.py \
     --action upload \
     --path . \
     --workspace_name $WORKSPACE \
     --resource_group $GROUP \
     --subscription_id $SUBSCRIPTION \
     --model_id <Model Name>:<Model Version>

Evaluating the model
~~~~~~~~~~~~~~~~~~~~

You can evaluate the model either in Azure ML or locally using the
downloaded checkpoint files. These two scenarios are described in more
detail, along with instructions, in `testing an existing
model <building_models.md#testing-an-existing-model>`__.

For example, to evaluate the model on your Dataset in Azure ML, run the
following from within the directory ``*/MODEL/final_ensemble_model/``:

.. code:: shell

   CLUSTER="fill with your cluster name"
   DATASET_ID="fill with your dataset name"

   python InnerEye/ML/runner.py \
     --azure_dataset_id $DATASET_ID \
     --model <Model Name> \
     --model_id <Model Name>:<Model Version> \
     --experiment_name <experiment name> \
     --azureml \
     --no-train \
     --cluster $CLUSTER \
     --restrict_subjects=0,0,+

Deploy with InnerEye Gateway
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To deploy a model using the InnerEye Gateway, see the instructions in the `Gateway Repo <https://github.com/microsoft/InnerEye-Gateway/>`__.
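The Usage walkthrough above begins by registering an AzureML dataset before the model registration and evaluation steps. A rough sketch of that first step using the AzureML SDK v1 covered by the linked documentation; the workspace config, datastore path and dataset name are all placeholders, and the authoritative instructions remain in `creating_dataset.md`:

```python
# Hypothetical sketch: register a folder that has already been uploaded to the
# workspace's default datastore as an AzureML file dataset, so its name can be
# passed to --azure_dataset_id. All paths and names are placeholders.
from azureml.core import Dataset, Workspace

ws = Workspace.from_config()                  # reads a local config.json for the workspace
datastore = ws.get_default_datastore()
dataset = Dataset.File.from_files(path=(datastore, "datasets/lung_nifti"))
dataset.register(workspace=ws, name="lung_nifti", create_new_version=True)
```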
