diff --git a/.gitignore b/.gitignore new file mode 100644 index 0000000..ecc1662 --- /dev/null +++ b/.gitignore @@ -0,0 +1 @@ +Task_2/*.tar.gz diff --git a/README.md b/README.md index 582b4b2..9f3ba4a 100644 --- a/README.md +++ b/README.md @@ -1 +1,223 @@ -# HECKTOR2025 \ No newline at end of file +# HECKTOR2025 - Challenge + +

+ +

+ +--- + + +# ℹ️ About + +This repository contains the submission template and instructions for the [Grand Challenge 2025](https://hecktor25.grand-challenge.org/hecktor25/) docker-based inference task. Follow this guide to install Docker, run the baseline inference, observe challenge restrictions, save your container, and prepare your submission. + +--- +# 📑 Table of Contents + +* 🛠️ [Installation](#-installation) +* 🤖 [Baseline Inference](#-baseline-inference) +* ⚠️ [Restrictions and Submission Tips](#-restrictions-and-submission-tips) +* 💾 [Saving and Uploading Containers](#-saving-and-uploading-containers) + +--- +# 🛠️ Installation + +## Windows + +1. Download Docker Desktop for Windows: [https://www.docker.com/products/docker-desktop](https://www.docker.com/products/docker-desktop) +2. Run the installer and follow the on-screen instructions. +3. Ensure Docker is running by opening PowerShell and executing: + + ```bash + docker --version + ``` + +## macOS + +1. Download Docker Desktop for Mac: [https://www.docker.com/products/docker-desktop](https://www.docker.com/products/docker-desktop) +2. Open the `.dmg` file and drag the Docker app to Applications. +3. Launch Docker and verify: + + ```bash + docker --version + ``` + +## Linux + +### Install prerequisites + + Since most participants are likely to be working on Linux, we provide more detailed steps for setting up Docker. + To create and test your Docker setup, you will need to install [Docker Engine](https://docs.docker.com/engine/install/) + and the [NVIDIA-container-toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html) (in case you need GPU computation). You can follow the instructions in the links to install the prerequisites on your system. + +- **Docker Engine:** To verify that Docker has been installed successfully, run ```docker run hello-world``` + +- **NVIDIA-container-toolkit:** To verify that you can access the GPU inside Docker, run ```docker run --rm --gpus all nvidia/cuda:12.1.1-runtime-ubuntu22.04 nvidia-smi``` + +--- +# 🤖 Baseline Inference + +Below is the structure of the ```template-docker``` branch. We support three separate tasks: **Task1**, **Task2**, and **Task3**. For each task, you should have a separate, dedicated model under its folder. + +1. **Repository Structure** + + ```text + ├── Task1/ + │   ├── resources/           # Place your model files here + │   ├── requirements.txt     # Modify only to add new packages + │   └── ... + ├── Task2/ + │   ├── resources/ + │   └── ... + ├── Task3/ + │   ├── resources/ + │   ├── Dockerfile.template  # Base Dockerfile for reference + │   ├── do_build.sh          # Script to build container + │   ├── do_test_run.sh       # Script to test container locally + │   ├── do_save.sh           # Script to save container as tarball + │   ├── inference.py         # Entry point: loads models, runs inference + │   └── ... + ``` + +2. **Model Files and Packages** + + * Place all model weights, configuration files, and auxiliary code inside the `resources/` folder of the corresponding Task directory, as this is the only directory where you can place your supporting files. + * You **may** update `requirements.txt` within each Task folder to install any additional Python packages needed by your model. + +3. **Working Directory** + + * All temporary and intermediate files used during inference must be read from or written to the `/tmp/` directory inside the container. + +4. **Build the Container** +   * To build your container, run the ```do_build.sh``` script. +   ```bash +   # From the task directory (e.g. Task_1/) +   ./do_build.sh   # builds an image with the task's default tag +   ``` + +5. **Local Test Run** +   * The ```do_test_run.sh``` script can be used to test the container on your local machine before submitting the finalized version. +   ```bash +   # Runs inference locally on the data in ./test/input +   ./do_test_run.sh +   ``` + +6. **Save Container** +   * Use the ```do_save.sh``` script to save your Docker container. +   ```bash +   ./do_save.sh +   ``` + +7. **Performing Inference** + +   * `inference.py` is the entry point script executed at container runtime. You can implement or call your model-loading and prediction code here; a minimal sketch is shown below. +
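+
+   As a rough, illustrative sketch (not the official template), an `inference.py` along these lines satisfies the interface: it reads the CT/PET images and the EHR record from `/input` and writes a segmentation to `/output`. The helper `predict()` and its trivial output are placeholders for your own model code.
+
+   ```python
+   # Minimal, illustrative inference entry point (placeholder logic only).
+   import json
+   from glob import glob
+   from pathlib import Path
+
+   import SimpleITK as sitk
+
+   INPUT_PATH = Path("/input")
+   OUTPUT_PATH = Path("/output")
+   RESOURCE_PATH = Path("resources")   # model weights live here
+   SCRATCH_PATH = Path("/tmp")         # the only writable scratch space
+
+
+   def predict(ct_image, pet_image, ehr):
+       # Placeholder: load your checkpoint from RESOURCE_PATH and run the model.
+       mask = sitk.Image(ct_image.GetSize(), sitk.sitkUInt8)
+       mask.CopyInformation(ct_image)
+       return mask
+
+
+   def run():
+       ct_image = sitk.ReadImage(glob(str(INPUT_PATH / "images/ct/*.mha"))[0])
+       pet_image = sitk.ReadImage(glob(str(INPUT_PATH / "images/pet/*.mha"))[0])
+       with open(INPUT_PATH / "ehr.json") as f:
+           ehr = json.load(f)
+
+       mask = predict(ct_image, pet_image, ehr)
+
+       out_dir = OUTPUT_PATH / "images/tumor-lymph-node-segmentation"
+       out_dir.mkdir(parents=True, exist_ok=True)
+       sitk.WriteImage(mask, str(out_dir / "output.mha"), useCompression=True)
+       return 0
+
+
+   if __name__ == "__main__":
+       raise SystemExit(run())
+   ```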
+ + +# ⚠️ Grand Challenge Restrictions & Submission Tips + +This section guides participants through the [submission tips](https://grand-challenge.org/documentation/making-a-challenge-submission/#submission-tips) documentation. It includes important information such as: + +1. **Algorithm Submission** +   - You do not need to create a new algorithm for each submission. +   - If you update your algorithm, don't forget to make a new submission to the challenge with it; this will not happen automatically. For more guidelines on how to create a submission on Grand-Challenge and upload your algorithm, please follow the instructions [here](submission-guidelines.md). + +2. **Offline Execution Only** +   Your container **must not** attempt any network access (HTTP, SSH, DNS, etc.). Any outgoing connection will cause automatic disqualification. + +3. **Computational & Memory Constraints** +   - **GPU**: Your code will run on an NVIDIA T4 Tensor Core GPU with 16 GB of VRAM. Please design your model so that it can execute on this GPU (a sketch for checking the memory and runtime budgets locally follows this list). +   - **Memory Limit**: Peak RAM usage must stay under **16 GB**. +   - **Docker Size**: The container you upload for your algorithm cannot exceed 10 GB. + +4. **Filesystem Write Permissions** +   All writes (models, logs, outputs) **must** go under `/tmp/`. Writing elsewhere on the filesystem will be ignored or blocked. + +5. **I/O Interface** +   - **Input**: read exclusively from `/input/` +   - **Output**: write exclusively to `/output/` +   - **No Extra Files**: do not generate caches or logs in other directories. + +6. **Time Limit** +   Tasks 1 and 3 have a **10-minute** wall-clock limit, while Task 2 has a **15-minute** limit. Any process running longer will be force-terminated. + +7. **Submission Tips** +   - **Local Validation**: always run `./do_test_run.sh` before packing. +   - **Save Your Container**: use `./do_save.sh` to generate a `_submission.tar.gz` (max **2 GB**). +   - **Naming Convention**: name archives as `submission_task1.tar.gz`, `submission_task2.tar.gz`, etc. +   - **Double-Check**: ensure `TaskX/resources/` contains all model artifacts and an updated `requirements.txt`. + +8. **Common Error Messages** +   | Error Text | Likely Cause | Fix | +   |-------------------------------------|-------------------------------------------------|-------------------------------------| +   | `Model file not found` | Missing weights in `TaskX/resources/` | Add your `.pth`/`.onnx` files | +   | `ModuleNotFoundError: …` | Dependency not declared | Update `requirements.txt` & rebuild | +   | `Permission denied: '/some/path'` | Writing outside `/tmp/` | Redirect writes to `/tmp/` | +   | `Killed` or `OOM` | Exceeded memory limit | Reduce batch size or model footprint| +   | `Timeout` | Exceeded runtime limit | Optimize preprocessing/inference | +
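+
+One simple, optional way to check these budgets during local testing (this is not part of the template) is to wrap your pipeline in a small helper that reports peak GPU memory and wall-clock time; `run_my_pipeline` below is a placeholder for your own inference function.
+
+```python
+# Illustrative helper for local testing: reports how close one run comes to
+# the challenge's memory and wall-clock budgets.
+import time
+
+import torch
+
+
+def run_with_budget_report(run_my_pipeline, *args, **kwargs):
+    use_cuda = torch.cuda.is_available()
+    if use_cuda:
+        torch.cuda.reset_peak_memory_stats()
+    start = time.monotonic()
+
+    result = run_my_pipeline(*args, **kwargs)
+
+    elapsed_min = (time.monotonic() - start) / 60
+    print(f"Wall-clock time: {elapsed_min:.1f} min "
+          f"(limit: 10 min for Tasks 1 and 3, 15 min for Task 2)")
+    if use_cuda:
+        # Only counts memory allocated by PyTorch tensors, not the CUDA
+        # context itself, so treat the number as a lower bound.
+        peak_gb = torch.cuda.max_memory_allocated() / 1024**3
+        print(f"Peak GPU memory: {peak_gb:.2f} GB (T4 limit: 16 GB)")
+    return result
+```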
+ +--- + + +# 💾 Saving and Uploading Containers + + +1. **Save to tarball**: + + ```bash + ./do_save.sh + ``` + +2. **Upload to Sanity Check**: + + In the HECKTOR challenge, we have three tasks (`Task 1 - Detection and Segmentation`, `Task 2 - Prognosis`, and `Task 3 - Classification`), and for each task the submission process is divided into three phases: + + - **Sanity Check Phase:** Consists of 3 images to ensure participants are familiar with the Grand Challenge platform and that their dockers run without errors. All teams must submit to this phase and will receive feedback on any errors. + - **Validation Phase:** Consists of approximately 50 images. All teams will submit up to 2 working dockers from the sanity check to this phase. Only the top 15 teams with valid submissions, as ranked by the evaluation metrics displayed on the public validation leaderboard, will proceed to the Testing Phase. + - **Testing Phase:** Consists of approximately 400 images. Teams will choose 1 of their 2 dockers from the validation phase to submit to the testing phase. The official ranking of the teams will be based solely on the testing phase results. + + + > **NOTE:** Participants will not receive detailed feedback during the testing phase except for error notifications. + + The requirements for a valid submission can be found on the [Submission webpage](https://hecktor25.grand-challenge.org/submission-instructions/). To start a submission for any task and phase, log in to the [Grand Challenge](https://hecktor25.grand-challenge.org/) and click on the link [here](https://hecktor25.grand-challenge.org/evaluation/sanity-check-task-1/submissions/create/). To proceed with the submission, please make sure to follow the guidelines given in ["Submission Guidelines"](https://github.com/BioMedIA-MBZUAI/HECKTOR2025/blob/main/doc/submission-guidelines.md), which include visual examples: +

+ +

+ + At the top, you can select the phase and task for which you are submitting your method. Assuming we want to submit to Task 1 in the **Sanity Check Phase**, we select the "**Sanity Check - Task 1**" tab and choose the uploaded algorithm from the drop-down list, as shown below: +

+ +

+ + Finally, by clicking on the "**Save**" button you will submit your algorithm for evaluation on the challenge's task. The process is the same for all tasks and phases. + + * Log in to the challenge portal. + * Navigate to **My Submissions** → **Upload Container**. + * Select `my_submission.tar` and submit. + +--- +# 🎉 Good luck with your submission! +If you need support, post questions on the [Challenge Forum](https://grand-challenge.org/forums/forum/head-and-neck-tumor-lesion-segmentation-diagnosis-and-prognosis-767/) or send an email to ```hecktor.challenge@gmail.com```. \ No newline at end of file diff --git a/Task_1/.gitattributes b/Task_1/.gitattributes new file mode 100644 index 0000000..e61a56b --- /dev/null +++ b/Task_1/.gitattributes @@ -0,0 +1 @@ +resources/checkpoints/* filter=lfs diff=lfs merge=lfs -text \ No newline at end of file diff --git a/Task_1/.gitignore b/Task_1/.gitignore new file mode 100644 index 0000000..ccca3e2 --- /dev/null +++ b/Task_1/.gitignore @@ -0,0 +1,3 @@ +resources/checkpoints/* +!resources/checkpoints/.gitkeep + diff --git a/Task_1/Dockerfile b/Task_1/Dockerfile new file mode 100644 index 0000000..5dafbca --- /dev/null +++ b/Task_1/Dockerfile @@ -0,0 +1,27 @@ +# Use a 'large' base container to showcase how to load pytorch (macOS) +# FROM --platform=linux/arm64 pytorch/pytorch AS example-task1-arm64 + +# Use a 'large' base container to showcase how to load pytorch and use the GPU (when enabled) (Linux and WSL) +FROM --platform=linux/amd64 pytorch/pytorch:2.6.0-cuda12.4-cudnn9-runtime AS example-task1-amd64 + +# Ensures that Python output to stdout/stderr is not buffered: prevents missing information when terminating +ENV PYTHONUNBUFFERED=1 + +RUN groupadd -r user && useradd -m --no-log-init -r -g user user +USER user + +WORKDIR /opt/app + +COPY --chown=user:user requirements.txt /opt/app/ +COPY --chown=user:user resources /opt/app/resources + +# You can add any Python dependencies to requirements.txt +RUN python -m pip install \ + --user \ + --no-cache-dir \ + --no-color \ + --requirement /opt/app/requirements.txt + +COPY --chown=user:user inference.py /opt/app/ + +ENTRYPOINT ["python", "inference.py"] diff --git a/Task_1/do_build.sh b/Task_1/do_build.sh new file mode 100644 index 0000000..ebcd00b --- /dev/null +++ b/Task_1/do_build.sh @@ -0,0 +1,18 @@ +#!/usr/bin/env bash + +# Stop at first error +set -e + +SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd ) +DOCKER_IMAGE_TAG="example-algorithm-sanity-check-task-1" + + +# Check if an argument is provided +if [ "$#" -eq 1 ]; then + DOCKER_IMAGE_TAG="$1" +fi + +# Build the container +docker build "$SCRIPT_DIR" \ + --platform=linux/amd64 \ + --tag "$DOCKER_IMAGE_TAG" 2>&1 \ No newline at end of file diff --git a/Task_1/do_save.sh b/Task_1/do_save.sh new file mode 100644 index 0000000..9cb4e77 --- /dev/null +++ b/Task_1/do_save.sh @@ -0,0 +1,37 @@ +#!/usr/bin/env bash + +# Stop at first error +set -e + +SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd ) + +# Set default container name +DOCKER_IMAGE_TAG="example-algorithm-sanity-check-task-1" + +# Check if an argument is provided +if [ "$#" -eq 1 ]; then + DOCKER_IMAGE_TAG="$1" +fi + +echo "=+= (Re)build the container" +source "${SCRIPT_DIR}/do_build.sh" "$DOCKER_IMAGE_TAG" + +# Get the build information from the Docker image tag +build_timestamp=$( docker inspect --format='{{ .Created }}' "$DOCKER_IMAGE_TAG") + +if [ -z "$build_timestamp" ]; then + echo "Error: Failed
to retrieve build information for container $DOCKER_IMAGE_TAG" + exit 1 +fi + +# Format the build information to remove special characters +formatted_build_info=$(echo $build_timestamp | sed -E 's/(.*)T(.*)\..*Z/\1_\2/' | sed 's/[-,:]/-/g') + +# Set the output filename with timestamp and build information +output_filename="${SCRIPT_DIR}/${DOCKER_IMAGE_TAG}_${formatted_build_info}.tar.gz" + +# Save the Docker container and gzip it +echo "Saving the container as ${output_filename}. This can take a while." +docker save "$DOCKER_IMAGE_TAG" | gzip -c > "$output_filename" + +echo "Container saved as ${output_filename}" \ No newline at end of file diff --git a/Task_1/do_test_run.sh b/Task_1/do_test_run.sh new file mode 100644 index 0000000..2120f32 --- /dev/null +++ b/Task_1/do_test_run.sh @@ -0,0 +1,78 @@ +#!/usr/bin/env bash + +# Stop at first error +set -e + +SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd ) +DOCKER_IMAGE_TAG="example-algorithm-sanity-check-task-1" + +# Check if an argument is provided +if [ "$#" -eq 1 ]; then + DOCKER_IMAGE_TAG="$1" +fi + +DOCKER_NOOP_VOLUME="${DOCKER_IMAGE_TAG}-volume" + +INPUT_DIR="${SCRIPT_DIR}/test/input" +OUTPUT_DIR="${SCRIPT_DIR}/test/output" + +echo "=+= (Re)build the container" +source "${SCRIPT_DIR}/do_build.sh" "$DOCKER_IMAGE_TAG" + +cleanup() { + echo "=+= Cleaning permissions ..." + # Ensure permissions are set correctly on the output + # This allows the host user (e.g. you) to access and handle these files + docker run --rm \ + --quiet \ + --volume "$OUTPUT_DIR":/output \ + --entrypoint /bin/sh \ + $DOCKER_IMAGE_TAG \ + -c "chmod -R -f o+rwX /output/* || true" +} + + +echo "=+= Cleaning up any earlier output" +if [ -d "$OUTPUT_DIR" ]; then + # Ensure permissions are setup correctly + # This allows for the Docker user to write to this location + rm -rf "${OUTPUT_DIR}"/* + chmod -f o+rwx "$OUTPUT_DIR" +else + mkdir -m o+rwx "$OUTPUT_DIR" +fi + +trap cleanup EXIT + +echo "=+= Doing a forward pass" +## Note the extra arguments that are passed here: +# '--network none' +# entails there is no internet connection +# 'gpus all' +# enables access to any GPUs present +# '--volume :/tmp' +# is added because on Grand Challenge this directory cannot be used to store permanent files +docker volume create "$DOCKER_NOOP_VOLUME" > /dev/null +docker run --rm \ + --platform=linux/amd64 \ + --network none \ + --gpus all \ + --volume "$INPUT_DIR":/input:ro \ + --volume "$OUTPUT_DIR":/output \ + --volume "$DOCKER_NOOP_VOLUME":/tmp \ + $DOCKER_IMAGE_TAG +docker volume rm "$DOCKER_NOOP_VOLUME" > /dev/null + +# Ensure permissions are set correctly on the output +# This allows the host user (e.g. you) to access and handle these files +docker run --rm \ + --quiet \ + --env HOST_UID=`id -u` \ + --env HOST_GID=`id -g` \ + --volume "$OUTPUT_DIR":/output \ + alpine:latest \ + /bin/sh -c 'chown -R ${HOST_UID}:${HOST_GID} /output' + +echo "=+= Wrote results to ${OUTPUT_DIR}" + +echo "=+= Save this image for uploading via ./do_save.sh \"${DOCKER_IMAGE_TAG}\"" \ No newline at end of file diff --git a/Task_1/inference.py b/Task_1/inference.py new file mode 100644 index 0000000..94b28ad --- /dev/null +++ b/Task_1/inference.py @@ -0,0 +1,136 @@ +""" +The following is a simple example algorithm. + +It is meant to run within a container. 
+ +To run it locally, you can call the following bash script: + + ./do_test_run.sh + +This will start the inference and reads from ./test/input and outputs to ./test/output + +To save the container and prep it for upload to Grand-Challenge.org you can call: + + ./do_save.sh + +Any container that shows the same behavior will do, this is purely an example of how one COULD do it. + +Happy programming! +""" +from pathlib import Path +import json +from glob import glob +import SimpleITK +import numpy +import os +import numpy as np +from monai.transforms import SaveImageD + +INPUT_PATH = Path("/input") +OUTPUT_PATH = Path("/output") +RESOURCE_PATH = Path("resources") + +from resources.preprocess import resample_images, crop_neck_region_sitk, apply_monai_transforms +from resources.utils import load_model_from_checkpoint, arrays_to_tensor, run_inference +from resources.postprocess import prediction_to_original_space + +def run(): + # Read the input + ct_path = load_image_file_as_array( + location=INPUT_PATH / "images/ct", + ) + input_electronic_health_record = load_json_file( + location=INPUT_PATH / "ehr.json", + ) + pt_path = load_image_file_as_array( + location=INPUT_PATH / "images/pet", + ) + + # resample the images to 1mm isotropic resolution + ct, pt, bb= resample_images( + ct_path=ct_path, + pet_path=pt_path, + ) + # crop the images to the bounding box + ct_cropped, pet_cropped, box_start, box_end = crop_neck_region_sitk( + ct_sitk=ct, + pet_sitk=pt, + ) + + # apply transformation to the cropped images + ct_transformed, pet_transformed, meta = apply_monai_transforms(ct_cropped, pet_cropped) + input_tensor = arrays_to_tensor(ct_transformed, pet_transformed) + input_tensor = input_tensor.permute(0, 1, 3, 4, 2).contiguous() + + + # # Load the model and run inference + model_path = RESOURCE_PATH / "checkpoints" / "best_model.pth" + model, config = load_model_from_checkpoint(model_path, device="cuda") + + prediction = run_inference(model, input_tensor, config, device="cuda", use_sliding_window=True) + + pred_np = prediction.transpose(0, 3, 1, 2)[0] # → (310, 200, 200) + + + mask_orig = prediction_to_original_space( + pred_np, + meta, # from apply_monai_transforms + box_start, box_end, + ct, # 1-mm resampled CT + SimpleITK.ReadImage(ct_path), # original-resolution CT + ) + + + # Save your output + write_array_as_image_file( + location=OUTPUT_PATH / "images/tumor-lymph-node-segmentation", + array=mask_orig, + ) + return 0 + + +def load_json_file(*, location): + # Reads a json file + with open(location, "r") as f: + return json.loads(f.read()) + + +def load_image_file_as_array(*, location): + # Use SimpleITK to read a file + input_files = ( + glob(str(location / "*.tif")) + + glob(str(location / "*.tiff")) + + glob(str(location / "*.mha")) + ) + return input_files[0] + + +def write_array_as_image_file(*, location, array, filename="output.mha"): + location.mkdir(parents=True, exist_ok=True) + + if isinstance(array, SimpleITK.Image): + img = array # already SimpleITK + else: # assume NumPy + if array.ndim == 4 and array.shape[0] == 1: + array = array[0] # drop batch dim if present + img = SimpleITK.GetImageFromArray(array) + + SimpleITK.WriteImage(img, str(location / filename), useCompression=True) + + + +def _show_torch_cuda_info(): + import torch + + print("=+=" * 10) + print("Collecting Torch CUDA information") + print(f"Torch CUDA is available: {(available := torch.cuda.is_available())}") + if available: + print(f"\tnumber of devices: {torch.cuda.device_count()}") + print(f"\tcurrent device: { 
(current_device := torch.cuda.current_device())}") + print(f"\tproperties: {torch.cuda.get_device_properties(current_device)}") + print("=+=" * 10) + + +if __name__ == "__main__": + raise SystemExit(run()) diff --git a/Task_1/requirements.txt b/Task_1/requirements.txt new file mode 100644 index 0000000..459ad31 --- /dev/null +++ b/Task_1/requirements.txt @@ -0,0 +1,41 @@ +# Core ML libraries +numpy>=1.26.0 +scipy>=1.15.0 + +# Medical imaging and segmentation +monai>=1.5.0 +nibabel +SimpleITK>=2.5.0 +scikit-image>=0.25.0 + +# Neural network utilities +timm>=1.0.0 +nnunetv2>=2.6.0 + +# Data handling and processing +pandas>=2.2.0 +h5py>=3.14.0 +pyarrow>=19.0.0 + +# Image processing +imageio>=2.37.0 +tifffile>=2025.6.0 +pillow>=11.0.0 + +# Visualization +matplotlib>=3.10.0 +seaborn>=0.13.0 + +# Utilities +tqdm>=4.67.0 +pyyaml>=6.0.0 +fsspec>=2025.5.0 +joblib>=1.5.0 + +# Development and training +scikit-learn>=1.7.0 +psutil>=7.0.0 + + +# Configuration management +yacs>=0.1.8 \ No newline at end of file diff --git a/Task_1/resources/checkpoints/.gitkeep b/Task_1/resources/checkpoints/.gitkeep new file mode 100644 index 0000000..e69de29 diff --git a/Task_1/resources/configs/__init__.py b/Task_1/resources/configs/__init__.py new file mode 100644 index 0000000..03b97da --- /dev/null +++ b/Task_1/resources/configs/__init__.py @@ -0,0 +1,5 @@ +from .base_config import BaseConfig +from .segresnet_config import SegResNetConfig + + +__all__ = ['BaseConfig', 'SegResNetConfig'] diff --git a/Task_1/resources/configs/base_config.py b/Task_1/resources/configs/base_config.py new file mode 100644 index 0000000..eb6dbbe --- /dev/null +++ b/Task_1/resources/configs/base_config.py @@ -0,0 +1,68 @@ +"""Base configuration class for all models.""" + +import os +from dataclasses import dataclass +from typing import Tuple + + +@dataclass +class BaseConfig: + """Base configuration class with common parameters.""" + + # Data paths + data_root: str = "/path/to/hecktor2025_dataset" + train_images_dir: str = "imagesTr_resampled_cropped_npy" + train_labels_dir: str = "labelsTr_resampled_cropped_npy" + splits_file: str = "config/splits_final.json" + + # Data properties + input_channels: int = 2 # CT + PET + num_classes: int = 3 # background + primary tumor + metastatic tumor + spatial_size: Tuple[int, int, int] = (128, 128, 128) + + # Training parameters + batch_size: int = 2 + learning_rate: float = 1e-2 + weight_decay: float = 3e-5 + num_epochs: int = 350 + + # Scheduler parameters + # PolyLR scheduler parameters + poly_lr_power: float = 0.9 + poly_lr_min_lr: float = 1e-6 + + # Data augmentation + use_augmentation: bool = True + aug_probability: float = 0.5 + rotation_range: float = 15.0 + scaling_range: float = 0.1 + translation_range: float = 10.0 + + # System parameters + device: str = "cuda" + num_workers: int = 4 + pin_memory: bool = True + + # Data caching parameters + cache_rate: float = 0.25 # Cache 25% of training data for faster loading + + # Checkpointing and logging + save_checkpoint_every: int = 1 # Save checkpoint every n epochs + use_tensorboard: bool = True + + # Output directories + experiment_name: str = "baseline" + output_dir: str = "experiments" + fold: int = 0 + + def __post_init__(self): + """Setup output directories with fold-specific structure.""" + # Create fold-specific directory structure + self.experiment_dir = os.path.join(self.output_dir, self.experiment_name) + self.fold_dir = os.path.join(self.experiment_dir, f"fold_{self.fold}") + self.checkpoint_dir = os.path.join(self.fold_dir, 
"checkpoints") + self.log_dir = os.path.join(self.fold_dir, "logs") + + # Create directories + for dir_path in [self.experiment_dir, self.fold_dir, self.checkpoint_dir, self.log_dir]: + os.makedirs(dir_path, exist_ok=True) diff --git a/Task_1/resources/configs/segresnet_config.py b/Task_1/resources/configs/segresnet_config.py new file mode 100644 index 0000000..5cd361d --- /dev/null +++ b/Task_1/resources/configs/segresnet_config.py @@ -0,0 +1,23 @@ +"""SegResNet specific configuration.""" + +from dataclasses import dataclass +from .base_config import BaseConfig +from pathlib import Path + +# Define a temporary path for storing intermediate files +TMP_PATH= Path("/tmp") + +@dataclass +class SegResNetConfig(BaseConfig): + """Configuration for SegResNet model.""" + + # Model specific experiment name + experiment_name: str = TMP_PATH / "segresnet" + + # SegResNet architecture parameters + blocks_down: tuple = (1, 2, 2, 4) + blocks_up: tuple = (1, 1, 1) + init_filters: int = 16 + in_channels: int = 2 # CT + PET (same as input_channels in BaseConfig) + out_channels: int = 3 # background + primary tumor + metastatic tumor (same as num_classes in BaseConfig) + dropout_prob: float = 0.2 diff --git a/Task_1/resources/models/__init__.py b/Task_1/resources/models/__init__.py new file mode 100644 index 0000000..1b02b4f --- /dev/null +++ b/Task_1/resources/models/__init__.py @@ -0,0 +1,4 @@ +from .base_model import BaseModel +from .segresnet import SegResNetModel + +__all__ = ['BaseModel', 'SegResNetModel'] diff --git a/Task_1/resources/models/base_model.py b/Task_1/resources/models/base_model.py new file mode 100644 index 0000000..1dc5e7c --- /dev/null +++ b/Task_1/resources/models/base_model.py @@ -0,0 +1,64 @@ +"""Base model class for all segmentation models.""" + +from abc import ABC, abstractmethod +import torch +import torch.nn as nn +from typing import Dict, Any + + +class BaseModel(nn.Module, ABC): + """Abstract base class for all segmentation models.""" + + def __init__(self, config): + """ + Initialize base model. + + Args: + config: Configuration object + """ + super().__init__() + self.config = config + + @abstractmethod + def forward(self, x: torch.Tensor) -> torch.Tensor: + """ + Forward pass. 
+ + Args: + x: Input tensor + + Returns: + Output tensor + """ + pass + + def get_parameters(self) -> Dict[str, Any]: + """Get model parameters summary.""" + total_params = sum(p.numel() for p in self.parameters()) + trainable_params = sum(p.numel() for p in self.parameters() if p.requires_grad) + + return { + "total_parameters": total_params, + "trainable_parameters": trainable_params, + "model_size_mb": total_params * 4 / (1024 * 1024), # Assuming float32 + } + + def save_checkpoint(self, path: str, epoch: int, optimizer_state: Dict = None, **kwargs): + """Save model checkpoint.""" + checkpoint = { + "epoch": epoch, + "model_state_dict": self.state_dict(), + "model_config": self.config.__dict__, + **kwargs + } + + if optimizer_state: + checkpoint["optimizer_state_dict"] = optimizer_state + + torch.save(checkpoint, path) + + def load_checkpoint(self, path: str, device: str = "cpu"): + """Load model checkpoint.""" + checkpoint = torch.load(path, map_location=device) + self.load_state_dict(checkpoint["model_state_dict"]) + return checkpoint diff --git a/Task_1/resources/models/segresnet.py b/Task_1/resources/models/segresnet.py new file mode 100644 index 0000000..eb83dc3 --- /dev/null +++ b/Task_1/resources/models/segresnet.py @@ -0,0 +1,80 @@ +"""SegResNet model implementation using MONAI.""" + +import torch +import torch.nn as nn +from monai.networks.nets import SegResNet + +from .base_model import BaseModel + + +class SegResNetModel(BaseModel): + """SegResNet model for segmentation using MONAI.""" + + def __init__(self, config): + """ + Initialize SegResNet model. + + Args: + config: SegResNetConfig object + """ + super().__init__(config) + + # Initialize MONAI SegResNet + self.segresnet = SegResNet( + blocks_down=config.blocks_down, + blocks_up=config.blocks_up, + init_filters=config.init_filters, + in_channels=config.input_channels, # Using input_channels from BaseConfig + out_channels=config.num_classes, # Using num_classes from BaseConfig + dropout_prob=config.dropout_prob + ) + + # Initialize weights + self._initialize_weights() + + def forward(self, x: torch.Tensor) -> torch.Tensor: + """ + Forward pass through SegResNet. 
+ + Args: + x: Input tensor of shape (B, C, H, W, D) + + Returns: + Output tensor of shape (B, num_classes, H, W, D) + """ + return self.segresnet(x) + + def _initialize_weights(self): + """Initialize model weights.""" + for module in self.modules(): + if isinstance(module, (nn.Conv3d, nn.ConvTranspose3d)): + nn.init.kaiming_normal_(module.weight, mode="fan_out", nonlinearity="relu") + if module.bias is not None: + nn.init.constant_(module.bias, 0) + elif isinstance(module, nn.BatchNorm3d): + nn.init.constant_(module.weight, 1) + nn.init.constant_(module.bias, 0) + + def get_model_info(self) -> str: + """Get model architecture information.""" + params = self.get_parameters() + + info = f""" +SegResNet Model Information: +--------------------------- +Architecture: SegResNet (MONAI) +Input Channels: {self.config.input_channels} +Output Channels: {self.config.num_classes} +Init Filters: {self.config.init_filters} +Blocks Down: {self.config.blocks_down} +Blocks Up: {self.config.blocks_up} +Dropout Probability: {self.config.dropout_prob} + +Parameters: +----------- +Total Parameters: {params['total_parameters']:,} +Trainable Parameters: {params['trainable_parameters']:,} +Model Size: {params['model_size_mb']:.2f} MB + """ + + return info.strip() diff --git a/Task_1/resources/postprocess.py b/Task_1/resources/postprocess.py new file mode 100644 index 0000000..2a942b3 --- /dev/null +++ b/Task_1/resources/postprocess.py @@ -0,0 +1,57 @@ +import numpy as np +import SimpleITK as sitk + + +def prediction_to_original_space( + pred_np: np.ndarray, # (Z,Y,X) – network output, CH squeezed + meta: dict, # ct_proc.meta coming from apply_monai_transforms + box_start_xyz, box_end_xyz, # lists returned by crop_neck_region_sitk + resampled_ct_sitk: sitk.Image, # 1-mm CT fed to MONAI + original_ct_sitk: sitk.Image, # the scanner-resolution CT +): + """ + 1) remove symmetric Z-padding (computed from shape diff, no meta["padding"] needed) + 2) undo CropForegroundd (roi_start / roi_end in `meta`) + 3) undo neck ROI crop (box_start_xyz / box_end_xyz) + 4) resample mask from 1-mm space back to original CT space (nearest-neighbor) + + returns: SimpleITK.Image aligned with `original_ct_sitk` + """ + # ------------------------------------------------------- 1. undo SpatialPadd + Zt = box_end_xyz[2] - box_start_xyz[2] # target depth (e.g. 298) + z_extra = pred_np.shape[0] - Zt # 12 when 310→298 + if z_extra > 0: + z_trim = z_extra // 2 + pred_np = pred_np[z_trim:z_trim + Zt, ...] # (Zt,Y,X) + + # ------------------------------------------------------- 2. undo CropForegroundd + if "roi_start" in meta: # might be missing + z0,y0,x0 = meta["roi_start"] + z1,y1,x1 = meta["roi_end"] + canvas_cf = np.zeros((Zt, # already trimmed depth + y1-y0, + x1-x0), dtype=pred_np.dtype) + canvas_cf[z0:z1, y0:y1, x0:x1] = pred_np + else: + canvas_cf = pred_np # nothing to undo + + # ------------------------------------------------------- 3. undo neck ROI crop + x0,y0,z0 = box_start_xyz + x1,y1,z1 = box_end_xyz + Zr,Yr,Xr = resampled_ct_sitk.GetSize()[2], resampled_ct_sitk.GetSize()[1], resampled_ct_sitk.GetSize()[0] + canvas_roi = np.zeros((Zr, Yr, Xr), dtype=pred_np.dtype) + canvas_roi[z0:z1, y0:y1, x0:x1] = canvas_cf + + # ------------------------------------------------------- 4. 
resample to original CT + mask_1mm = sitk.GetImageFromArray(canvas_roi.astype(np.uint8)) + mask_1mm.CopyInformation(resampled_ct_sitk) + + mask_orig = sitk.Resample( + mask_1mm, # moving + original_ct_sitk, # reference + sitk.Transform(), # identity + sitk.sitkNearestNeighbor, + 0, + sitk.sitkUInt8, + ) + return mask_orig diff --git a/Task_1/resources/preprocess.py b/Task_1/resources/preprocess.py new file mode 100644 index 0000000..307e281 --- /dev/null +++ b/Task_1/resources/preprocess.py @@ -0,0 +1,243 @@ +import SimpleITK as sitk +import numpy as np +import torch +import warnings +from skimage.measure import label + +from monai.transforms import ( + Compose, + LoadImaged, + EnsureChannelFirstd, + Orientationd, + ScaleIntensityRanged, + CropForegroundd, + SpatialPadd, + NormalizeIntensityd, + # Sigmoid activation is included + Activationsd, +) +from monai.data import MetaTensor + + + +def sitk_to_metatensor(img_sitk: sitk.Image) -> MetaTensor: + """ + SimpleITK ➜ MetaTensor, channel-first [1, Z, Y, X], + *no* manual transpose. Direction/spacing/origin preserved. + """ + arr = sitk.GetArrayFromImage(img_sitk).astype(np.float32) # [Z, Y, X] + arr = arr[None, ...] # [1, Z, Y, X] + + spacing = img_sitk.GetSpacing() # (sx, sy, sz) + origin = img_sitk.GetOrigin() + direction = img_sitk.GetDirection() # 9-tuple row-major + + affine = np.eye(4, dtype=np.float64) + affine[:3, :3] = np.reshape(direction, (3, 3)) * spacing + affine[:3, 3] = origin + + meta = { + "spacing": spacing, + "origin": origin, + "direction": direction, + "affine": affine, + } + return MetaTensor(arr, meta=meta) + + +def get_bounding_boxes(ct_sitk, pet_sitk): + """ + Get the bounding boxes of the CT and PET images. + This works since all images have the same direction. + """ + ct_origin = np.array(ct_sitk.GetOrigin()) + pet_origin = np.array(pet_sitk.GetOrigin()) + + ct_position_max = ct_origin + np.array(ct_sitk.GetSize()) * np.array(ct_sitk.GetSpacing()) + pet_position_max = pet_origin + np.array(pet_sitk.GetSize()) * np.array(pet_sitk.GetSpacing()) + + return np.concatenate([ + np.maximum(ct_origin, pet_origin), + np.minimum(ct_position_max, pet_position_max), + ], axis=0) + +def resample_images(ct_path, pet_path): + """ + Resample CT and PET images to specified resolution using SimpleITK. + + Args: + ct_array: CT image as numpy array + pet_array: PET image as numpy array + + Returns: + Tuple of (resampled_ct_array, resampled_pet_array) + """ + resampling = [1, 1, 1] + resampler = sitk.ResampleImageFilter() + resampler.SetOutputDirection([1, 0, 0, 0, 1, 0, 0, 0, 1]) + resampler.SetOutputSpacing(resampling) + ct = sitk.ReadImage(ct_path) + pt = sitk.ReadImage(pet_path) + bb = get_bounding_boxes(ct, pt) + size = np.round((bb[3:] - bb[:3]) / resampling).astype(int) + resampler.SetOutputOrigin(bb[:3]) + resampler.SetSize([int(k) for k in size]) + resampler.SetInterpolator(sitk.sitkBSpline) + ct = resampler.Execute(ct) + pt = resampler.Execute(pt) + + return ct,pt, bb + +def get_roi_center(pet_tensor, z_top_fraction=0.75, z_score_threshold=1.0): + """ + Calculates the center of the largest high-intensity region in the top part of the PET scan. + """ + # 1. Isolate top of the scan based on the z-axis + image_shape_voxels = np.array(pet_tensor.shape) + crop_z_start = int(z_top_fraction * image_shape_voxels[2]) + top_of_scan = pet_tensor[..., crop_z_start:] + + # 2. 
Threshold to find high-intensity regions (potential brain/tumor) + # Using a small epsilon to avoid division by zero in blank images + mask = ((top_of_scan - top_of_scan.mean()) / (top_of_scan.std() + 1e-8)) > z_score_threshold + + if not mask.any(): + # If no pixels are above the threshold, fall back to the geometric center of the top part + warnings.warn("No high-intensity region found. Using geometric center of the upper scan region.") + center_in_top = (np.array(top_of_scan.shape) / 2).astype(int) + else: + # Find the largest connected component to remove noise + labeled_mask, num_features = label(mask, return_num=True, connectivity=3) + if num_features > 0: + component_sizes = np.bincount(labeled_mask.ravel())[1:] # ignore background + largest_component_label = np.argmax(component_sizes) + 1 + largest_component_mask = labeled_mask == largest_component_label + comp_idx = np.argwhere(largest_component_mask) + else: # Should not happen if mask.any() is true, but as a safeguard + comp_idx = np.argwhere(mask) + + # 3. Calculate the centroid of the largest component + center_in_top = np.mean(comp_idx, axis=0) + + # 4. Adjust center to be in the original full-image coordinate system + center_full_image = center_in_top + np.array([0, 0, crop_z_start]) + return center_full_image.astype(int) + +def crop_neck_region_sitk( + ct_sitk: sitk.Image, + pet_sitk: sitk.Image, + crop_box_size=(200, 200, 310), + z_top_fraction=0.75, + z_score_threshold=1.0, +): + # ------------------------------------------------------------------ + # 1. Convert PET to numpy for ROI-finding (SimpleITK gives z,y,x order) + # ------------------------------------------------------------------ + pet_np_zyx = sitk.GetArrayFromImage(pet_sitk) # [z, y, x] + pet_np_xyz = np.transpose(pet_np_zyx, (2, 1, 0)) # [x, y, z] + pet_tensor = torch.from_numpy(pet_np_xyz).float() + + # ------------------------------------------------------------------ + # 2. Determine the crop centre and bounding box in voxel coordinates + # ------------------------------------------------------------------ + crop_box_size = np.asarray(crop_box_size, dtype=int) + center = get_roi_center(pet_tensor, + z_top_fraction=z_top_fraction, + z_score_threshold=z_score_threshold) + + img_shape = np.asarray(pet_np_xyz.shape) + box_start = np.clip(center - crop_box_size // 2, 0, img_shape) + box_end = np.clip(box_start + crop_box_size, 0, img_shape) + + # Guard in case image is smaller than requested box + box_start = np.maximum(box_end - crop_box_size, 0) + + # SimpleITK wants index & size in (x, y, z) order + index = [int(i) for i in box_start] + size = [int(e - s) for s, e in zip(box_start, box_end)] + + # ------------------------------------------------------------------ + # 3. Crop with RegionOfInterest (origin adjusted automatically) + # ------------------------------------------------------------------ + ct_crop = sitk.RegionOfInterest(ct_sitk, size=size, index=index) + pet_crop = sitk.RegionOfInterest(pet_sitk, size=size, index=index) + + return ct_crop, pet_crop, box_start, box_end + +def get_preprocessing_transforms(keys, final_size=(200, 200, 310)): + """ + Defines the sequence of deterministic transforms to be applied to each case. + This version includes Sigmoid activation for CT and PET. + + Args: + keys (list): List of keys ('ct', 'pet', 'label') to apply transforms to. + final_size (tuple): The final spatial size to pad the images to after cropping. + + Returns: + monai.transforms.Compose: The composition of all preprocessing transforms. 
+ """ + return Compose([ + # 3. Reorient all images to a standard 'RAS' orientation + Orientationd(keys=keys, axcodes="RAS"), + + # 4. Normalize images and apply sigmoid + # 4a. Re-scale CT intensity to [0, 1] range. + ScaleIntensityRanged( + keys=["ct"], a_min=-250, a_max=250, b_min=-6.0, b_max=6.0, clip=True + ), + # 4b. Normalize PET to zero mean, unit variance. + NormalizeIntensityd(keys=["pet"], nonzero=True, channel_wise=True), + + # # ========================================================================= + # # CONFIRMATION: Sigmoid is applied here to CT and PET as a soft clamp. + # # ========================================================================= + # Activationsd(keys=["ct", "pet"], sigmoid=True), + + # 5. Crop away empty background based on CT and then pad all to a uniform size. + CropForegroundd(keys=keys, source_key="ct", allow_smaller=True), + SpatialPadd(keys=keys, spatial_size=final_size, method="end"), + ]) + +def apply_monai_transforms(ct_sitk: sitk.Image, + pt_sitk: sitk.Image, + final_size = (310, 200, 200)): + """ + Run the deterministic MONAI preprocessing on in-memory SimpleITK volumes. + + Parameters + ---------- + ct_sitk, pt_sitk : sitk.Image + Aligned, resampled CT / PET volumes. + final_size : tuple[int, int, int] + Target padded size (passed through to get_preprocessing_transforms). + + Returns + ------- + ct_t, pet_t : monai.data.MetaTensor (shape = [C, H, W, D]) + meta : dict (spacing, origin, direction, â€Ļ) + """ + # ------------------------------------------------------------------ + # 1. Wrap into MetaTensors so MONAI keeps spatial metadata + # ------------------------------------------------------------------ + ct_mt = sitk_to_metatensor(ct_sitk) + + pet_mt = sitk_to_metatensor(pt_sitk) + data = {"ct": ct_mt, "pet": pet_mt} + + + # ------------------------------------------------------------------ + # 2. Build the preprocessing Compose + # ------------------------------------------------------------------ + xforms = get_preprocessing_transforms(keys=["ct", "pet"], + final_size=final_size) + out = xforms(data) + # ------------------------------------------------------------------ + # 3. 
Execute transforms + # ------------------------------------------------------------------ + ct_proc = out["ct"] # MetaTensor, 1×H×W×D + pet_proc = out["pet"] + meta = ct_proc.meta # <— this is the live dict + + + return ct_proc, pet_proc, meta + diff --git a/Task_1/resources/utils.py b/Task_1/resources/utils.py new file mode 100644 index 0000000..0aeedd8 --- /dev/null +++ b/Task_1/resources/utils.py @@ -0,0 +1,179 @@ +from resources.configs import SegResNetConfig +from resources.models import SegResNetModel +import torch +import SimpleITK as sitk +import os +import numpy as np +from monai.inferers import sliding_window_inference + + +def load_model_from_checkpoint(checkpoint_path, device='cuda'): + """Load model from checkpoint with proper architecture reconstruction.""" + print(f"Loading checkpoint from: {checkpoint_path}") + + if not os.path.exists(checkpoint_path): + print(f"Error: Checkpoint file not found: {checkpoint_path}") + return None, None + + try: + # Load checkpoint + checkpoint = torch.load(checkpoint_path, map_location=device) + + # Extract model config + if 'model_config' not in checkpoint: + print("Error: No model_config found in checkpoint") + return None, None + + model_config = checkpoint['model_config'] + print(f"Found model config for experiment: {model_config.get('experiment_name', 'unknown')}") + + # Create config object from saved dictionary + experiment_name = model_config.get('experiment_name', 'unet3d') + + # Recreate the config object + config = SegResNetConfig() + + + + # Update config with saved values + for key, value in model_config.items(): + if hasattr(config, key): + setattr(config, key, value) + + print(f"Recreated config for {experiment_name}") + print(f" Input channels: {config.input_channels}") + print(f" Num classes: {config.num_classes}") + print(f" Spatial size: {config.spatial_size}") + + + model = SegResNetModel(config) + + # Load the state dict + model.load_state_dict(checkpoint['model_state_dict']) + + # Move to device and set eval mode + model = model.to(device) + model.eval() + + print(f"✓ Model loaded successfully!") + print(f" Model type: {type(model).__name__}") + print(f" Parameters: {sum(p.numel() for p in model.parameters()):,}") + + return model, config + + except Exception as e: + print(f"Error loading model: {e}") + import traceback + traceback.print_exc() + return None, None + +def arrays_to_tensor(ct_vol, pet_vol): + """ + Cast CT & PET volumes (of several possible types) into + a single 5-D torch tensor [1, 2, Z, Y, X]. 
+ + Accepts: + â€ĸ SimpleITK.Image (assumed Z-Y-X grid) + â€ĸ np.ndarray (Z, Y, X) OR (1, Z, Y, X) + â€ĸ torch.Tensor / MetaTensor (1, Z, Y, X) + """ + + def to_tensor(vol): + # --- SimpleITK -> torch ----------------------------------------- + if isinstance(vol, sitk.Image): + vol = torch.from_numpy(sitk.GetArrayFromImage(vol)) # Z,Y,X + vol = vol.unsqueeze(0) # 1,Z,Y,X + return vol.float() + + # --- NumPy -> torch --------------------------------------------- + if isinstance(vol, np.ndarray): + vol = torch.from_numpy(vol) + if vol.ndim == 3: # Z,Y,X + vol = vol.unsqueeze(0) # 1,Z,Y,X + return vol.float() + + # --- torch / MetaTensor ----------------------------------------- + if isinstance(vol, torch.Tensor): + if vol.ndim == 3: # Z,Y,X (rare case) + vol = vol.unsqueeze(0) # 1,Z,Y,X + return vol.float() + + raise TypeError(f"Unsupported type {type(vol)}") + + ct_t = to_tensor(ct_vol) + pet_t = to_tensor(pet_vol) + + # Shape check: both must now be (1, Z, Y, X) + if ct_t.shape[1:] != pet_t.shape[1:]: + raise ValueError(f"Shape mismatch: CT {ct_t.shape} vs PET {pet_t.shape}") + + combined = torch.cat([ct_t, pet_t], dim=0) # (2, Z, Y, X) + return combined.unsqueeze(0) # (1, 2, Z, Y, X) + + + + +def run_inference(model, input_tensor, config, device='cuda', use_sliding_window=True): + """Run inference with the loaded model using sliding window.""" + print(f"Input tensor shape: {input_tensor.shape}") + + try: + # Move to device + input_tensor = input_tensor.to(device) + + # Run inference + if use_sliding_window: + print("Running sliding window inference...") + print(f" ROI size: {config.spatial_size}") + print(f" SW batch size: 4") + print(f" Overlap: 0.5") + + with torch.no_grad(): + output = sliding_window_inference( + inputs=input_tensor, + roi_size=config.spatial_size, + sw_batch_size=4, + predictor=model, + overlap=0.5, + mode="gaussian", + sigma_scale=0.125, + padding_mode="constant", + cval=0.0, + sw_device=device, + ) + else: + print("Running direct inference...") + with torch.no_grad(): + output = model(input_tensor) + + print(f"Output shape: {output.shape}") + print(f"Output min/max: {output.min().item():.4f} / {output.max().item():.4f}") + + # Apply softmax and get prediction + if output.shape[1] > 1: # Multi-class + probs = torch.softmax(output, dim=1) + prediction = probs.argmax(dim=1) + print(f"Using softmax + argmax for {output.shape[1]} classes") + else: # Binary with sigmoid + probs = torch.sigmoid(output) + prediction = (probs > 0.5).long() + print("Using sigmoid for binary classification") + + print(f"Prediction shape: {prediction.shape}") + unique_vals = torch.unique(prediction).tolist() + print(f"Unique values in prediction: {unique_vals}") + + # Calculate class distribution + total_voxels = prediction.numel() + for val in unique_vals: + count = (prediction == val).sum().item() + percentage = (count / total_voxels) * 100 + print(f" Class {val}: {count:,} voxels ({percentage:.2f}%)") + + return prediction.cpu().numpy() + + except Exception as e: + print(f"Error during inference: {e}") + import traceback + traceback.print_exc() + return None diff --git a/Task_1/test/.gitignore b/Task_1/test/.gitignore new file mode 100644 index 0000000..ea1472e --- /dev/null +++ b/Task_1/test/.gitignore @@ -0,0 +1 @@ +output/ diff --git a/Task_1/test/input/ehr.json b/Task_1/test/input/ehr.json new file mode 100644 index 0000000..8252151 --- /dev/null +++ b/Task_1/test/input/ehr.json @@ -0,0 +1,9 @@ +{ + "Age": 71, + "Gender": 1, + "Tobacco Consumption": 0, + "Alcohol Consumption": 1, + 
"Performance Status": 0, + "Treatment": 1, + "M-stage": "M0" +} \ No newline at end of file diff --git a/Task_1/test/input/images/ct/ade521d4-a145-44d7-b750-f0f2a1ef658c.mha b/Task_1/test/input/images/ct/ade521d4-a145-44d7-b750-f0f2a1ef658c.mha new file mode 100644 index 0000000..2feed32 Binary files /dev/null and b/Task_1/test/input/images/ct/ade521d4-a145-44d7-b750-f0f2a1ef658c.mha differ diff --git a/Task_1/test/input/images/pet/b6b1cc0a-de79-4c5d-a07c-988798b5fe25.mha b/Task_1/test/input/images/pet/b6b1cc0a-de79-4c5d-a07c-988798b5fe25.mha new file mode 100644 index 0000000..2feed32 Binary files /dev/null and b/Task_1/test/input/images/pet/b6b1cc0a-de79-4c5d-a07c-988798b5fe25.mha differ diff --git a/Task_2/.gitattributes b/Task_2/.gitattributes new file mode 100644 index 0000000..e69de29 diff --git a/Task_2/.gitignore b/Task_2/.gitignore new file mode 100644 index 0000000..4b43e87 --- /dev/null +++ b/Task_2/.gitignore @@ -0,0 +1,4 @@ +test/input/images/*/*.mha +resources/clinical_preprocessors.pkl +resources/ensemble_model.pt +Task_2/*.tar.gz \ No newline at end of file diff --git a/Task_2/Dockerfile b/Task_2/Dockerfile new file mode 100644 index 0000000..2aa0879 --- /dev/null +++ b/Task_2/Dockerfile @@ -0,0 +1,27 @@ +# Use a 'large' base container to show-case how to load pytorch (macOS) +# FROM --platform=linux/arm64 pytorch/pytorch AS example-task2-arm64 + +# Use a 'large' base container to show-case how to load pytorch and use the GPU (when enabled) (Linux and WSL) +FROM --platform=linux/amd64 pytorch/pytorch:2.6.0-cuda12.4-cudnn9-runtime AS example-task2-amd64 + +# Ensures that Python output to stdout/stderr is not buffered: prevents missing information when terminating +ENV PYTHONUNBUFFERED=1 + +RUN groupadd -r user && useradd -m --no-log-init -r -g user user +USER user + +WORKDIR /opt/app + +COPY --chown=user:user requirements.txt /opt/app/ +COPY --chown=user:user resources /opt/app/resources + +# You can add any Python dependencies to requirements.txt +RUN python -m pip install \ + --user \ + --no-cache-dir \ + --no-color \ + --requirement /opt/app/requirements.txt + +COPY --chown=user:user inference.py /opt/app/ + +ENTRYPOINT ["python", "inference.py"] diff --git a/Task_2/do_build.sh b/Task_2/do_build.sh new file mode 100755 index 0000000..2166012 --- /dev/null +++ b/Task_2/do_build.sh @@ -0,0 +1,22 @@ +#!/usr/bin/env bash + +# Stop at first error +set -e + +SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd ) +DOCKER_IMAGE_TAG="example-algorithm-sanity-check-task-2" + + +# Check if an argument is provided +if [ "$#" -eq 1 ]; then + DOCKER_IMAGE_TAG="$1" +fi + +# Build the container +docker build "$SCRIPT_DIR" \ + --tag "$DOCKER_IMAGE_TAG" 2>&1 + +# Build the Container when developing with macOS +# docker build "$SCRIPT_DIR" \ +# --platform=linux/arm64/v8 \ +# --tag "$DOCKER_IMAGE_TAG" 2>&1 \ No newline at end of file diff --git a/Task_2/do_save.sh b/Task_2/do_save.sh new file mode 100755 index 0000000..3986858 --- /dev/null +++ b/Task_2/do_save.sh @@ -0,0 +1,37 @@ +#!/usr/bin/env bash + +# Stop at first error +set -e + +SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd ) + +# Set default container name +DOCKER_IMAGE_TAG="example-algorithm-sanity-check-task-2" + +# Check if an argument is provided +if [ "$#" -eq 1 ]; then + DOCKER_IMAGE_TAG="$1" +fi + +echo "=+= (Re)build the container" +source "${SCRIPT_DIR}/do_build.sh" "$DOCKER_IMAGE_TAG" + +# Get the build information from the Docker image tag +build_timestamp=$( docker 
inspect --format='{{ .Created }}' "$DOCKER_IMAGE_TAG") + +if [ -z "$build_timestamp" ]; then + echo "Error: Failed to retrieve build information for container $DOCKER_IMAGE_TAG" + exit 1 +fi + +# Format the build information to remove special characters +formatted_build_info=$(echo $build_timestamp | sed -E 's/(.*)T(.*)\..*Z/\1_\2/' | sed 's/[-,:]/-/g') + +# Set the output filename with timestamp and build information +output_filename="${SCRIPT_DIR}/${DOCKER_IMAGE_TAG}_${formatted_build_info}.tar.gz" + +# Save the Docker container and gzip it +echo "Saving the container as ${output_filename}. This can take a while." +docker save "$DOCKER_IMAGE_TAG" | gzip -c > "$output_filename" + +echo "Container saved as ${output_filename}" \ No newline at end of file diff --git a/Task_2/do_test_run.sh b/Task_2/do_test_run.sh new file mode 100755 index 0000000..75e9346 --- /dev/null +++ b/Task_2/do_test_run.sh @@ -0,0 +1,80 @@ +#!/usr/bin/env bash + +# Stop at first error +set -e + +SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd ) +DOCKER_IMAGE_TAG="example-algorithm-sanity-check-task-2" + +# Check if an argument is provided +if [ "$#" -eq 1 ]; then + DOCKER_IMAGE_TAG="$1" +fi + +DOCKER_NOOP_VOLUME="${DOCKER_IMAGE_TAG}-volume" + +INPUT_DIR="${SCRIPT_DIR}/test/input" +OUTPUT_DIR="${SCRIPT_DIR}/test/output" + +echo "=+= (Re)build the container" +source "${SCRIPT_DIR}/do_build.sh" "$DOCKER_IMAGE_TAG" + +cleanup() { + echo "=+= Cleaning permissions ..." + # Ensure permissions are set correctly on the output + # This allows the host user (e.g. you) to access and handle these files + docker run --rm \ + --quiet \ + --volume "$OUTPUT_DIR":/output \ + --entrypoint /bin/sh \ + $DOCKER_IMAGE_TAG \ + -c "chmod -R -f o+rwX /output/* || true" +} + + +echo "=+= Cleaning up any earlier output" +if [ -d "$OUTPUT_DIR" ]; then + # Ensure permissions are setup correctly + # This allows for the Docker user to write to this location + rm -rf "${OUTPUT_DIR}"/* + chmod -f o+rwx "$OUTPUT_DIR" +else + mkdir -m o+rwx "$OUTPUT_DIR" +fi + +trap cleanup EXIT + +echo "=+= Doing a forward pass" +## Note the extra arguments that are passed here: +# '--network none' +# entails there is no internet connection +# 'gpus all' +# enables access to any GPUs present +# '--volume :/tmp' +# is added because on Grand Challenge this directory cannot be used to store permanent files +docker volume create "$DOCKER_NOOP_VOLUME" > /dev/null +docker run --rm \ + --network none \ + --gpus all \ + --volume "$INPUT_DIR":/input:ro \ + --volume "$OUTPUT_DIR":/output \ + --volume "$DOCKER_NOOP_VOLUME":/tmp \ + $DOCKER_IMAGE_TAG +docker volume rm "$DOCKER_NOOP_VOLUME" > /dev/null + + + + +# Ensure permissions are set correctly on the output +# This allows the host user (e.g. you) to access and handle these files +docker run --rm \ + --quiet \ + --env HOST_UID=`id -u` \ + --env HOST_GID=`id -g` \ + --volume "$OUTPUT_DIR":/output \ + alpine:latest \ + /bin/sh -c 'chown -R ${HOST_UID}:${HOST_GID} /output' + +echo "=+= Wrote results to ${OUTPUT_DIR}" + +echo "=+= Save this image for uploading via ./do_save.sh \"${DOCKER_IMAGE_TAG}\"" \ No newline at end of file diff --git a/Task_2/inference.py b/Task_2/inference.py new file mode 100644 index 0000000..47a896c --- /dev/null +++ b/Task_2/inference.py @@ -0,0 +1,119 @@ +""" +HECKTOR Survival Model Inference for Container Deployment +This script processes a single patient and outputs recurrence-free survival prediction. 
+""" + +from pathlib import Path +import json +from glob import glob +import SimpleITK +import numpy as np +import torch +from resources.utils import HecktorInferenceModel +import os + +INPUT_PATH = Path("/input") +OUTPUT_PATH = Path("/output") +RESOURCE_PATH = Path("resources") + +def run(): + """Main inference function.""" + # Read the input + input_pet_image = load_image_file_as_array( + location=INPUT_PATH / "images/pet", + ) + input_ct_image = load_image_file_as_array( + location=INPUT_PATH / "images/ct", + ) + input_electronic_health_record = load_json_file( + location=INPUT_PATH / "ehr.json", + ) + + # We don't use these for our model, but they're part of the interface + try: + ct_planning = load_image_file_as_array( + location=INPUT_PATH / "images/ct-planning", + ) + + except FileNotFoundError: + print(f"No image files found in {INPUT_PATH}/images/ct-planning") + ct_planning = None + + try: + rt_dose_map = load_image_file_as_array( + location=INPUT_PATH / "images/rt-dose", + ) + except FileNotFoundError: + print(f"No image files found in {INPUT_PATH}/images/rt-dose") + rt_dose_map = None + + # Show torch info + show_torch_cuda_info() + + try: + # Initialize the inference model + print("Loading HECKTOR survival model...") + model = HecktorInferenceModel(resource_path=RESOURCE_PATH) + + # Process the inputs and generate prediction + print("Processing patient data...") + output_recurrence_free_survival = model.predict_single_patient( + ct_image=input_ct_image, + pet_image=input_pet_image, + clinical_data=input_electronic_health_record + ) + + print(f"Predicted RFS risk score: {output_recurrence_free_survival:.6f}") + + except Exception as e: + print(f"Error during inference: {e}") + raise e + + # Save the output + write_json_file( + location=OUTPUT_PATH / "rfs.json", + content=float(output_recurrence_free_survival) + ) + + return 0 + +def load_json_file(*, location): + """Reads a json file.""" + with open(location, "r") as f: + return json.loads(f.read()) + +def write_json_file(*, location, content): + """Writes a json file.""" + with open(location, "w") as f: + f.write(json.dumps(content, indent=4)) + +def load_image_file_as_array(*, location): + """Use SimpleITK to read a file.""" + input_files = ( + glob(str(location / "*.tif")) + + glob(str(location / "*.tiff")) + + glob(str(location / "*.mha")) + + glob(str(location / "*.nii")) + + glob(str(location / "*.nii.gz")) + ) + + if not input_files: + raise FileNotFoundError(f"No image files found in {location}") + + result = SimpleITK.ReadImage(input_files[0]) + # Convert it to a Numpy array + return SimpleITK.GetArrayFromImage(result) + +def show_torch_cuda_info(): + """Display PyTorch CUDA information.""" + print("=+=" * 10) + print("Collecting Torch CUDA information") + print(f"Torch CUDA is available: {(available := torch.cuda.is_available())}") + if available: + print(f"\tnumber of devices: {torch.cuda.device_count()}") + print(f"\tcurrent device: {(current_device := torch.cuda.current_device())}") + print(f"\tproperties: {torch.cuda.get_device_properties(current_device)}") + print("=+=" * 10) + +if __name__ == "__main__": + raise SystemExit(run()) \ No newline at end of file diff --git a/Task_2/requirements.txt b/Task_2/requirements.txt new file mode 100644 index 0000000..e66a6f8 --- /dev/null +++ b/Task_2/requirements.txt @@ -0,0 +1,11 @@ +monai==1.5.0 +SimpleITK>=2.0.0 +numpy==2.0.2 +scipy>=1.7.0 +scikit-learn==1.5.2 +pandas>=1.3.0 +nibabel>=3.2.0 +scikit-survival>=0.2.0 +icare +optree>=0.13.0 +scikit-image>=0.19.0 \ No newline 
at end of file diff --git a/Task_2/resources/utils.py b/Task_2/resources/utils.py new file mode 100644 index 0000000..0a31014 --- /dev/null +++ b/Task_2/resources/utils.py @@ -0,0 +1,532 @@ +""" +HECKTOR Survival Model Utilities for Container Deployment +""" + +import os +import numpy as np +import pandas as pd +import torch +import pickle +import warnings +import SimpleITK as sitk +from pathlib import Path +from monai.transforms import ( + Compose, EnsureChannelFirst, ScaleIntensity, ToTensor, Resize +) +import torch.nn as nn +from monai.networks.nets import resnet18 +from monai.data import MetaTensor +from monai.utils.misc import ImageMetaKey +from skimage.measure import label +import math + + +class FusedFeatureExtractor(nn.Module): + """ + Feature extractor specifically designed for BaggedIcareSurvival. + Combines 3D medical imaging and clinical data into rich survival features. + """ + def __init__(self, clinical_feature_dim, feature_output_dim=128): + super().__init__() + + # Store dimensions for saving + self.clinical_feature_dim = clinical_feature_dim + self.feature_output_dim = feature_output_dim + + # 3D ResNet-18 for combined CT+PET input + self.imaging_backbone = resnet18( + spatial_dims=3, + n_input_channels=2, + num_classes=1, + ) + self.imaging_backbone.fc = nn.Identity() + + # Clinical data processor with deeper architecture + self.clinical_processor = nn.Sequential( + nn.Linear(clinical_feature_dim, 64), + nn.ReLU(), + nn.BatchNorm1d(64), + nn.Dropout(0.3), + nn.Linear(64, 64), + nn.ReLU(), + nn.BatchNorm1d(64), + nn.Dropout(0.2), + nn.Linear(64, 32), + nn.ReLU() + ) + + # Feature fusion with multiple pathways + self.feature_fusion = nn.Sequential( + nn.Linear(512 + 32, 512), + nn.ReLU(), + nn.BatchNorm1d(512), + nn.Dropout(0.4), + nn.Linear(512, 256), + nn.ReLU(), + nn.BatchNorm1d(256), + nn.Dropout(0.3), + nn.Linear(256, feature_output_dim) + ) + + # Risk prediction head for training guidance + self.risk_head = nn.Sequential( + nn.Linear(feature_output_dim, 64), + nn.ReLU(), + nn.Dropout(0.2), + nn.Linear(64, 1) + ) + + def forward(self, medical_images, clinical_features, return_risk=False): + # Extract imaging features + imaging_features = self.imaging_backbone(medical_images) + + # Process clinical features + clinical_features_processed = self.clinical_processor(clinical_features) + + # Combine and fuse + combined_features = torch.cat([imaging_features, clinical_features_processed], dim=1) + fused_features = self.feature_fusion(combined_features) + + if return_risk: + risk_scores = self.risk_head(fused_features).squeeze(-1) + return fused_features, risk_scores + + return fused_features + + +class HecktorInferenceModel: + """ + HECKTOR survival prediction model for container inference. + Processes single patients and returns RFS risk predictions. + """ + + def __init__(self, resource_path, resampling=(1.0, 1.0, 1.0), crop_box_size=[200, 200, 310]): + """ + Initialize the inference model. 
+ + Args: + resource_path: Path to resources directory containing model files + resampling: Tuple of (x,y,z) resampling resolution in mm + crop_box_size: List of [x,y,z] crop box size in mm + """ + self.resource_path = Path(resource_path) + self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') + self.resampling = resampling + self.crop_box_size = np.array(crop_box_size) + + # Model components + self.ensemble_data = None + self.fold_models = [] + self.clinical_preprocessors = None + + # Image preprocessing transforms (applied after resampling and cropping) + self.image_transforms = self._create_image_transforms() + + self._load_model_components() + + def _create_image_transforms(self): + """Create image preprocessing transforms.""" + return Compose([ + EnsureChannelFirst(channel_dim="no_channel"), + ScaleIntensity(), + Resize(spatial_size=(96, 96, 96)), + ToTensor() + ]) + + def _get_bounding_boxes(self, ct_sitk, pet_sitk): + """ + Get the bounding boxes of the CT and PET images. + This works since all images have the same direction. + """ + ct_origin = np.array(ct_sitk.GetOrigin()) + pet_origin = np.array(pet_sitk.GetOrigin()) + + ct_position_max = ct_origin + np.array(ct_sitk.GetSize()) * np.array(ct_sitk.GetSpacing()) + pet_position_max = pet_origin + np.array(pet_sitk.GetSize()) * np.array(pet_sitk.GetSpacing()) + + return np.concatenate([ + np.maximum(ct_origin, pet_origin), + np.minimum(ct_position_max, pet_position_max), + ], axis=0) + + def _resample_images(self, ct_array, pet_array): + """ + Resample CT and PET images to specified resolution using SimpleITK. + + Args: + ct_array: CT image as numpy array + pet_array: PET image as numpy array + + Returns: + Tuple of (resampled_ct_array, resampled_pet_array) + """ + # Convert numpy arrays to SimpleITK images + ct_sitk = sitk.GetImageFromArray(ct_array) + pet_sitk = sitk.GetImageFromArray(pet_array) + + # Set default spacing and origin if not present + ct_sitk.SetSpacing([1.0, 1.0, 1.0]) + pet_sitk.SetSpacing([1.0, 1.0, 1.0]) + ct_sitk.SetOrigin([0.0, 0.0, 0.0]) + pet_sitk.SetOrigin([0.0, 0.0, 0.0]) + + # Get bounding box for both modalities + bb = self._get_bounding_boxes(ct_sitk, pet_sitk) + size = np.round((bb[3:] - bb[:3]) / np.array(self.resampling)).astype(int) + + # Set up resampler + resampler = sitk.ResampleImageFilter() + resampler.SetOutputDirection([1, 0, 0, 0, 1, 0, 0, 0, 1]) + resampler.SetOutputSpacing(self.resampling) + resampler.SetOutputOrigin(bb[:3]) + resampler.SetSize([int(k) for k in size]) + + # Resample CT with B-spline interpolation + resampler.SetInterpolator(sitk.sitkBSpline) + ct_resampled = resampler.Execute(ct_sitk) + + # Resample PET with B-spline interpolation + pet_resampled = resampler.Execute(pet_sitk) + + # Convert back to numpy arrays + ct_resampled_array = sitk.GetArrayFromImage(ct_resampled) + pet_resampled_array = sitk.GetArrayFromImage(pet_resampled) + + return ct_resampled_array, pet_resampled_array + + def _get_roi_center(self, pet_tensor, z_top_fraction=0.75, z_score_threshold=1.0): + """ + Calculates the center of the largest high-intensity region in the top part of the PET scan. + """ + # 1. Isolate top of the scan based on the z-axis + image_shape_voxels = np.array(pet_tensor.shape) + crop_z_start = int(z_top_fraction * image_shape_voxels[2]) + top_of_scan = pet_tensor[..., crop_z_start:] + + # 2. 
Threshold to find high-intensity regions (potential brain/tumor) + # Using a small epsilon to avoid division by zero in blank images + mask = ((top_of_scan - top_of_scan.mean()) / (top_of_scan.std() + 1e-8)) > z_score_threshold + + if not mask.any(): + # If no pixels are above the threshold, fall back to the geometric center of the top part + warnings.warn("No high-intensity region found. Using geometric center of the upper scan region.") + center_in_top = (np.array(top_of_scan.shape) / 2).astype(int) + else: + # Find the largest connected component to remove noise + labeled_mask, num_features = label(mask, return_num=True, connectivity=3) + if num_features > 0: + component_sizes = np.bincount(labeled_mask.ravel())[1:] # ignore background + largest_component_label = np.argmax(component_sizes) + 1 + largest_component_mask = labeled_mask == largest_component_label + comp_idx = np.argwhere(largest_component_mask) + else: # Should not happen if mask.any() is true, but as a safeguard + comp_idx = np.argwhere(mask) + + # 3. Calculate the centroid of the largest component + center_in_top = np.mean(comp_idx, axis=0) + + # 4. Adjust center to be in the original full-image coordinate system + center_full_image = center_in_top + np.array([0, 0, crop_z_start]) + return center_full_image.astype(int) + + def _crop_neck_region(self, ct_array, pet_array): + """ + Crop the head and neck region from CT and PET images. + + Args: + ct_array: CT image as numpy array + pet_array: PET image as numpy array + + Returns: + Tuple of (cropped_ct_array, cropped_pet_array) + """ + # Convert to torch tensor for processing + pet_tensor = torch.from_numpy(pet_array).float() + + # Get box size in voxels (assuming 1mm spacing after resampling) + box_size_voxels = self.crop_box_size.astype(int) + + # 1. Find the robust center of the ROI using PET + center_voxels = self._get_roi_center(pet_tensor) + + # 2. 
Calculate crop box and handle boundaries safely + image_shape_voxels = np.array(pet_array.shape) + box_start = center_voxels - box_size_voxels // 2 + box_end = box_start + box_size_voxels + + # Clamp coordinates to ensure they are within the image boundaries + box_start = np.maximum(box_start, 0) + box_end = np.minimum(box_end, image_shape_voxels) + + # Recalculate start to handle cases where box goes over the 0-boundary + box_start = np.maximum(box_end - box_size_voxels, 0) + + box_start = box_start.astype(int) + box_end = box_end.astype(int) + + # Apply the crop to both CT and PET + ct_cropped = ct_array[box_start[0]:box_end[0], box_start[1]:box_end[1], box_start[2]:box_end[2]] + pet_cropped = pet_array[box_start[0]:box_end[0], box_start[1]:box_end[1], box_start[2]:box_end[2]] + + return ct_cropped, pet_cropped + + def _load_model_components(self): + """Load ensemble model and clinical preprocessors.""" + # Load ensemble model + ensemble_path = self.resource_path / "ensemble_model.pt" + if not ensemble_path.exists(): + raise FileNotFoundError(f"Ensemble model not found: {ensemble_path}") + + self.ensemble_data = torch.load(ensemble_path, map_location=self.device, weights_only=False) + + # Initialize fold models + self.fold_models = [] + for fold_data in self.ensemble_data['fold_models']: + fold_id = fold_data['fold_id'] + weight = fold_data['weight'] + + # Create feature extractor + feature_extractor = FusedFeatureExtractor( + clinical_feature_dim=self.ensemble_data['clinical_feature_dim'], + feature_output_dim=self.ensemble_data['feature_output_dim'] + ).to(self.device) + + # Load weights + feature_extractor.load_state_dict(fold_data['feature_extractor_state_dict']) + feature_extractor.eval() + + # Store fold model + fold_model = { + 'fold_id': fold_id, + 'feature_extractor': feature_extractor, + 'icare_model': fold_data['icare_model'], + 'weight': weight + } + + self.fold_models.append(fold_model) + + # Load clinical preprocessors + preprocessors_path = self.resource_path / "clinical_preprocessors.pkl" + if preprocessors_path.exists(): + with open(preprocessors_path, 'rb') as f: + self.clinical_preprocessors = pickle.load(f) + else: + raise FileNotFoundError(f"Clinical preprocessors not found at {preprocessors_path}") + + def _preprocess_images(self, ct_image, pet_image): + """ + Preprocess CT and PET images with resampling, cropping, and MONAI transforms. 
+ + Args: + ct_image: CT image as numpy array + pet_image: PET image as numpy array + + Returns: + Combined CT+PET tensor ready for model inference + """ + print("Starting image preprocessing...") + + # Step 1: Resample images to consistent resolution + print(f"Resampling images to {self.resampling} mm resolution...") + ct_resampled, pet_resampled = self._resample_images(ct_image, pet_image) + print(f"Resampled shapes - CT: {ct_resampled.shape}, PET: {pet_resampled.shape}") + + # Step 2: Crop head and neck region + print(f"Cropping neck region with box size {self.crop_box_size} mm...") + ct_cropped, pet_cropped = self._crop_neck_region(ct_resampled, pet_resampled) + print(f"Cropped shapes - CT: {ct_cropped.shape}, PET: {pet_cropped.shape}") + + # Step 3: Apply MONAI transforms + print("Applying MONAI transforms...") + ct_transformed = self.image_transforms(ct_cropped) + pet_transformed = self.image_transforms(pet_cropped) + + # Step 4: Combine CT and PET channels + combined_image = torch.cat([ct_transformed, pet_transformed], dim=0) + + # Add batch dimension + combined_image = combined_image.unsqueeze(0) # [1, 2, H, W, D] + + print(f"Final preprocessed shape: {combined_image.shape}") + return combined_image.to(self.device) + + def _preprocess_clinical_data(self, clinical_data): + """ + Preprocess clinical data from EHR JSON. + + Args: + clinical_data: Dictionary containing clinical information + + Returns: + Preprocessed clinical features as tensor + """ + try: + # Extract values, handling NaN/None + def handle_nan(value): + if value is None or (isinstance(value, (int, float)) and math.isnan(value)): + return float('nan') + return value + + age = handle_nan(clinical_data.get('Age', None)) + gender = handle_nan(clinical_data.get('Gender', None)) + tobacco = handle_nan(clinical_data.get('Tobacco Consumption', None)) + alcohol = handle_nan(clinical_data.get('Alcohol Consumption', None)) + performance = handle_nan(clinical_data.get('Performance Status', None)) + treatment = handle_nan(clinical_data.get('Treatment', None)) + m_stage = clinical_data.get('M-stage', None) + + # Create DataFrame exactly like training expects + patient_df = pd.DataFrame({ + 'PatientID': ['TEMP_PATIENT'], + 'Age': [age], + 'Gender': [gender], + 'Tobacco Consumption': [tobacco], + 'Alcohol Consumption': [alcohol], + 'Performance Status': [performance], + 'M-stage': [m_stage], + 'Treatment': [treatment] + }) + + except Exception as e: + print(f"Error extracting clinical data: {e}") + raise e + + # Use the existing preprocessing method + clinical_result = self.preprocess_test_clinical_data(patient_df) + processed_features = clinical_result['features']['TEMP_PATIENT'] + + return torch.tensor(processed_features, dtype=torch.float32).unsqueeze(0).to(self.device) + + def preprocess_test_clinical_data(self, dataframe): + """ + Preprocess clinical data using the same parameters as training. 
+ """ + # All clinical features (same as training) + ALL_CLINICAL_FEATURES = [ + "Age", "Gender", "Tobacco Consumption", "Alcohol Consumption", + "Performance Status", "M-stage", "Treatment" + ] + + CATEGORICAL_FEATURES = [ + "Gender", "Tobacco Consumption", "Alcohol Consumption", + "Performance Status", "M-stage", "Treatment" + ] + + feature_subset = dataframe[ALL_CLINICAL_FEATURES].copy() + + # Handle Age using training parameters + age_median = self.clinical_preprocessors['age_median'] + age_scaler = self.clinical_preprocessors['age_scaler'] + + feature_subset["Age"] = feature_subset["Age"].fillna(age_median) + age_scaled = age_scaler.transform(feature_subset[["Age"]]) + + # Handle categorical features + categorical_data = feature_subset[CATEGORICAL_FEATURES].copy() + for col in CATEGORICAL_FEATURES: + categorical_data[col] = categorical_data[col].fillna('Unknown') + categorical_data[col] = categorical_data[col].astype(str) + + # Apply one-hot encoding (same structure as training) + categorical_encoded = pd.get_dummies( + categorical_data, + columns=CATEGORICAL_FEATURES, + prefix=CATEGORICAL_FEATURES, + dummy_na=False, + drop_first=False + ) + + # Ensure same feature structure as training + training_categorical_columns = [col for col in self.clinical_preprocessors['categorical_columns']] + + # Add missing columns with zeros (ensure they are numeric) + for col in training_categorical_columns: + if col not in categorical_encoded.columns: + categorical_encoded[col] = 0 + + # Remove extra columns and reorder to match training + categorical_encoded = categorical_encoded[training_categorical_columns] + + # CRITICAL: Ensure all categorical features are numeric + categorical_encoded = categorical_encoded.astype(np.float32) + + # Process all patients + processed_features = {} + + for idx, row in dataframe.iterrows(): + patient_id = row["PatientID"] + patient_row_idx = dataframe.index.get_loc(idx) + + age_features = age_scaled[patient_row_idx].flatten().astype(np.float32) + categorical_features = categorical_encoded.iloc[patient_row_idx].values.astype(np.float32) + + complete_features = np.concatenate([age_features, categorical_features]).astype(np.float32) + + # Verify no object types + if complete_features.dtype == np.object_: + # Convert any remaining object types to float + complete_features = complete_features.astype(np.float32) + + processed_features[patient_id] = complete_features + + return { + 'features': processed_features, + 'preprocessors': self.clinical_preprocessors + } + + def predict_single_patient(self, ct_image, pet_image, clinical_data): + """ + Predict RFS risk for a single patient. 
+ + Args: + ct_image: CT image as numpy array + pet_image: PET image as numpy array + clinical_data: Clinical data as dictionary + + Returns: + RFS risk prediction as float + """ + # Preprocess images (includes resampling, cropping, and MONAI transforms) + image_tensor = self._preprocess_images(ct_image, pet_image) + + # Preprocess clinical data + clinical_tensor = self._preprocess_clinical_data(clinical_data) + + # Get predictions from all folds + fold_predictions = [] + fold_weights = [] + + for fold_model in self.fold_models: + fold_id = fold_model['fold_id'] + weight = fold_model['weight'] + + # Extract features + with torch.no_grad(): + features = fold_model['feature_extractor'](image_tensor, clinical_tensor) + features_np = features.cpu().numpy() + + # Get prediction from icare model + prediction = fold_model['icare_model'].predict(features_np)[0] + + fold_predictions.append(prediction) + fold_weights.append(weight) + + # Combine predictions + fold_predictions = np.array(fold_predictions) + combination_method = self.ensemble_data['combination_method'] + + if combination_method == "median": + final_prediction = np.median(fold_predictions) + elif combination_method == "average": + final_prediction = np.mean(fold_predictions) + elif combination_method == "weighted_average": + fold_weights = np.array(fold_weights) + normalized_weights = fold_weights / np.sum(fold_weights) + final_prediction = np.average(fold_predictions, weights=normalized_weights) + elif combination_method == "best_fold": + best_fold_idx = np.argmax(fold_weights) + final_prediction = fold_predictions[best_fold_idx] + else: + final_prediction = np.median(fold_predictions) + + return float(final_prediction) \ No newline at end of file diff --git a/Task_2/test/input/ehr.json b/Task_2/test/input/ehr.json new file mode 100644 index 0000000..2a8d8c3 --- /dev/null +++ b/Task_2/test/input/ehr.json @@ -0,0 +1,9 @@ +{ "CenterID": 5, + "Age": 56, + "Gender": 1, + "Tobacco Consumption": 0, + "Alcohol Consumption": 1, + "Performance Status": 0, + "Treatment": 0, + "M-stage": "M0" +} diff --git a/Task_2/test/output/rfs.json b/Task_2/test/output/rfs.json new file mode 100644 index 0000000..c68bfae --- /dev/null +++ b/Task_2/test/output/rfs.json @@ -0,0 +1 @@ +-0.39611914675643234 \ No newline at end of file diff --git a/Task_3/.gitattributes b/Task_3/.gitattributes new file mode 100644 index 0000000..e61a56b --- /dev/null +++ b/Task_3/.gitattributes @@ -0,0 +1 @@ +resources/checkpoints/* filter=lfs diff=lfs merge=lfs -text \ No newline at end of file diff --git a/Task_3/.gitignore b/Task_3/.gitignore new file mode 100644 index 0000000..ccca3e2 --- /dev/null +++ b/Task_3/.gitignore @@ -0,0 +1,3 @@ +resources/checkpoints/* +!resources/checkpoints/.gitkeep + diff --git a/Task_3/Dockerfile b/Task_3/Dockerfile new file mode 100644 index 0000000..5dafbca --- /dev/null +++ b/Task_3/Dockerfile @@ -0,0 +1,27 @@ +# Use a 'large' base container to show-case how to load pytorch (macOS) +# FROM --platform=linux/arm64 pytorch/pytorch AS example-task2-arm64 + +# Use a 'large' base container to show-case how to load pytorch and use the GPU (when enabled) (Linux and WSL) +FROM --platform=linux/amd64 pytorch/pytorch:2.6.0-cuda12.4-cudnn9-runtime AS example-task3-amd64 + +# Ensures that Python output to stdout/stderr is not buffered: prevents missing information when terminating +ENV PYTHONUNBUFFERED=1 + +RUN groupadd -r user && useradd -m --no-log-init -r -g user user +USER user + +WORKDIR /opt/app + +COPY --chown=user:user requirements.txt /opt/app/ 
+COPY --chown=user:user resources /opt/app/resources + +# You can add any Python dependencies to requirements.txt +RUN python -m pip install \ + --user \ + --no-cache-dir \ + --no-color \ + --requirement /opt/app/requirements.txt + +COPY --chown=user:user inference.py /opt/app/ + +ENTRYPOINT ["python", "inference.py"] diff --git a/Task_3/do_build.sh b/Task_3/do_build.sh new file mode 100644 index 0000000..0091dfe --- /dev/null +++ b/Task_3/do_build.sh @@ -0,0 +1,18 @@ +#!/usr/bin/env bash + +# Stop at first error +set -e + +SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd ) +DOCKER_IMAGE_TAG="example-algorithm-sanity-check-task-3" + + +# Check if an argument is provided +if [ "$#" -eq 1 ]; then + DOCKER_IMAGE_TAG="$1" +fi + +# Note: the build-arg is JUST for the workshop +docker build "$SCRIPT_DIR" \ + --platform=linux/amd64 \ + --tag "$DOCKER_IMAGE_TAG" 2>&1 \ No newline at end of file diff --git a/Task_3/do_save.sh b/Task_3/do_save.sh new file mode 100644 index 0000000..8b93e9e --- /dev/null +++ b/Task_3/do_save.sh @@ -0,0 +1,37 @@ +#!/usr/bin/env bash + +# Stop at first error +set -e + +SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd ) + +# Set default container name +DOCKER_IMAGE_TAG="example-algorithm-sanity-check-task-3" + +# Check if an argument is provided +if [ "$#" -eq 1 ]; then + DOCKER_IMAGE_TAG="$1" +fi + +echo "=+= (Re)build the container" +source "${SCRIPT_DIR}/do_build.sh" "$DOCKER_IMAGE_TAG" + +# Get the build information from the Docker image tag +build_timestamp=$( docker inspect --format='{{ .Created }}' "$DOCKER_IMAGE_TAG") + +if [ -z "$build_timestamp" ]; then + echo "Error: Failed to retrieve build information for container $DOCKER_IMAGE_TAG" + exit 1 +fi + +# Format the build information to remove special characters +formatted_build_info=$(echo $build_timestamp | sed -E 's/(.*)T(.*)\..*Z/\1_\2/' | sed 's/[-,:]/-/g') + +# Set the output filename with timestamp and build information +output_filename="${SCRIPT_DIR}/${DOCKER_IMAGE_TAG}_${formatted_build_info}.tar.gz" + +# Save the Docker container and gzip it +echo "Saving the container as ${output_filename}. This can take a while." +docker save "$DOCKER_IMAGE_TAG" | gzip -c > "$output_filename" + +echo "Container saved as ${output_filename}" \ No newline at end of file diff --git a/Task_3/do_test_run.sh b/Task_3/do_test_run.sh new file mode 100644 index 0000000..272678a --- /dev/null +++ b/Task_3/do_test_run.sh @@ -0,0 +1,78 @@ +#!/usr/bin/env bash + +# Stop at first error +set -e + +SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd ) +DOCKER_IMAGE_TAG="example-algorithm-sanity-check-task-3" + +# Check if an argument is provided +if [ "$#" -eq 1 ]; then + DOCKER_IMAGE_TAG="$1" +fi + +DOCKER_NOOP_VOLUME="${DOCKER_IMAGE_TAG}-volume" + +INPUT_DIR="${SCRIPT_DIR}/test/input" +OUTPUT_DIR="${SCRIPT_DIR}/test/output" + +echo "=+= (Re)build the container" +source "${SCRIPT_DIR}/do_build.sh" "$DOCKER_IMAGE_TAG" + +cleanup() { + echo "=+= Cleaning permissions ..." + # Ensure permissions are set correctly on the output + # This allows the host user (e.g. 
you) to access and handle these files + docker run --rm \ + --quiet \ + --volume "$OUTPUT_DIR":/output \ + --entrypoint /bin/sh \ + $DOCKER_IMAGE_TAG \ + -c "chmod -R -f o+rwX /output/* || true" +} + + +echo "=+= Cleaning up any earlier output" +if [ -d "$OUTPUT_DIR" ]; then + # Ensure permissions are setup correctly + # This allows for the Docker user to write to this location + rm -rf "${OUTPUT_DIR}"/* + chmod -f o+rwx "$OUTPUT_DIR" +else + mkdir -m o+rwx "$OUTPUT_DIR" +fi + +trap cleanup EXIT + +echo "=+= Doing a forward pass" +## Note the extra arguments that are passed here: +# '--network none' +# entails there is no internet connection +# 'gpus all' +# enables access to any GPUs present +# '--volume :/tmp' +# is added because on Grand Challenge this directory cannot be used to store permanent files +docker volume create "$DOCKER_NOOP_VOLUME" > /dev/null +docker run --rm \ + --platform=linux/amd64 \ + --network none \ + --gpus all \ + --volume "$INPUT_DIR":/input:ro \ + --volume "$OUTPUT_DIR":/output \ + --volume "$DOCKER_NOOP_VOLUME":/tmp \ + $DOCKER_IMAGE_TAG +docker volume rm "$DOCKER_NOOP_VOLUME" > /dev/null + +# Ensure permissions are set correctly on the output +# This allows the host user (e.g. you) to access and handle these files +docker run --rm \ + --quiet \ + --env HOST_UID=`id -u` \ + --env HOST_GID=`id -g` \ + --volume "$OUTPUT_DIR":/output \ + alpine:latest \ + /bin/sh -c 'chown -R ${HOST_UID}:${HOST_GID} /output' + +echo "=+= Wrote results to ${OUTPUT_DIR}" + +echo "=+= Save this image for uploading via ./do_save.sh \"${DOCKER_IMAGE_TAG}\"" \ No newline at end of file diff --git a/Task_3/inference.py b/Task_3/inference.py new file mode 100644 index 0000000..b6f466b --- /dev/null +++ b/Task_3/inference.py @@ -0,0 +1,119 @@ +""" +The following is a simple example algorithm. + +It is meant to run within a container. + +To run it locally, you can call the following bash script: + + ./do_test_run.sh + +This will start the inference and reads from ./test/input and outputs to ./test/output + +To save the container and prep it for upload to Grand-Challenge.org you can call: + + ./do_save.sh + +Any container that shows the same behavior will do, this is purely an example of how one COULD do it. + +Happy programming! 
+""" +from pathlib import Path +import json +from glob import glob +import SimpleITK + +# Paths inside the container, do not change +INPUT_PATH = Path("/input") +OUTPUT_PATH = Path("/output") +RESOURCE_PATH = Path("resources") + +# Import the necessary functions from the resources +from resources.model import MultiModalResNet +from resources.hecktor_inference import prepare_input_tensor, preprocess_ehr, run_inference, resample_images, crop_neck_region +import torch +from joblib import load + + +def run(): + # Read the input + input_ct_image = load_image_file_as_array( + location=INPUT_PATH / "images/ct", + ) + input_electronic_health_record = load_json_file( + location=INPUT_PATH / "ehr.json", + ) + input_pet_image = load_image_file_as_array( + location=INPUT_PATH / "images/pet", + ) + + _show_torch_cuda_info() + + # Preprocess the inputs + ct_resampled_array, pet_resampled_array = resample_images( + ct_array=input_ct_image, + pet_array=input_pet_image, + ) + + ct_cropped, pet_cropped = crop_neck_region( + ct_array=ct_resampled_array, + pet_array=pet_resampled_array, + ) + + x_img = prepare_input_tensor(ct_cropped, pet_cropped) + + scaler = load(RESOURCE_PATH / "checkpoints/scaler.joblib") + ohe = load(RESOURCE_PATH / "checkpoints/ohe.joblib") + x_clin = preprocess_ehr(input_electronic_health_record, scaler, ohe) + + # Load HECKTOR model and make prediction + model = MultiModalResNet(clin_feat_dim=x_clin.shape[1], num_classes=2).cuda() + model.load_state_dict(torch.load(RESOURCE_PATH / "checkpoints/best_model.pt")) + + output_hpv_status = run_inference(model=model, x_img=x_img, x_clin=x_clin) + + # Save your output + write_json_file(location=OUTPUT_PATH / "hpv-status.json", content=output_hpv_status) + + return 0 + + +def load_json_file(*, location): + # Reads a json file + with open(location, "r") as f: + return json.loads(f.read()) + + +def write_json_file(*, location, content): + # Writes a json file + with open(location, "w") as f: + f.write(json.dumps(content, indent=4)) + + +def load_image_file_as_array(*, location): + # Use SimpleITK to read a file + input_files = ( + glob(str(location / "*.tif")) + + glob(str(location / "*.tiff")) + + glob(str(location / "*.mha")) + ) + result = SimpleITK.ReadImage(input_files[0]) + + # Convert it to a Numpy array + return SimpleITK.GetArrayFromImage(result) + + +def _show_torch_cuda_info(): + import torch + + print("=+=" * 10) + print("Collecting Torch CUDA information") + print(f"Torch CUDA is available: {(available := torch.cuda.is_available())}") + if available: + print(f"\tnumber of devices: {torch.cuda.device_count()}") + print(f"\tcurrent device: { (current_device := torch.cuda.current_device())}") + print(f"\tproperties: {torch.cuda.get_device_properties(current_device)}") + print("=+=" * 10) + + +if __name__ == "__main__": + raise SystemExit(run()) diff --git a/Task_3/requirements.txt b/Task_3/requirements.txt new file mode 100644 index 0000000..4d8c5e4 --- /dev/null +++ b/Task_3/requirements.txt @@ -0,0 +1,7 @@ +SimpleITK +pandas +numpy +monai +joblib +scikit-learn +scikit-image \ No newline at end of file diff --git a/Task_3/resources/checkpoints/.gitkeep b/Task_3/resources/checkpoints/.gitkeep new file mode 100644 index 0000000..e69de29 diff --git a/Task_3/resources/hecktor_inference.py b/Task_3/resources/hecktor_inference.py new file mode 100644 index 0000000..4670b2f --- /dev/null +++ b/Task_3/resources/hecktor_inference.py @@ -0,0 +1,196 @@ +import torch +import numpy as np +from monai.transforms import Resize +import pandas as pd 
+import SimpleITK as sitk +import warnings +from skimage.measure import label + +def get_bounding_boxes(ct_sitk, pet_sitk): + """ + Get the bounding boxes of the CT and PET images. + This works since all images have the same direction. + """ + ct_origin = np.array(ct_sitk.GetOrigin()) + pet_origin = np.array(pet_sitk.GetOrigin()) + + ct_position_max = ct_origin + np.array(ct_sitk.GetSize()) * np.array(ct_sitk.GetSpacing()) + pet_position_max = pet_origin + np.array(pet_sitk.GetSize()) * np.array(pet_sitk.GetSpacing()) + + return np.concatenate([ + np.maximum(ct_origin, pet_origin), + np.minimum(ct_position_max, pet_position_max), + ], axis=0) + +def resample_images(ct_array, pet_array): + """ + Resample CT and PET images to specified resolution using SimpleITK. + + Args: + ct_array: CT image as numpy array + pet_array: PET image as numpy array + + Returns: + Tuple of (resampled_ct_array, resampled_pet_array) + """ + # Convert numpy arrays to SimpleITK images + ct_sitk = sitk.GetImageFromArray(ct_array) + pet_sitk = sitk.GetImageFromArray(pet_array) + + # Set default spacing and origin if not present + ct_sitk.SetSpacing([1.0, 1.0, 1.0]) + pet_sitk.SetSpacing([1.0, 1.0, 1.0]) + ct_sitk.SetOrigin([0.0, 0.0, 0.0]) + pet_sitk.SetOrigin([0.0, 0.0, 0.0]) + + # Get bounding box for both modalities + bb = get_bounding_boxes(ct_sitk, pet_sitk) + resampling= (1.0, 1.0, 1.0) + size = np.round((bb[3:] - bb[:3]) / np.array(resampling)).astype(int) + + # Set up resampler + resampler = sitk.ResampleImageFilter() + resampler.SetOutputDirection([1, 0, 0, 0, 1, 0, 0, 0, 1]) + resampler.SetOutputSpacing(resampling) + resampler.SetOutputOrigin(bb[:3]) + resampler.SetSize([int(k) for k in size]) + + # Resample CT with B-spline interpolation + resampler.SetInterpolator(sitk.sitkBSpline) + ct_resampled = resampler.Execute(ct_sitk) + + # Resample PET with B-spline interpolation + pet_resampled = resampler.Execute(pet_sitk) + + # Convert back to numpy arrays + ct_resampled_array = sitk.GetArrayFromImage(ct_resampled) + pet_resampled_array = sitk.GetArrayFromImage(pet_resampled) + + return ct_resampled_array, pet_resampled_array + +def get_roi_center(pet_tensor, z_top_fraction=0.75, z_score_threshold=1.0): + """ + Calculates the center of the largest high-intensity region in the top part of the PET scan. + """ + # 1. Isolate top of the scan based on the z-axis + image_shape_voxels = np.array(pet_tensor.shape) + crop_z_start = int(z_top_fraction * image_shape_voxels[2]) + top_of_scan = pet_tensor[..., crop_z_start:] + + # 2. Threshold to find high-intensity regions (potential brain/tumor) + # Using a small epsilon to avoid division by zero in blank images + mask = ((top_of_scan - top_of_scan.mean()) / (top_of_scan.std() + 1e-8)) > z_score_threshold + + if not mask.any(): + # If no pixels are above the threshold, fall back to the geometric center of the top part + warnings.warn("No high-intensity region found. 
Using geometric center of the upper scan region.") + center_in_top = (np.array(top_of_scan.shape) / 2).astype(int) + else: + # Find the largest connected component to remove noise + labeled_mask, num_features = label(mask, return_num=True, connectivity=3) + if num_features > 0: + component_sizes = np.bincount(labeled_mask.ravel())[1:] # ignore background + largest_component_label = np.argmax(component_sizes) + 1 + largest_component_mask = labeled_mask == largest_component_label + comp_idx = np.argwhere(largest_component_mask) + else: # Should not happen if mask.any() is true, but as a safeguard + comp_idx = np.argwhere(mask) + + # 3. Calculate the centroid of the largest component + center_in_top = np.mean(comp_idx, axis=0) + + # 4. Adjust center to be in the original full-image coordinate system + center_full_image = center_in_top + np.array([0, 0, crop_z_start]) + return center_full_image.astype(int) + +def crop_neck_region(ct_array, pet_array): + """ + Crop the head and neck region from CT and PET images. + + Args: + ct_array: CT image as numpy array + pet_array: PET image as numpy array + + Returns: + Tuple of (cropped_ct_array, cropped_pet_array) + """ + # Convert to torch tensor for processing + pet_tensor = torch.from_numpy(pet_array).float() + + # Get box size in voxels (assuming 1mm spacing after resampling) + crop_box_size=[200, 200, 310] + crop_box_size = np.array(crop_box_size) + box_size_voxels = crop_box_size.astype(int) + + # 1. Find the robust center of the ROI using PET + center_voxels = get_roi_center(pet_tensor) + + # 2. Calculate crop box and handle boundaries safely + image_shape_voxels = np.array(pet_array.shape) + box_start = center_voxels - box_size_voxels // 2 + box_end = box_start + box_size_voxels + + # Clamp coordinates to ensure they are within the image boundaries + box_start = np.maximum(box_start, 0) + box_end = np.minimum(box_end, image_shape_voxels) + + # Recalculate start to handle cases where box goes over the 0-boundary + box_start = np.maximum(box_end - box_size_voxels, 0) + + box_start = box_start.astype(int) + box_end = box_end.astype(int) + + # Apply the crop to both CT and PET + ct_cropped = ct_array[box_start[0]:box_end[0], box_start[1]:box_end[1], box_start[2]:box_end[2]] + pet_cropped = pet_array[box_start[0]:box_end[0], box_start[1]:box_end[1], box_start[2]:box_end[2]] + + return ct_cropped, pet_cropped + +def preprocess_image(img): + img = img.astype(np.float32) + img = (img - img.min()) / (img.max() - img.min() + 1e-5) # normalize + img = torch.tensor(img) # [D, H, W] + img = Resize((96, 96, 96))(img.unsqueeze(0)) # [1, D, H, W] + return img # [1, D, H, W] + + +def prepare_input_tensor(ct_image, pet_image): + + ct_tensor = preprocess_image(ct_image) + pet_tensor = preprocess_image(pet_image) + + x_img = torch.cat([ct_tensor, pet_tensor], dim=0).unsqueeze(0).cuda() # [1,2,D,H,W] + + return x_img + +def preprocess_ehr(ehr, scaler, ohe): + # handle missing values and convert to DataFrame + df_ehr = pd.DataFrame([{ + "Age": ehr.get("Age", 0), + "Gender": ehr.get("Gender", "Unknown"), + "Tobacco Consumption": ehr.get("Tobacco Consumption", "Unknown"), + "Alcohol Consumption": ehr.get("Alcohol Consumption", "Unknown"), + "Performance Status": ehr.get("Performance Status", "Unknown"), + "M-stage": ehr.get("M-stage", "Unknown") + }]) + + num_data = df_ehr[["Age", "Gender"]].values + cat_data = df_ehr[["Tobacco Consumption", "Alcohol Consumption", "Performance Status", "M-stage"]].astype(str).fillna("Unknown").values + + + num_feats = 
scaler.transform(num_data) + cat_feats = ohe.transform(cat_data) + x_clin = np.hstack([num_feats, cat_feats]) + x_clin = torch.tensor(x_clin, dtype=torch.float32).cuda() + + return x_clin + + +def run_inference(model, x_img, x_clin): + model.eval() + + with torch.no_grad(): + logits = model(x_img, x_clin) + pred = logits.argmax(dim=1).item() # 0 or 1 + + return bool(pred) diff --git a/Task_3/resources/model.py b/Task_3/resources/model.py new file mode 100644 index 0000000..95a0835 --- /dev/null +++ b/Task_3/resources/model.py @@ -0,0 +1,35 @@ +import torch +import torch.nn as nn +from monai.networks.nets import resnet18 + +class MultiModalResNet(nn.Module): + def __init__(self, clin_feat_dim, num_classes=2): + super().__init__() + # 3D ResNet18 backbone for CT+PET + self.img_model = resnet18( + spatial_dims=3, + n_input_channels=2, + num_classes=2, + ) + self.img_model.fc = nn.Identity() + + # MLP for clinical data + self.clin_model = nn.Sequential( + nn.Linear(clin_feat_dim, 64), + nn.ReLU(), + nn.Linear(64, 32), + nn.ReLU() + ) + + # fusion + classification + self.classifier = nn.Sequential( + nn.Linear(512 + 32, 128), + nn.ReLU(), + nn.Linear(128, num_classes) + ) + + def forward(self, x_img, x_clin): + f_img = self.img_model(x_img) + f_clin = self.clin_model(x_clin) + f = torch.cat([f_img, f_clin], dim=1) + return self.classifier(f) \ No newline at end of file diff --git a/Task_3/test/.gitignore b/Task_3/test/.gitignore new file mode 100644 index 0000000..ea1472e --- /dev/null +++ b/Task_3/test/.gitignore @@ -0,0 +1 @@ +output/ diff --git a/Task_3/test/input/ehr.json b/Task_3/test/input/ehr.json new file mode 100644 index 0000000..8252151 --- /dev/null +++ b/Task_3/test/input/ehr.json @@ -0,0 +1,9 @@ +{ + "Age": 71, + "Gender": 1, + "Tobacco Consumption": 0, + "Alcohol Consumption": 1, + "Performance Status": 0, + "Treatment": 1, + "M-stage": "M0" +} \ No newline at end of file diff --git a/Task_3/test/input/images/ct/e780be06-9b6f-4e13-9eba-35bdef4177fe.mha b/Task_3/test/input/images/ct/e780be06-9b6f-4e13-9eba-35bdef4177fe.mha new file mode 100644 index 0000000..2feed32 Binary files /dev/null and b/Task_3/test/input/images/ct/e780be06-9b6f-4e13-9eba-35bdef4177fe.mha differ diff --git a/Task_3/test/input/images/pet/0cc1d7a4-27b4-4475-a7b3-32081e9f7bac.mha b/Task_3/test/input/images/pet/0cc1d7a4-27b4-4475-a7b3-32081e9f7bac.mha new file mode 100644 index 0000000..2feed32 Binary files /dev/null and b/Task_3/test/input/images/pet/0cc1d7a4-27b4-4475-a7b3-32081e9f7bac.mha differ diff --git a/assets/images/Add-Algorithm.png b/assets/images/Add-Algorithm.png new file mode 100644 index 0000000..ff7b068 Binary files /dev/null and b/assets/images/Add-Algorithm.png differ diff --git a/assets/images/Algorithm-Details.png b/assets/images/Algorithm-Details.png new file mode 100644 index 0000000..7a6a6a4 Binary files /dev/null and b/assets/images/Algorithm-Details.png differ diff --git a/assets/images/HECKTOR-main.jpeg b/assets/images/HECKTOR-main.jpeg new file mode 100644 index 0000000..d2d6f69 Binary files /dev/null and b/assets/images/HECKTOR-main.jpeg differ diff --git a/assets/images/Input-Output-Container.png b/assets/images/Input-Output-Container.png new file mode 100644 index 0000000..cb9910a Binary files /dev/null and b/assets/images/Input-Output-Container.png differ diff --git a/assets/images/New-Container.png b/assets/images/New-Container.png new file mode 100644 index 0000000..8fcd057 Binary files /dev/null and b/assets/images/New-Container.png differ diff --git a/assets/images/Phases.png 
b/assets/images/Phases.png new file mode 100644 index 0000000..768d132 Binary files /dev/null and b/assets/images/Phases.png differ diff --git a/assets/images/Select-Containers.png b/assets/images/Select-Containers.png new file mode 100644 index 0000000..a3c6635 Binary files /dev/null and b/assets/images/Select-Containers.png differ diff --git a/assets/images/Submission-Tab-1.png b/assets/images/Submission-Tab-1.png new file mode 100644 index 0000000..8411366 Binary files /dev/null and b/assets/images/Submission-Tab-1.png differ diff --git a/assets/images/Submission-Tab-2.png b/assets/images/Submission-Tab-2.png new file mode 100644 index 0000000..b75f910 Binary files /dev/null and b/assets/images/Submission-Tab-2.png differ diff --git a/assets/images/Upload-Containers.png b/assets/images/Upload-Containers.png new file mode 100644 index 0000000..d6f2ee5 Binary files /dev/null and b/assets/images/Upload-Containers.png differ diff --git a/assets/images/submissions.JPG b/assets/images/submissions.JPG new file mode 100644 index 0000000..e4dd052 Binary files /dev/null and b/assets/images/submissions.JPG differ diff --git a/assets/images/submit_algorithm.jpg b/assets/images/submit_algorithm.jpg new file mode 100644 index 0000000..71521da Binary files /dev/null and b/assets/images/submit_algorithm.jpg differ diff --git a/assets/logos/countdown.png b/assets/logos/countdown.png new file mode 100644 index 0000000..705393c Binary files /dev/null and b/assets/logos/countdown.png differ diff --git a/assets/logos/restrictions.svg b/assets/logos/restrictions.svg new file mode 100644 index 0000000..eac8c48 --- /dev/null +++ b/assets/logos/restrictions.svg @@ -0,0 +1,9 @@ + + + + + + + + + \ No newline at end of file diff --git a/submission-guidelines.md b/submission-guidelines.md new file mode 100644 index 0000000..f13b2ab --- /dev/null +++ b/submission-guidelines.md @@ -0,0 +1,60 @@ +## Submitting Docker Container + + +Assuming you have a verified Grand-Challenge account and have already registered for the HECKTOR challenge, you need to do two main steps to submit your algorithm to the challenge. First, you need to [upload the algorithm](#uplaod-your-algorithm) docker container to the Grand-Challenge platform. Then, you can follow the steps to [submit that algorithm](#submit-your-algorithm) to compete in any leaderboard or phases of the challenge. But before you proceed, make sure that you have read and understood the [participation policies](https://hecktor25.grand-challenge.org/participation-policies/). + +### 1- Upload your algorithm +> **IMPORTANT:** It is crucial to know that you have to submit different algorithms for different tasks of the challenge. Even if you are using the same method for all tasks, you have to upload your algorithm again because the Input and Output configurations for each tasks are different. + +In order to submit your algorithm, you first have to add it to the Grand-Challenge platform. To do so, you have to follow the following steps: + +- First, navigate to the [algorithm submission webpage](https://grand-challenge.org/algorithms/) and click on the "+ Add new algorithm" botttom: + +

+ +

+ +- Then you will be directed to the "Create Algorithm" page, where you have to choose the phase for which you are creating an algorithm from a drop-down list. Select the phase and click on "Create an Algorithm for this Phase". For more details on the different phases, please click [here](https://hecktor25.grand-challenge.org/submission-instructions/). + +

+ +

+ +- The next step is to enter the **Title** and **Job Description** for the algorithm. + +

+ +

+ +> **NOTE:** Since you can only create a limited number of algorithms, please make the title of your algorithm meaningful and avoid titles that include words such as "test" or "debug". In principle you will only need to create one algorithm for this phase. Once created, you can upload new container images for it as you improve your code, and even switch back to older container images as you see fit. + + +- After the algorithm has been created successfully, you need to attach your dockerized algorithm to it; only then can it be used for a challenge submission. To do so, click on the "Containers" tab in the left menu. + + +

+ +

+ + +- There are two ways to upload a container: either link your algorithm to a **GitHub** repository and create a new tag, or upload a valid algorithm container image directly (a minimal tagging sketch is shown below). + + +
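A minimal sketch of the GitHub route, assuming your algorithm has already been linked to a repository; the branch and tag names below are purely illustrative:

```bash
# Create and push an annotated tag; Grand Challenge can then build a new
# container image for the linked algorithm from this tag.
# (branch and tag names are examples, not prescribed by the challenge)
git checkout main
git pull
git tag -a v1.0.0 -m "HECKTOR algorithm submission"
git push origin v1.0.0
```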

+ +

+ +- To upload a container image, please make sure the file is a ```.tar.gz``` archive produced with ```docker save IMAGE | gzip -c > IMAGE.tar.gz``` (see the sketch below). For more details, please see how to [save the container](https://docs.docker.com/engine/reference/commandline/save/). + +- The expected input/output interface is also listed for each container; it defines which input and output files the container must read and generate for each task. For more details on the specific format, you can click on the ℹī¸ icon. + +
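For example, to package the Task 3 sanity-check image built from this repository (the image tag below is the default used by the ```do_build.sh```/```do_save.sh``` scripts; adjust it if you built your own image):

```bash
# Build the image and save it as a gzipped tarball (equivalent to ./do_save.sh)
docker build Task_3 --platform=linux/amd64 --tag example-algorithm-sanity-check-task-3
docker save example-algorithm-sanity-check-task-3 | gzip -c > example-algorithm-sanity-check-task-3.tar.gz

# Optional sanity check: make sure the archive loads back before uploading it
docker load < example-algorithm-sanity-check-task-3.tar.gz
```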

+ +

+ + +- Once you have uploaded your docker container, click on the **Save** button (shown below) and your algorithm will be complete and ready to submit to the challenge. Remember, the algorithm is only ready to submit once the status badge next to the upload description changes to **Active**. + +

+ +
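Before uploading, it is also a good idea to do a local end-to-end test of the container. A minimal sketch using the test scripts shipped in this repository (shown for Task 3; the same pattern applies to the other tasks):

```bash
# Run the container locally against the bundled test input
cd Task_3
./do_test_run.sh

# Inspect the generated prediction
cat test/output/hpv-status.json
```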