SPECTRE is a Transformer-based foundation model for 3D Computed Tomography (CT) scans, trained using self-supervised learning (SSL) and cross-modal vision–language alignment (VLA). It learns rich, generalizable representations of volumetric medical imaging data and can be fine-tuned for downstream tasks such as segmentation, classification, and anomaly detection.
SPECTRE has been trained on a large cohort of open-source CT scans of the human abdomen and thorax, as well as paired radiology reports and Electronic Health Record data, enabling it to capture representations that generalize across datasets and clinical settings.
This repository provides pretrained SPECTRE models together with tools for fine-tuning and evaluation.
The pretrained SPECTRE model can easily be imported as follows:
```python
from spectre import SpectreImageFeatureExtractor, MODEL_CONFIGS
import torch

config = MODEL_CONFIGS['spectre-large-pretrained']
model = SpectreImageFeatureExtractor.from_config(config)
model.eval()

# Dummy input: (batch, crops, channels, height, width, depth)
# For a (3 x 3 x 4) grid of (128 x 128 x 64) CT patches -> Total scan size (384 x 384 x 256)
x = torch.randn(1, 36, 1, 128, 128, 64)

with torch.no_grad():
    features = model(x, grid_size=(3, 3, 4))

print("Features shape:", features.shape)
```
Alternatively, you can download the weights of the separate components through HuggingFace using the following links:

| Architecture | Input Modality | Pretraining Objective | Model Weights |
|---|---|---|---|
| SPECTRE-ViT-Local | CT crops | SSL | Link |
| SPECTRE-ViT-Local | CT crops | SSL + VLA | Link |
| SPECTRE-ViT-Global | Embedded CT crops | VLA | Link |
| Qwen3-Embedding-0.6B LoRA | Text (radiology) | VLA | Link |
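If you prefer to fetch individual checkpoints programmatically rather than through the links above, the standard `huggingface_hub` client can be used. The repository ID and filename below are placeholders only; substitute the actual values from the table. How a downloaded checkpoint is loaded into a SPECTRE model depends on its format and the spectre API.

```python
from huggingface_hub import hf_hub_download

# Placeholder repository ID and filename -- replace them with the actual
# values behind the "Link" entries in the table above.
checkpoint_path = hf_hub_download(
    repo_id="<org>/<spectre-component>",
    filename="<checkpoint-file>",
)
print("Downloaded checkpoint to:", checkpoint_path)
```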
This repository is organized as follows:
- 🚀 `src/spectre/` – Contains the core package, including:
  - Pretraining methods
  - Model architectures
  - Data handling and transformations
- 🛠️ `src/spectre/configs/` – Stores configuration files for different training settings.
- 🔬 `experiments/` – Includes Python scripts for running various pretraining and downstream experiments.
- 🐳 `Dockerfile` – Defines the environment for running a local version of SPECTRE inside a container.
To get up and running with SPECTRE, simply install our package using pip:
```bash
pip install spectre-fm
```

or install the latest updates directly from GitHub:

```bash
pip install git+https://github.com/cclaess/SPECTRE.git
```
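A quick way to check that the installation works is to import the package and list the bundled model configurations; this assumes `MODEL_CONFIGS` is a plain mapping, as the indexing in the usage example above suggests:

```python
# Sanity check after installation: importing verifies the dependencies are in
# place, and listing MODEL_CONFIGS shows which model configurations ship with
# the package (assumes MODEL_CONFIGS behaves like a standard dict).
from spectre import SpectreImageFeatureExtractor, MODEL_CONFIGS

print(list(MODEL_CONFIGS))
```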
To facilitate deployment and reproducibility, SPECTRE can also be run using Docker. This allows you to set up a fully functional environment without manually installing dependencies for your own local copy of spectre.

First, ensure you have Docker installed. Then, clone and navigate to the repository to build the image:
```bash
git clone https://github.com/cclaess/SPECTRE
cd SPECTRE
docker build -t spectre-fm .
```

Once the image is built, you can start a container and execute scripts inside it. For example, to run a DINO pretraining experiment:
```bash
docker run --gpus all --rm -v "$(pwd):/mnt" spectre-fm \
    python3 experiments/pretraining/pretrain_dino.py \
    --config_file spectre/configs/dino_default.yaml \
    --output_dir /mnt/outputs/pretraining/dino/
```

- `--gpus all` enables GPU acceleration if available.
- `--rm` removes the container after execution.
- `-v "$(pwd):/mnt"` mounts the current directory inside the container.
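For debugging or ad-hoc experimentation you can also start an interactive shell in the same image instead of launching a script directly; this is standard Docker usage and assumes the image provides `bash`:

```bash
# Open an interactive shell in the SPECTRE image, with GPU access enabled
# and the current directory mounted at /mnt.
docker run --gpus all --rm -it -v "$(pwd):/mnt" spectre-fm bash
```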
- Code: MIT — see `LICENSE` (permissive; commercial use permitted).
- Pretrained model weights: CC-BY-NC-SA — non-commercial share-alike. The weights and any derivative models that include these weights are NOT cleared for commercial use. See `LICENSE_MODELS` for details and the precise license text.
Note: the pretrained weights are subject to the original dataset licenses. Users intending to use SPECTRE in commercial settings should verify dataset and model licensing and obtain any required permissions.
If you use SPECTRE in your research or wish to cite it, please use the following BibTeX entry of our preprint:
```bibtex
@misc{claessens_scaling_2025,
  title = {Scaling {Self}-{Supervised} and {Cross}-{Modal} {Pretraining} for {Volumetric} {CT} {Transformers}},
  url = {http://arxiv.org/abs/2511.17209},
  doi = {10.48550/arXiv.2511.17209},
  author = {Claessens, Cris and Viviers, Christiaan and D'Amicantonio, Giacomo and Bondarev, Egor and Sommen, Fons van der},
  year = {2025},
}
```
This project builds upon prior work in self-supervised learning, medical imaging, and transformer-based representation learning. We especially acknowledge MONAI for their awesome framework and the timm & lightly Python libraries for providing 2D PyTorch models (timm) and object-oriented self-supervised learning methods (lightly), from which we adapted parts of the code for 3D.

