Riku Murai* · Eric Dexheimer* · Andrew J. Davison
(* Equal Contribution)
Paper | Video | Project Page
conda create -n mast3r-slam python=3.11
conda activate mast3r-slam
Check the system's CUDA version with nvcc
nvcc --version
Install PyTorch built for the matching CUDA version:
# CUDA 11.8
conda install pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 pytorch-cuda=11.8 -c pytorch -c nvidia
# CUDA 12.1
conda install pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 pytorch-cuda=12.1 -c pytorch -c nvidia
# CUDA 12.4
conda install pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 pytorch-cuda=12.4 -c pytorch -c nvidia
Clone the repo and install the dependencies.
git clone https://github.com/rmurai0610/MASt3R-SLAM.git --recursive
# if you've cloned the repo without --recursive, run
# git submodule update --init --recursive
pip install -e thirdparty/mast3r
pip install -e thirdparty/in3d
pip install --no-build-isolation -e .
# Optionally install torchcodec for faster mp4 loading
pip install torchcodec==0.1
Set up the checkpoints for MASt3R and retrieval. The license for the checkpoints and more information on the datasets used are given here.
mkdir -p checkpoints/
wget https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth -P checkpoints/
wget https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_trainingfree.pth -P checkpoints/
wget https://download.europe.naverlabs.com/ComputerVision/MASt3R/MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_codebook.pkl -P checkpoints/
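Before running the system, it may help to confirm that all three files landed in `checkpoints/`. The following is a minimal sketch, not part of the repository: the function name `missing_checkpoints` is hypothetical, and it only checks that the files exist, not that the downloads are complete or uncorrupted.

```python
from pathlib import Path

# File names taken from the wget commands above.
CHECKPOINT_FILES = (
    "MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric.pth",
    "MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_trainingfree.pth",
    "MASt3R_ViTLarge_BaseDecoder_512_catmlpdpt_metric_retrieval_codebook.pkl",
)


def missing_checkpoints(checkpoint_dir: str = "checkpoints") -> list[str]:
    """Return the expected checkpoint file names that are absent from `checkpoint_dir`."""
    root = Path(checkpoint_dir)
    return [name for name in CHECKPOINT_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        print("Missing checkpoints:")
        for name in missing:
            print("  " + name)
    else:
        print("All checkpoints present.")
```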
We have primarily tested on Ubuntu. If you are using WSL, please check out the windows branch and then follow the installation steps above.
git checkout windows
This branch disables multiprocessing, which causes a shared-memory issue as discussed here.
bash ./scripts/download_tum.sh
python main.py --dataset datasets/tum/rgbd_dataset_freiburg1_room/ --config config/calib.yaml
Connect a RealSense camera to the PC and run:
python main.py --dataset realsense --config config/base.yaml
Our system can process either MP4 videos or folders containing RGB images.
python main.py --dataset <path/to/video>.mp4 --config config/base.yaml
python main.py --dataset <path/to/folder> --config config/base.yaml
If the calibration parameters are known, you can specify them in config/intrinsics.yaml and pass the file with --calib:
python main.py --dataset <path/to/video>.mp4 --config config/base.yaml --calib config/intrinsics.yaml
python main.py --dataset <path/to/folder> --config config/base.yaml --calib config/intrinsics.yaml
bash ./scripts/download_tum.sh
bash ./scripts/download_7_scenes.sh
bash ./scripts/download_euroc.sh
bash ./scripts/download_eth3d.sh
All evaluation scripts run our system in single-threaded, headless mode. Evaluations can be run with or without calibration:
bash ./scripts/eval_tum.sh
bash ./scripts/eval_tum.sh --no-calib
bash ./scripts/eval_7_scenes.sh
bash ./scripts/eval_7_scenes.sh --no-calib
bash ./scripts/eval_euroc.sh
bash ./scripts/eval_euroc.sh --no-calib
bash ./scripts/eval_eth3d.sh
Results from the released version may differ slightly from those in the paper, since this multi-processing version was developed afterwards. We ran all our experiments on an RTX 4090, and performance may differ on other GPUs.
We sincerely thank the developers and contributors of the many open-source projects that our code is built upon.
If you found this code/work useful in your own research, please consider citing the following:
title={{MASt3R-SLAM}: Real-Time Dense {SLAM} with {3D} Reconstruction Priors},
author={Murai, Riku and Dexheimer, Eric and Davison, Andrew J.},
journal={arXiv preprint},