3D Geometric Foundation Models (3R) with SLAM
Advances in Feed-Forward 3D Reconstruction and View Synthesis: A Survey, arXiv 2025 . [Paper ] [Website ]
3D Geometric Foundation Models (3R)
DUSt3R: Geometric 3D Vision Made Easy, CVPR 2024 . [Paper ] [Code ] [Website ]
MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion, ICLR 2025 . [Paper ] [Code ] [Website ]
LoRA3D: Low-Rank Self-Calibration of 3D Geometric Foundation Models, ICLR 2025 . [Paper ] [Website ]
(CUT3R) Continuous 3D Perception Model with Persistent State, CVPR 2025 . [Paper ] [Code ] [Website ]
Reloc3r: Large-Scale Training of Relative Camera Pose Regression for Generalizable, Fast, and Accurate Visual Localization, CVPR 2025 . [Paper ] [Code ]
DAS3R: Dynamics-Aware Gaussian Splatting for Static Scene Reconstruction, arXiv 2024 . [Paper ] [Code ] [Website ]
MASt3R-SfM: a Fully-Integrated Solution for Unconstrained Structure-from-Motion, 3DV 2025 . [Paper ] [Code ]
Splatt3R: Zero-shot Gaussian Splatting from Uncalibrated Image Pairs, arXiv 2024 . [Paper ] [Project ] [Code ]
SAB3R: Semantic-Augmented Backbone in 3D Reconstruction, arXiv 2024 . [Paper ]
No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images, ICLR 2025 . [Paper ] [Code ] [Website ]
Doppelgangers++: Improved Visual Disambiguation with Geometric 3D Features, CVPR 2025 . [Paper ] [Code ] [Project ]
(Spann3R) 3D Reconstruction with Spatial Memory, 3DV 2025 . [Paper ] [Code ] [Website ]
Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass, CVPR 2025 . [Paper ] [Website ] [Code ]
InstantSplat: Unbounded Sparse-view Pose-free Gaussian Splatting in 40 Seconds, arXiv 2025 . [Paper ] [Website ] [Code ]
SPARS3R: Semantic Prior Alignment and Regularization for Sparse 3D Reconstruction, arXiv 2024 . [Paper ] [Code ]
Align3R: Aligned Monocular Depth Estimation for Dynamic Videos, CVPR 2025 . [Paper ] [Code ] [Website ]
(MASt3R) Grounding Image Matching in 3D with MASt3R, ECCV 2024 . [Paper ] [Code ] [Website ]
VGGT: Visual Geometry Grounded Transformer, CVPR 2025 . [Paper ] [Code ] [Website ]
E3D-Bench: A Benchmark for End-to-End 3D Geometric Foundation Models, arXiv 2025 . [Paper ] [Code ] [Website ]
π^3: Scalable Permutation-Equivariant Visual Geometry Learning, arXiv 2025 . [Paper ] [Code ] [Website ]
Dens3R: A Foundation Model for 3D Geometry Prediction, ICCV 2025 . [Paper ] [Code ] [Website ]
LONG3R: Long Sequence Streaming 3D Reconstruction, ICCV 2025 . [Paper ] [Code ] [Website ]
PanoSplatt3R: Leveraging Perspective Pretraining for Generalized Unposed Wide-Baseline Panorama Reconstruction, ICCV 2025 . [Paper ] [Code ] [Website ]
Ov3R: Open-Vocabulary Semantic 3D Reconstruction from RGB Videos, arXiv 2025 . [Paper ]
G-CUT3R: Guided 3D Reconstruction with Camera and Depth Prior Integration, arXiv 2025 . [Paper ]
ViPE: Video Pose Engine for 3D Geometric Perception, arXiv 2025 . [Paper ] [Code ] [Website ]
FastVGGT: Training-Free Acceleration of Visual Geometry Transformer, arXiv 2025 . [Paper ]
SAIL-Recon: Large SfM by Augmenting Scene Regression with Localization, arXiv 2025 . [Paper ] [Code ] [Website ]
Streaming 4D Visual Geometry Transformer, arXiv 2025 . [Paper ] [Code ] [Website ]
MapAnything: Universal Feed-Forward Metric 3D Reconstruction, arXiv 2025 . [Paper ] [Code ] [Website ]
TTT3R: 3D Reconstruction as Test-Time Training, arXiv 2025 . [Paper ] [Code ] [Website ]
Co-Me: Confidence Guided Token Merging for Visual Geometric Transformers, arXiv 2025 . [Paper ] [Code ] [Website ]
MB3R: Accurate Feed-forward Metric-scale 3D Reconstruction with Backend, arXiv 2025 . [Paper ]
VG3T: Visual Geometry Grounded Gaussian Transformer, arXiv 2025 . [Paper ]
KV-Tracker: Real-Time Pose Tracking with Transformers, arXiv 2025 . [Paper ] [Website ]
V-DPM: 4D Video Reconstruction with Dynamic Point Maps, arXiv 2026 . [Paper ] [Website ] [Code ]
S-MUSt3R: Sliding Multi-view 3D Reconstruction, arXiv 2026 . [Paper ]
Flow3r: Factored Flow Prediction for Scalable Visual Geometry Learning, arXiv 2026 . [Paper ] [Website ] [Code ]
VGG-T3: Offline Feed-Forward 3D Reconstruction at Scale, CVPR 2026 . [Paper ]
OnlineX: Unified Online 3D Reconstruction and Understanding with Active-to-Stable State Evolution, CVPR Findings 2026 . [Paper ] [Website ]
ZipMap: Linear-Time Stateful 3D Reconstruction with Test-Time Training, CVPR 2026 . [Paper ] [Code ] [Website ]
Dark3R: Learning Structure from Motion in the Dark, arXiv 2026 . [Paper ] [Website ]
Hier-SLAM++: Neuro-Symbolic Semantic SLAM with a Hierarchically Categorical Gaussian Splatting, arXiv 2025 . [Paper ]
MegaSaM: Accurate, Fast, and Robust Structure and Motion from Casual Dynamic Videos, CVPR 2025 . [Paper ] [Website ] [Code ]
SLAM3R: Real-Time Dense Scene Reconstruction from Monocular RGB Videos, CVPR 2025 . [Paper ] [Code ]
MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors, CVPR 2025 . [Paper ] [Website ] [Code ]
VGGT-SLAM: Dense RGB SLAM Optimized on the SL(4) Manifold, arXiv 2025 . [Paper ] [Code ]
Outdoor Monocular SLAM with Global Scale-Consistent 3D Gaussian Pointmaps, ICCV 2025 . [Paper ] [Website ] [Code ]
VGGT-Long: Chunk it, Loop it, Align it – Pushing VGGT’s Limits on Kilometer-scale Long RGB Sequences, arXiv 2025 . [Paper ] [Code ]
Pseudo Depth Meets Gaussian: A Feed-forward RGB SLAM Baseline, IROS 2025 . [Paper ] [Code ]
3D Foundation Model-Based Loop Closing for Decentralized Collaborative SLAM, RAL 2025 . [Paper ]
ViSTA-SLAM: Visual SLAM with Symmetric Two-view Association, 3DV 2026 . [Paper ] [Code ]
SLAM-Former: Putting SLAM into One Transformer, arXiv 2025 . [Paper ] [Website ] [Code ]
MASt3R-Fusion: Integrating Feed-Forward Visual Model with IMU, GNSS for High-Functionality SLAM, arXiv 2025 . [Paper ] [Code ]
GRS-SLAM3R: Real-Time Dense SLAM with Gated Recurrent State, arXiv 2025 . [Paper ]
EC3R-SLAM: Efficient and Consistent Monocular Dense SLAM with Feed-Forward 3D Reconstruction, arXiv 2025 . [Paper ] [Website ] [Code ]
Visual Odometry with Transformers, arXiv 2025 . [Paper ] [Website ] [Code ]
ARTDECO: Towards Efficient and High-Fidelity On-the-Fly 3D Reconstruction with Structured Scene Representation, arXiv 2025 . [Paper ] [Website ] [Code ]
MASt3R-GS: Bridging 3D Reconstruction Priors with Gaussian Splatting for Real-Time Dense SLAM, IROSw 2025 . [Paper ]
LiDAR-VGGT: Cross-Modal Coarse-to-Fine Fusion for Globally Consistent and Metric-Scale Dense Mapping, arXiv 2025 . [Paper ]
Building temporally coherent 3D maps with VGGT for memory-efficient Semantic SLAM, arXiv 2025 . [Paper ]
SING3R-SLAM: Submap-based Indoor Monocular Gaussian SLAM with 3D Reconstruction Prior, arXiv 2025 . [Paper ]
KM-ViPE: Online Tightly Coupled Vision-Language-Geometry Fusion for Open-Vocabulary Semantic SLAM, arXiv 2025 . [Paper ] [Code ]
Dynamic Visual SLAM using a General 3D Prior, arXiv 2025 . [Paper ] [Code ]
OpenMonoGS-SLAM: Monocular Gaussian Splatting SLAM with Open-set Semantics, arXiv 2025 . [Paper ]
Keyframe-Based Feed-Forward Visual Odometry, arXiv 2026 . [Paper ]
VGGT-SLAM 2.0: Real time Dense Feed-forward Scene Reconstruction, arXiv 2026 . [Paper ]
VGGT-Motion: Motion-Aware Calibration-Free Monocular SLAM for Long-Range Consistency, arXiv 2026 . [Paper ]
VGGT-based online 3D semantic SLAM for indoor scene understanding and navigation, arXiv 2026 . [Paper ]
VGGT-Geo: Probabilistic Geometric Fusion of Visual Geometry Grounded Transformer Priors for Robust Dense Indoor SLAM, ISPRS International Journal of Geo-Information 2026 . [Paper ]
IRIS-SLAM: Unified Geo-Instance Representations for Robust Semantic Localization and Mapping, arXiv 2026 . [Paper ]
AIM-SLAM: Dense Monocular SLAM via Adaptive and Informative Multi-View Keyframe Prioritization with Foundation Model, arXiv 2026 . [Paper ] [Website ]
Calib3R: A 3D Foundation Model for Multi-Camera to Robot Calibration and 3D Metric-Scaled Scene Reconstruction, arXiv 2025 . [Paper ]
Loop Closure Detection & Place Recognition
Multi-modal Loop Closure Detection with Foundation Models in Severely Unstructured Environments, arXiv 2025 . [Paper ]
VGGT-MPR: VGGT-Enhanced Multimodal Place Recognition in Autonomous Driving Environments, arXiv 2026 . [Paper ]
UniPR-3D: Towards Universal Visual Place Recognition with Visual Geometry Grounded Transformer, arXiv 2025 . [Paper ] [Code ]
Reloc-VGGT: Visual Re-localization with Geometry Grounded Transformer, arXiv 2025 . [Paper ] [Code ]
SpatialLM: Training Large Language Models for Structured Indoor Modeling, NeurIPS 2025 . [Paper ] [Website ] [Code ]
About
This is a list of relevant papers for 3D Geometric Foundation Models and Applications.