SplaTAM: Splat, Track & Map 3D Gaussians for Dense RGB-D SLAM

Nikhil Keetha, Jay Karhade, Krishna Murthy Jatavallabhula, Gengshan Yang, Sebastian Scherer, Deva Ramanan, Jonathon Luiten

2024-04-16

Abstract: Dense simultaneous localization and mapping (SLAM) is crucial for robotics and augmented reality applications. However, current methods are often hampered by the non-volumetric or implicit way they represent a scene. This work introduces SplaTAM, an approach that, for the first time, leverages explicit volumetric representations, i.e., 3D Gaussians, to enable high-fidelity reconstruction from a single unposed RGB-D camera, surpassing the capabilities of existing methods. SplaTAM employs a simple online tracking and mapping system tailored to the underlying Gaussian representation. It utilizes a silhouette mask to elegantly capture the presence of scene density. This combination enables several benefits over prior representations, including fast rendering and dense optimization, quickly determining if areas have been previously mapped, and structured map expansion by adding more Gaussians. Extensive experiments show that SplaTAM achieves up to 2x superior performance in camera pose estimation, map construction, and novel-view synthesis over existing methods, paving the way for more immersive high-fidelity SLAM applications.

Computer Vision and Pattern Recognition, Artificial Intelligence, Robotics

What problem does this paper attempt to address?

The paper addresses a limitation of dense Simultaneous Localization and Mapping (SLAM): existing methods rely on non-volumetric or implicit scene representations, which creates performance bottlenecks in complex real-world environments. It proposes SplaTAM, which the authors describe as the first method to use an explicit volumetric representation (3D Gaussians) for high-fidelity reconstruction from a single unposed RGB-D camera. SplaTAM tracks the camera and builds the map by optimizing the Gaussians online through differentiable rendering. Compared to existing explicit and implicit representations, this method has the following advantages:

1. **Fast Rendering and Dense Optimization**: 3D Gaussians can be rendered into images at up to 400 frames per second, significantly faster than implicit alternatives.
2. **Maps with Clear Spatial Extents**: A rendered silhouette mask shows which parts of a view are already covered by the map, so new content in a new view is recognized efficiently and the map is expanded by adding Gaussians only where needed.
3. **Direct Optimization of Scene Parameters**: Because the scene is represented by Gaussians with physical locations, colors, and sizes, the gradient flow between these parameters and the dense photometric loss is nearly linear, which enables rapid optimization.

Experiments on multiple datasets show that SplaTAM achieves up to 2x better performance than existing methods in camera pose estimation, map construction, and novel-view synthesis, paving the way for more immersive high-fidelity SLAM applications.
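The silhouette idea in the summary above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the per-pixel opacity contribution of each Gaussian has already been computed and sorted from near to far, and the function names `composite_pixel` and `unmapped_mask`, as well as the default threshold of 0.5, are hypothetical choices made for this example. Front-to-back alpha compositing yields both a color and a silhouette value (the accumulated opacity); pixels with a low silhouette value are treated as not yet covered by the map.

```python
import numpy as np


def composite_pixel(alphas, colors):
    """Front-to-back alpha compositing for a single pixel.

    alphas: (N,) opacity contribution of each Gaussian at this pixel,
            assumed to be sorted from nearest to farthest.
    colors: (N, 3) RGB color of each Gaussian.
    Returns (rgb, silhouette), where silhouette is the accumulated opacity.
    """
    alphas = np.asarray(alphas, dtype=float)
    colors = np.asarray(colors, dtype=float).reshape(-1, 3)
    # Transmittance reaching Gaussian i: product of (1 - alpha_j) for j < i.
    transmittance = np.concatenate(([1.0], np.cumprod(1.0 - alphas)[:-1]))
    weights = alphas * transmittance
    rgb = weights @ colors
    silhouette = weights.sum()
    return rgb, silhouette


def unmapped_mask(silhouette_image, threshold=0.5):
    """Boolean mask of pixels whose accumulated opacity is below threshold.

    The threshold is an illustrative assumption, not a value from the source.
    """
    return np.asarray(silhouette_image, dtype=float) < threshold


# Two half-opaque Gaussians (red in front of green) cover one pixel.
rgb, sil = composite_pixel([0.5, 0.5], [[1, 0, 0], [0, 1, 0]])
mask = unmapped_mask([[sil, 0.1], [0.0, 0.99]])
```

In this sketch, a pixel with no Gaussians in front of it gets a silhouette of 0 and is flagged as unmapped, which is the kind of signal a system could use to decide where to add new Gaussians.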

GS3LAM: Gaussian Semantic Splatting SLAM

GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting

High-Fidelity SLAM Using Gaussian Splatting with Rendering-Guided Densification and Regularized Optimization

Gaussian-SLAM: Photo-realistic Dense SLAM with Gaussian Splatting

IG-SLAM: Instant Gaussian SLAM

Gaussian Splatting SLAM

RGBD GS-ICP SLAM

PG-SLAM: Photo-realistic and Geometry-aware RGB-D SLAM in Dynamic Environments

Compact 3D Gaussian Splatting For Dense Visual SLAM

MM3DGS SLAM: Multi-modal 3D Gaussian Splatting for SLAM Using Vision, Depth, and Inertial Measurements

FlashSLAM: Accelerated RGB-D SLAM for Real-Time 3D Scene Reconstruction with Gaussian Splatting

LoopSplat: Loop Closure by Registering 3D Gaussian Splats

SemGauss-SLAM: Dense Semantic Gaussian Splatting SLAM

RTG-SLAM: Real-time 3D Reconstruction at Scale using Gaussian Splatting

DG-SLAM: Robust Dynamic Gaussian Splatting SLAM with Hybrid Pose Optimization

MGS-SLAM: Monocular Sparse Tracking and Gaussian Mapping with Depth Smooth Regularization

GLC-SLAM: Gaussian Splatting SLAM with Efficient Loop Closure

Towards Real-Time Gaussian Splatting: Accelerating 3DGS through Photometric SLAM

Visual SLAM with 3D Gaussian Primitives and Depth Priors Enabling Novel View Synthesis