Abstract: Simultaneous localization and mapping (SLAM) has achieved impressive performance in static environments. However, SLAM in dynamic environments remains an open problem. Many methods directly filter out dynamic objects, resulting in incomplete scene reconstruction and limited camera localization accuracy. Other works represent dynamic objects with point clouds, sparse joints, or coarse meshes, which fail to provide a photo-realistic representation. To overcome these limitations, we propose a photo-realistic and geometry-aware RGB-D SLAM method by extending Gaussian splatting. Our method is composed of three main modules to 1) map the dynamic foreground, including non-rigid humans and rigid items, 2) reconstruct the static background, and 3) localize the camera. To map the foreground, we focus on modeling deformations and/or motions. We consider the shape priors of humans and exploit geometric and appearance constraints of humans and items. For background mapping, we design an optimization strategy between neighboring local maps by integrating an appearance constraint into geometric alignment. For camera localization, we leverage both the static background and the dynamic foreground to increase the number of observations for noise compensation. We explore the geometric and appearance constraints by associating 3D Gaussians with 2D optical flows and pixel patches. Experiments on various real-world datasets demonstrate that our method outperforms state-of-the-art approaches in terms of camera localization and scene representation. Source code will be made publicly available upon paper acceptance.

What problem does this paper attempt to address?

This paper addresses Simultaneous Localization and Mapping (SLAM) in dynamic environments. Traditional SLAM methods perform well in static environments but struggle in dynamic ones. The main problems are:

1. **Incomplete Scene Reconstruction**: Many existing methods directly filter out dynamic objects, so these objects are never reconstructed and the scene remains incomplete.
2. **Limited Camera Localization Accuracy**: Discarding dynamic objects removes observations, which degrades camera localization accuracy, especially when dynamic objects dominate the image.
3. **Lack of Realistic Representation**: Existing SLAM methods usually represent dynamic objects with point clouds, sparse joints, or coarse meshes, none of which is photo-realistic.

To solve these problems, the paper proposes PG-SLAM, an RGB-D SLAM method based on Gaussian Splatting. The method aims to:

- **Reconstruct Dynamic Foreground**: Map non-rigid human bodies and rigid items, using shape priors together with geometric and appearance constraints.
- **Reconstruct Static Background**: Optimize the Gaussians in the local map through multi-view appearance constraints to ensure accurate reconstruction of the background.
- **Localize the Camera**: Use information from both the static background and the dynamic foreground, combining geometric and appearance constraints, to improve camera localization accuracy.

### Main Contributions

1. **Propose the first Gaussian Splatting-based SLAM method that handles dynamic humans and items**: It not only localizes the camera and reconstructs the static background, but also maps dynamic human bodies and items.
2. **Provide a photo-realistic representation of dynamic scenes**: For foreground mapping, the human body shape prior is considered, and geometric and appearance constraints are exploited; for background mapping, an optimization strategy between neighboring local maps is designed.
3. **Combine geometric and appearance constraints for camera localization**: By associating 3D Gaussians with 2D optical flows and pixel patches, the method uses both static background and dynamic foreground observations to compensate for noise, improving localization accuracy.

Experimental results show that the method outperforms existing state-of-the-art methods on multiple real-world datasets in both camera localization and scene representation.
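
To make the idea of associating 3D Gaussians with 2D optical flow more concrete, the following is a minimal sketch of one possible geometric residual. It is not the paper's implementation: it assumes static Gaussians, a pinhole camera, and world-to-camera poses given as `(R, t)`, and the function names `project` and `flow_residual` are hypothetical. The sketch projects the Gaussian centers into two frames and compares the induced pixel motion with an observed optical flow.

```python
import numpy as np


def project(points_world, R, t, K):
    """Project N x 3 world points to N x 2 pixel coordinates.

    Assumes a pinhole camera with intrinsics K and a world-to-camera
    pose (R, t), i.e. x_cam = R @ x_world + t.
    """
    points_cam = points_world @ R.T + t
    uvw = points_cam @ K.T
    return uvw[:, :2] / uvw[:, 2:3]


def flow_residual(centers, pose_prev, pose_curr, K, observed_flow):
    """Difference between the flow induced by the poses and the observed flow.

    centers: N x 3 centers of (assumed static) 3D Gaussians.
    observed_flow: N x 2 optical flow sampled at the previous-frame pixels.
    """
    uv_prev = project(centers, pose_prev[0], pose_prev[1], K)
    uv_curr = project(centers, pose_curr[0], pose_curr[1], K)
    return (uv_curr - uv_prev) - observed_flow


# Toy example: two Gaussian centers and a camera shifted by 0.1 along x.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
centers = np.array([[0.0, 0.0, 2.0],
                    [0.5, -0.2, 4.0]])
pose_prev = (np.eye(3), np.zeros(3))
pose_curr = (np.eye(3), np.array([0.1, 0.0, 0.0]))
observed_flow = np.array([[25.0, 0.0],
                          [12.5, 0.0]])

residual = flow_residual(centers, pose_prev, pose_curr, K, observed_flow)
cost = float(np.sum(residual ** 2))
```

In a localization setting, a cost of this kind would be minimized over the current pose; for dynamic foreground objects, the Gaussian centers would additionally have to be moved by the object's own motion before projection, which this sketch omits.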

PG-SLAM: Photo-realistic and Geometry-aware RGB-D SLAM in Dynamic Environments

GS3LAM: Gaussian Semantic Splatting SLAM

RGB‐D SLAM with Moving Object Tracking in Dynamic Environments

GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting

DG-SLAM: Robust Dynamic Gaussian Splatting SLAM with Hybrid Pose Optimization

DGS-SLAM: A Fast and Robust RGBD SLAM in Dynamic Environments Combined by Geometric and Semantic Information

DM-SLAM: A Feature-Based SLAM System for Rigid Dynamic Scenes

Photo-SLAM: Real-time Simultaneous Localization and Photorealistic Mapping for Monocular, Stereo, and RGB-D Cameras

Gaussian-LIC: Real-Time Photo-Realistic SLAM with Gaussian Splatting and LiDAR-Inertial-Camera Fusion

PLPD-SLAM: Point-Line-Plane-Based RGB-D SLAM for Dynamic Environments

DLD-SLAM: RGB-D Visual Simultaneous Localisation and Mapping in Indoor Dynamic Environments Based on Deep Learning

Amos-SLAM: an Anti-Dynamics Two-Stage RGB-D SLAM Approach

Robust and Efficient RGB-D SLAM in Dynamic Environments

Gaussian-SLAM: Photo-realistic Dense SLAM with Gaussian Splatting

GMP-SLAM: A Real-Time RGB-D SLAM in Dynamic Environments Using GPU Dynamic Points Detection Method

A Dynamic Scene Vision SLAM Method Incorporating Object Detection and Object Characterization

RGBD GS-ICP SLAM

RGBDS-SLAM: A RGB-D Semantic Dense SLAM Based on 3D Multi Level Pyramid Gaussian Splatting

Development of RGB-D simultaneous localization and mapping in dynamic environments based on motion removal and dense map reconstruction