DG-SLAM: Robust Dynamic Gaussian Splatting SLAM with Hybrid Pose Optimization

Yueming Xu, Haochen Jiang, Zhongyang Xiao, Jianfeng Feng, Li Zhang
2024-11-13
Abstract: Achieving robust and precise pose estimation in dynamic scenes is a significant research challenge in Visual Simultaneous Localization and Mapping (SLAM). Recent advancements integrating Gaussian Splatting into SLAM systems have proven effective in creating high-quality renderings using explicit 3D Gaussian models, significantly improving environmental reconstruction fidelity. However, these approaches depend on a static environment assumption and face challenges in dynamic environments due to inconsistent observations of geometry and photometry. To address this problem, we propose DG-SLAM, the first robust dynamic visual SLAM system grounded in 3D Gaussians, which provides precise camera pose estimation alongside high-fidelity reconstructions. Specifically, we propose effective strategies, including motion mask generation, adaptive Gaussian point management, and a hybrid camera tracking algorithm to improve the accuracy and robustness of pose estimation. Extensive experiments demonstrate that DG-SLAM delivers state-of-the-art performance in camera pose estimation, map reconstruction, and novel-view synthesis in dynamic scenes, outperforming existing methods while preserving real-time rendering ability.
Robotics
What problem does this paper attempt to address?
### Problems Addressed by the Paper

This paper aims to address the robustness and accuracy issues in Visual Simultaneous Localization and Mapping (Visual SLAM) in dynamic scenes. Specifically, traditional SLAM systems usually assume that the environment is static, which limits their performance in practical applications, especially in environments with dynamic objects. These dynamic objects cause inconsistencies in geometric and photometric observations, affecting the accuracy of camera pose estimation.

### Main Contributions

1. **Proposed DG-SLAM**: This is the first robust dynamic visual SLAM system based on 3D Gaussian points, capable of real-time rendering and high-fidelity reconstruction.
2. **Advanced Motion Mask Generation Strategy**: By combining spatiotemporal consistency depth masks and semantic priors, the accuracy of dynamic object segmentation is significantly improved.
3. **Hybrid Camera Tracking Strategy**: Utilizing a coarse-to-fine pose optimization algorithm, it enhances the consistency and accuracy between estimated poses and reconstructed maps.
4. **Adaptive Gaussian Point Addition and Pruning Strategy**: Ensures the integrity of geometric structures, supporting accurate camera tracking.
5. **Extensive Experimental Validation**: Conducted numerous experiments on two challenging dynamic datasets and a common static dataset, demonstrating the advanced performance of the system.

### Solutions

- **3D Gaussian Point Representation**: Represents the scene as a set of 3D Gaussian ellipsoids, each containing geometric and appearance attributes.
- **Motion Mask Generation**: Generates accurate motion masks through depth warping operations and semantic segmentation, filtering out dynamic regions.
- **Coarse-to-Fine Camera Tracking**: Uses DROID-SLAM's visual odometry (VO) to provide initial pose estimates and further refines them through a hybrid optimization algorithm.
- **Adaptive Gaussian Point Management**: Dynamically adjusts the density of Gaussian points, ensuring the efficiency and accuracy of scene representation.
- **Dense Bundle Adjustment**: Performs dense bundle adjustment after tracking to eliminate accumulated errors.

### Experimental Results

- **Quantitative Evaluation**: In multiple dynamic scene sequences, DG-SLAM outperforms existing methods in terms of pose estimation accuracy and map reconstruction integrity.
- **Qualitative Evaluation**: The generated static maps can be rendered with high fidelity, demonstrating the effectiveness of the method.

### Conclusion

The proposed DG-SLAM system achieves robust and accurate camera pose estimation and high-quality map reconstruction in dynamic environments, providing strong support for autonomous navigation of mobile robots in complex dynamic environments.
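
To make the Motion Mask Generation item under Solutions more concrete, the following is a minimal sketch of how a depth-warping check could be combined with a semantic prior. It is not the authors' implementation. The function names `depth_warp_mask` and `motion_mask`, the `thresh` parameter, the use of a single frame pair, nearest-neighbor depth sampling, and the logical-OR combination are all assumptions made for illustration; the summary above does not specify these details.

```python
import numpy as np


def depth_warp_mask(depth_i, depth_j, K, T_ji, thresh=0.1):
    """Flag pixels of frame i whose warped depth disagrees with frame j.

    depth_i, depth_j: (h, w) float depth maps, 0 marks missing depth.
    K: (3, 3) pinhole intrinsics. T_ji: (4, 4) transform from camera i to j.
    thresh: hypothetical depth-residual threshold in depth units.
    """
    h, w = depth_i.shape
    v, u = np.mgrid[0:h, 0:w]
    z = depth_i
    # Back-project frame-i pixels into 3D points in camera i.
    x = (u - K[0, 2]) / K[0, 0] * z
    y = (v - K[1, 2]) / K[1, 1] * z
    pts_i = np.stack([x, y, z, np.ones_like(z)], axis=-1)
    # Express the points in camera j and project them onto its image.
    pts_j = pts_i @ T_ji.T
    zj = pts_j[..., 2]
    valid = (z > 0) & (zj > 1e-6)
    safe_z = np.where(valid, zj, 1.0)
    uj = np.rint(K[0, 0] * pts_j[..., 0] / safe_z + K[0, 2]).astype(int)
    vj = np.rint(K[1, 1] * pts_j[..., 1] / safe_z + K[1, 2]).astype(int)
    valid &= (uj >= 0) & (uj < w) & (vj >= 0) & (vj < h)
    # Nearest-neighbor lookup of the observed depth in frame j.
    sampled = np.zeros_like(depth_i)
    sampled[valid] = depth_j[vj[valid], uj[valid]]
    valid &= sampled > 0
    # A large residual means the geometry changed between the two frames.
    return valid & (np.abs(zj - sampled) > thresh)


def motion_mask(depth_i, depth_j, K, T_ji, semantic_mask, thresh=0.1):
    """Union of the depth-warp mask and a boolean semantic prior (assumed rule)."""
    return depth_warp_mask(depth_i, depth_j, K, T_ji, thresh) | semantic_mask


# Toy example: a flat wall at 2 m, with a 2x2 object appearing at 1 m in frame j.
K = np.array([[10.0, 0.0, 4.0], [0.0, 10.0, 4.0], [0.0, 0.0, 1.0]])
depth_i = np.full((8, 8), 2.0)
depth_j = depth_i.copy()
depth_j[2:4, 2:4] = 1.0
semantic = np.zeros((8, 8), dtype=bool)
semantic[0, 0] = True  # pretend a segmentation network flagged this pixel

warp_only = depth_warp_mask(depth_i, depth_j, K, np.eye(4))
combined = motion_mask(depth_i, depth_j, K, np.eye(4), semantic)
```

In the toy example the camera does not move, so only the pixels covered by the new object produce a depth residual, and the semantic prior adds one further pixel. A real system would likely aggregate such checks over several frames and handle occlusions more carefully than this sketch does.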