EC-SLAM: Effectively Constrained Neural RGB-D SLAM with Sparse TSDF Encoding and Global Bundle Adjustment

Guanghao Li, Qi Chen, YuXiang Yan, Jian Pu
2024-10-18
Abstract: We introduce EC-SLAM, a real-time dense RGB-D simultaneous localization and mapping (SLAM) system leveraging Neural Radiance Fields (NeRF). While recent NeRF-based SLAM systems have shown promising results, they have yet to fully exploit NeRF's potential to constrain pose optimization. EC-SLAM addresses this by using sparse parametric encodings and Truncated Signed Distance Fields (TSDF) to represent the map, enabling efficient fusion, reducing model parameters, and accelerating convergence. Our system also employs a globally constrained Bundle Adjustment (BA) strategy that capitalizes on NeRF's implicit loop closure correction capability, improving tracking accuracy by reinforcing constraints on keyframes most relevant to the current optimized frame. Furthermore, by integrating a feature-based and uniform sampling strategy that minimizes ineffective constraint points for pose optimization, we reduce the impact of random sampling in NeRF. Extensive evaluations on the Replica, ScanNet, and TUM datasets demonstrate state-of-the-art performance, with precise tracking and reconstruction accuracy achieved alongside real-time operation at up to 21 Hz.
Robotics
What problem does this paper attempt to address?
The problem this paper addresses is that existing NeRF-based RGB-D SLAM systems do not fully exploit NeRF's potential when optimizing poses and reconstructing maps. Specifically, traditional methods perform poorly at creating dense and continuous maps, and in large-scale scenes NeRF-based methods suffer from slow convergence, excessive parameters, and the effect of random sampling on pose optimization.

### Specific description of the problem

1. **Limitations of traditional VSLAM systems**:
   - Although traditional VSLAM systems achieve good localization accuracy, they fall short in creating dense and continuous maps.
   - Dense maps are crucial for advanced applications (such as autonomous driving and virtual reality), so further research is needed to meet these needs.

2. **Challenges of NeRF-based SLAM systems**:
   - Although NeRF-based SLAM systems perform well in real-time dense mapping, their pose-tracking accuracy is lower than that of traditional VSLAM systems.
   - These systems have difficulty constraining poses, especially in large-scale scenes, where training all parameters leads to slow convergence and is prone to catastrophic forgetting.
   - Random pixel sampling affects the accuracy of pose optimization during joint local/global BA and the continuous tracking of new keyframes.

### Solutions of EC-SLAM

To solve the above problems, EC-SLAM introduces a new NeRF-based RGB-D SLAM system with the following improvements:

1. **Sparse parametric encoding combined with TSDF**:
   - The map is represented with sparse parametric encodings and Truncated Signed Distance Fields (TSDF), which enables efficient fusion, reduces model parameters, and accelerates convergence.

2. **Bundle Adjustment (BA) strategy with global constraints**:
   - A globally constrained BA strategy exploits NeRF's implicit loop-closure correction ability and strengthens the constraints on the keyframes most relevant to the frame currently being optimized, improving tracking accuracy.

3. **Robust pixel sampling method**:
   - Feature-point sampling is combined with uniform sampling to reduce the effect of random sampling on pose optimization and to make tracking and reconstruction more stable.

4. **Multi-thread optimization**:
   - Multi-threaded optimization gives fast and accurate map and pose optimization, allowing real-time operation (up to 21 Hz) while maintaining high accuracy.

### Summary

By combining sparse parametric encoding, a TSDF representation, a globally constrained BA strategy, and a robust pixel sampling method, EC-SLAM addresses the deficiencies of existing NeRF-based SLAM systems in pose optimization and map reconstruction, achieving more accurate tracking and high-quality reconstruction.
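
The combined feature-point and uniform pixel sampling described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sample_pixels`, the `feature_ratio` split, and the reading of "uniform" as uniform random draws over the image are all assumptions made for illustration, and the feature points are assumed to come from some external keypoint detector that the summary does not specify.

```python
import random


def sample_pixels(height, width, feature_points, n_total,
                  feature_ratio=0.5, seed=0):
    """Return n_total distinct (row, col) pixels for one frame.

    Hypothetical helper: feature points fill up to `feature_ratio` of the
    budget, and the rest is drawn uniformly at random over the image.
    """
    rng = random.Random(seed)

    # Keep only in-bounds feature points, dropping duplicates but
    # preserving their original order.
    valid = list(dict.fromkeys(
        (r, c) for r, c in feature_points
        if 0 <= r < height and 0 <= c < width
    ))

    # Feature-point share of the sampling budget (assumed split).
    n_feat = min(len(valid), int(n_total * feature_ratio))
    chosen = rng.sample(valid, n_feat)
    taken = set(chosen)

    # Fill the remaining budget with uniformly drawn pixels, skipping
    # any pixel that has already been selected.
    remaining = n_total - n_feat
    if remaining > height * width - len(taken):
        raise ValueError("n_total exceeds the number of available pixels")
    while remaining > 0:
        pixel = (rng.randrange(height), rng.randrange(width))
        if pixel not in taken:
            taken.add(pixel)
            chosen.append(pixel)
            remaining -= 1
    return chosen
```

Under these assumptions, a call such as `sample_pixels(480, 640, keypoints, 2048)` would return the pixels through which rays are cast for one tracking or BA iteration. Reserving part of the budget for feature points is one way to make sure textured, well-constrained regions always contribute to the pose loss, while the uniform portion keeps coverage of the rest of the image.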