NeSLAM: Neural Implicit Mapping and Self-Supervised Feature Tracking With Depth Completion and Denoising

Tianchen Deng, Yanbo Wang, Hongle Xie, Hesheng Wang, Jingchuan Wang, Danwei Wang, Weidong Chen
2024-03-29
Abstract: In recent years, there have been significant advancements in 3D reconstruction and dense RGB-D SLAM systems. One notable development is the application of Neural Radiance Fields (NeRF) in these systems, which utilizes implicit neural representation to encode 3D scenes. This extension of NeRF to SLAM has shown promising results. However, the depth images obtained from consumer-grade RGB-D sensors are often sparse and noisy, which poses significant challenges for 3D reconstruction and affects the accuracy of the representation of the scene geometry. Moreover, the original hierarchical feature grid with occupancy value is inaccurate for scene geometry representation. Furthermore, the existing methods select random pixels for camera tracking, which leads to inaccurate localization and is not robust in real-world indoor environments. To this end, we present NeSLAM, an advanced framework that achieves accurate and dense depth estimation, robust camera tracking, and realistic synthesis of novel views. First, a depth completion and denoising network is designed to provide dense geometry prior and guide the neural implicit representation optimization. Second, the occupancy scene representation is replaced with Signed Distance Field (SDF) hierarchical scene representation for high-quality reconstruction and view synthesis. Furthermore, we also propose a NeRF-based self-supervised feature tracking algorithm for robust real-time tracking. Experiments on various indoor datasets demonstrate the effectiveness and accuracy of the system in reconstruction, tracking quality, and novel view synthesis.
Computer Vision and Pattern Recognition, Robotics
What problem does this paper attempt to address?
This paper addresses two main problems:

1. **Sparsity and Noise in Depth Images**: Depth images from consumer-grade RGB-D sensors are usually sparse and noisy. This makes neural implicit mapping difficult and degrades both the accuracy of 3D reconstruction and the quality of the scene geometry representation. The paper proposes a depth completion and denoising network that produces dense, accurate depth images together with depth uncertainty images, which strengthens the geometric representation and improves overall system performance.
2. **Robustness and Accuracy of Camera Tracking in Real Indoor Environments**: Existing methods select pixels for camera tracking at random, which leads to inaccurate and unstable tracking in complex real-world indoor environments. The paper proposes a NeRF-based self-supervised feature tracking algorithm to achieve accurate and robust camera tracking in large, complex indoor environments.

To address these two problems, the paper presents the NeSLAM system, which achieves accurate depth estimation, robust camera tracking, and high-quality novel view synthesis. Its main contributions are:

- **A New Dense Visual SLAM System**: Built on a hierarchical implicit scene representation, the system is scalable, predictive, and robust in complex indoor environments. It is an end-to-end, incrementally optimized method that supports the generation of realistic novel views and accurate 3D meshes.
- **Depth Completion and Denoising Network**: A depth completion and denoising network generates dense, accurate depth images and the corresponding depth uncertainty images. This geometric prior information guides the point sampling process and improves geometric consistency.
- **A NeRF-Based Self-Supervised Feature Tracking Method**: A NeRF-based self-supervised feature tracking method is proposed for accurate and robust camera tracking in large, complex indoor environments. Experimental results demonstrate its effectiveness and robustness.

With these contributions, NeSLAM demonstrates superior performance in reconstruction, tracking quality, and novel view synthesis in experiments on multiple indoor datasets.
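As a rough illustration of how a completed depth image and its uncertainty image could guide point sampling along camera rays, the NumPy sketch below combines stratified samples over the full ray with extra samples concentrated around the predicted depth. This is an assumption-laden sketch and not the paper's implementation: the function name `sample_ray_depths`, the Gaussian sampling scheme, and the default values for `near`, `far`, `n_uniform`, and `n_surface` are all hypothetical choices made for the example.

```python
import numpy as np


def sample_ray_depths(depth, sigma, near=0.1, far=8.0,
                      n_uniform=16, n_surface=8, rng=None):
    """Return sorted sample depths of shape (R, n_uniform + n_surface).

    depth, sigma: arrays of shape (R,) holding the completed depth and its
    predicted uncertainty (treated here as a standard deviation) for R rays.
    All parameter names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    depth = np.asarray(depth, dtype=np.float64)
    sigma = np.asarray(sigma, dtype=np.float64)
    n_rays = depth.shape[0]

    # Stratified samples: one jittered sample per equal-width bin in [near, far].
    edges = np.linspace(near, far, n_uniform + 1)
    lower, upper = edges[:-1], edges[1:]
    jitter = rng.random((n_rays, n_uniform))
    uniform = lower + (upper - lower) * jitter

    # Surface samples: Gaussian around the depth prior, spread more widely
    # on rays where the prior is less certain.
    noise = rng.standard_normal((n_rays, n_surface))
    surface = depth[:, None] + sigma[:, None] * noise
    surface = np.clip(surface, near, far)

    # Merge both sets and sort along each ray for volume rendering.
    return np.sort(np.concatenate([uniform, surface], axis=1), axis=1)


# Example: two rays, the first with a confident depth prior, the second less so.
z_vals = sample_ray_depths(np.array([2.0, 5.0]), np.array([0.05, 0.5]),
                           rng=np.random.default_rng(0))
```

With this scheme, a ray with low predicted uncertainty places its extra samples tightly around the estimated surface, while an uncertain ray spreads them over a wider interval, so the depth prior narrows the search only where it can be trusted.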