SAR-SLAM: Self-Attentive Rendering-based SLAM with Neural Point Cloud Encoding

Xudong Lv, Zhiwei He, Yuxiang Yang, Jiahao Nie, Jing Zhang
DOI: https://doi.org/10.1145/3664647.3680696
2024-01-01
Abstract: Neural implicit representations have recently revolutionized simultaneous localization and mapping (SLAM), giving rise to a groundbreaking paradigm known as NeRF-based SLAM. However, existing methods often fall short in accurately estimating poses and reconstructing scenes. This limitation largely stems from their reliance on volume rendering techniques, which oversimplify the modeling process. In this paper, we introduce a novel neural implicit SLAM system named SAR-SLAM to address these shortcomings. Our approach reconstructs Neural Radiance Fields (NeRFs) using a self-attentive architecture and represents scenes through neural point cloud encoding. Unlike previous NeRF-based SLAM methods, which depend on traditional volume rendering equations for scene representation and view synthesis, our method employs a self-attentive rendering framework with the Transformer architecture during the mapping and tracking stages. To enable incremental mapping, we anchor scene features within a neural point cloud, striking a balance between estimation accuracy and computational cost. Experimental results on three challenging datasets show the superior performance and robustness of SAR-SLAM compared to recent NeRF-based SLAM systems. The code will be released.
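To make the contrast in the abstract concrete, the toy sketch below places the classic NeRF volume rendering quadrature next to a simple self-attention aggregation over the samples along one ray. It is not the SAR-SLAM architecture, which the abstract does not specify: the function names, the use of raw features as queries and keys with no learned projections, and the mean pooling step are all assumptions made for illustration.

```python
import math


def volume_render(sigmas, deltas, colors):
    """Classic NeRF quadrature: C = sum_i T_i * (1 - exp(-sigma_i * delta_i)) * c_i."""
    transmittance = 1.0  # T_i: fraction of light that reaches sample i
    pixel = [0.0, 0.0, 0.0]
    for sigma, delta, color in zip(sigmas, deltas, colors):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this ray segment
        weight = transmittance * alpha
        pixel = [p + weight * c for p, c in zip(pixel, color)]
        transmittance *= 1.0 - alpha
    return pixel


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_render(features, colors):
    """Hypothetical stand-in for a learned renderer (assumption, not the paper's model).

    Each ray sample attends to every sample on the ray using scaled dot-product
    attention, with the raw features as queries and keys and the colors as values.
    The attended colors are then mean-pooled into one pixel color.
    """
    dim = len(features[0])
    attended = []
    for query in features:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in features
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * c[ch] for w, c in zip(weights, colors)) for ch in range(3)]
        )
    count = len(attended)
    return [sum(a[ch] for a in attended) / count for ch in range(3)]


if __name__ == "__main__":
    # Three samples along one ray; all values are made up for the demo.
    sigmas = [0.1, 2.0, 0.5]
    deltas = [0.5, 0.5, 0.5]
    colors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    features = [[0.2, 0.1], [1.0, 0.9], [0.3, 0.4]]
    print(volume_render(sigmas, deltas, colors))
    print(self_attention_render(features, colors))
```

In the first function the blending weights are fixed by the density and spacing of the samples. In the second they come from feature similarity, which is the kind of dependence a Transformer-based renderer could learn from data instead of deriving from a physical model.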