DeepPointMap2: Accurate and Robust LiDAR-Visual SLAM with Neural Descriptors

Xiaze Zhang, Ziheng Ding, Qi Jing, Ying Cheng, Wenchao Ding, Rui Feng
DOI: https://doi.org/10.1145/3664647.3681519
2024-01-01
Abstract: Simultaneous Localization and Mapping (SLAM) plays a pivotal role in autonomous driving and robotics. Existing methods often rely on handcrafted feature extraction and cross-modal fusion techniques, which limits their feature representation capability and reduces their robustness. To address this challenge, we introduce DeepPointMap2, a novel learning-based LiDAR-Visual SLAM architecture that leverages neural descriptors to tackle multiple SLAM sub-tasks in a unified manner. Our approach employs neural networks to extract multi-modal tokens, which are then adaptively fused by the Visual-Point Fusion Module to generate sparse 3D neural descriptors, ensuring precise and robust performance. As a pioneering work, our method achieves state-of-the-art localization performance among various Visual-, LiDAR-, and Visual-LiDAR-based methods on widely used benchmarks, as shown in the experimental results. Furthermore, the approach proves robust in scenarios involving camera failure and LiDAR obstruction.