C2L-PR: Cross-modal Camera-to-LiDAR Place Recognition Via Modality Alignment and Orientation Voting

Huaiyuan Xu, Huaping Liu, Shoudong Huang, Yuxiang Sun
DOI: https://doi.org/10.1109/tiv.2024.3423392
IF: 8.2
2024-01-01
IEEE Transactions on Intelligent Vehicles
Abstract: Place recognition is a fundamental technology for vehicle localization. LiDAR-based methods can work under conditions that change visual appearance, such as season or weather changes and different times of day. However, these methods require every vehicle to be equipped with a 3-D LiDAR during the online localization stage, resulting in high costs for the vehicles. To alleviate this issue, we propose a cross-modal place recognition network, which can localize vehicles with visual images obtained from a low-cost monocular camera against a pre-built LiDAR point-cloud database. To this end, we first bridge the modality gap between visual images and point clouds via modality alignment. Then, we propose an orientation voting module to suppress the recognition ambiguity caused by the inconsistent field-of-view between images and point clouds, thereby improving the place recognition accuracy. Experiments are conducted with three public datasets: KITTI, KITTI-360, and Oxford RobotCar, covering over 71.6 km of vehicle trajectories in 12 urban and suburban regions in two countries. The results demonstrate the superiority of our network.
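The retrieval step implied by the abstract (a camera query matched against a pre-built LiDAR database) can be illustrated with a minimal sketch. It assumes that both modalities have already been encoded into global descriptors living in one shared embedding space, and that matching is done by cosine similarity; the function names, the descriptor size, and the toy data are all hypothetical and are not taken from the paper. The modality alignment and orientation voting modules themselves are not modeled here.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products equal cosine similarity."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def retrieve_places(query_desc, db_descs, top_k=1):
    """Return the indices and scores of the top_k most similar database places.

    query_desc: (D,) descriptor of a camera image (hypothetical encoder output).
    db_descs:   (N, D) descriptors of LiDAR point clouds, one per mapped place.
    """
    q = l2_normalize(np.asarray(query_desc, dtype=np.float64)[None, :])[0]
    db = l2_normalize(np.asarray(db_descs, dtype=np.float64))
    sims = db @ q                       # cosine similarity to every place
    order = np.argsort(-sims)[:top_k]   # best match first
    return order, sims[order]


# Toy data (assumption): 5 database places with 8-dimensional descriptors.
rng = np.random.default_rng(0)
db_descs = rng.normal(size=(5, 8))
# A query that is a slightly perturbed copy of place 3.
query_desc = db_descs[3] + 0.01 * rng.normal(size=8)

idx, scores = retrieve_places(query_desc, db_descs, top_k=2)
print("best matching place:", idx[0])
```

In this sketch the database descriptors would be computed once offline, so only the camera encoder has to run on the vehicle at query time, which matches the cost argument made in the abstract.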