End-to-end point cloud registration with transformer

Yong Wang, Pengbo Zhou, Guohua Geng, Li An, Qi Zhang
DOI: https://doi.org/10.1007/s10462-024-10985-y
IF: 9.588
2024-11-28
Artificial Intelligence Review
Abstract: With the widespread use of large-scale 3D point cloud data in real-world scenarios, efficient and accurate point cloud registration has become a crucial challenge. We propose an end-to-end point cloud registration method based on the Transformer architecture. The method targets registration under low overlap and in large scenes, and it is both general and efficient. Within the Transformer, we combine dynamic position encoding with ternary angular position encoding, which strengthens the representation of point cloud data and improves generality, making the method better suited to large-scene registration. To enhance the learning capacity of the attention mechanism, we employ an improved cross-attention mechanism that multiplies the softmax output by adaptive weights, enabling the model to capture key information in the point cloud more accurately. In the decoding stage, we introduce a multi-scale feature fusion approach that exploits the multi-layer information in point cloud data, further improving registration accuracy and robustness. Fusing multi-scale features mitigates information loss and handles matching between point clouds of different sizes. Experiments on multiple datasets, including 3DMatch, ModelNet, KITTI, and MVP-RG, show that our method performs well on low-overlap and large-scene registration tasks.
computer science, artificial intelligence
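
The abstract describes a cross-attention step in which the softmax output is multiplied by adaptive weights. The following is a minimal NumPy sketch of that idea, not the authors' implementation. The abstract does not say how the adaptive weights are computed, so this sketch assumes a per-key sigmoid gate produced by a hypothetical learned vector `w_gate`, followed by a renormalization step that is also an assumption. All names (`weighted_cross_attention`, `Wq`, `Wk`, `Wv`, `w_gate`) are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def weighted_cross_attention(src, tgt, Wq, Wk, Wv, w_gate):
    """Cross-attention from source features (N, d) to target features (M, d).

    The softmax attention map is multiplied element-wise by adaptive
    weights. Here the weights are an assumed per-key sigmoid gate.
    """
    q = src @ Wq                      # queries from the source cloud, (N, d_k)
    k = tgt @ Wk                      # keys from the target cloud,    (M, d_k)
    v = tgt @ Wv                      # values from the target cloud,  (M, d_v)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)   # standard attention map, (N, M)

    # Assumption: one adaptive weight per target point, in (0, 1).
    gate = 1.0 / (1.0 + np.exp(-(tgt @ w_gate)))   # (M,)
    attn = attn * gate[None, :]

    # Assumption: renormalize so each row is again a distribution.
    attn = attn / attn.sum(axis=-1, keepdims=True)
    return attn @ v, attn

# Usage with random features standing in for per-point descriptors.
rng = np.random.default_rng(0)
d = 8
src = rng.normal(size=(5, d))   # 5 source points
tgt = rng.normal(size=(7, d))   # 7 target points
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
w_gate = rng.normal(size=d)
out, attn = weighted_cross_attention(src, tgt, Wq, Wk, Wv, w_gate)
```

In this sketch the gate lets the model down-weight target points that are unlikely to lie in the overlap region, which is one plausible reading of "adaptive weights" in a low-overlap setting.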