Deep Interactive Full Transformer Framework for Point Cloud Registration.

Guangyan Chen,Meiling Wang,Qingxiang Mang,Li Yuan,Tong Liu,Yufeng Yue
DOI: https://doi.org/10.1109/icra48891.2023.10160863
2023-01-01
Abstract:Point cloud registration is a crucial technology in the fields of robotics and computer vision. Despite the significant advances in point cloud registration enabled by Transformer-based methods, limitations persist due to indistinct feature extraction, noise sensitivity, and outlier handling. These limitations stem from three factors: (1) the inefficiency of convolutional neural networks (CNNs) to capture global relationships due to their local receptive fields, resulting in extracted features susceptible to noise; (2) the shallow-wide architecture of Transformers, coupled with a lack of positional information, leading to inefficient information interaction and indistinct feature extraction; and (3) the omission of geometrical compatibility leads to ambiguous identification of incorrect correspondences. To overcome these limitations, we propose the Deep Interactive Full Transformer (DIFT) network for point cloud registration, which consists of three key components: (1) a Point Cloud Structure Extractor (PSE) for modeling global relationships and retrieving structural information; (2) a Point Feature Transformer (PFT) for establishing comprehensive associations and directly learning the relative positions between points; and (3) a Geometric Matching-based Correspondence Confidence Evaluation (GMCCE) method for measuring spatial consistency and estimating correspondence confidence. Experimental results on ModelNet40 and 3DMatch datasets demonstrate the superior performance of our proposed method compared to existing state-of-the-art methods. The code for our method is publicly available at https://github.com/CGuangyan-BIT/DIFT.
What problem does this paper attempt to address?