Abstract: Point cloud registration is an essential technology in computer vision and robotics. Recently, transformer-based methods have achieved advanced performance in point cloud registration by exploiting the transformer's order invariance and its ability to model dependencies when aggregating information. However, they still suffer from indistinct feature extraction and sensitivity to noise and outliers, owing to three major limitations: 1) the adopted CNNs fail to model global relations because of their local receptive fields, so the extracted features are susceptible to noise; 2) the shallow-wide architecture of the transformers and the lack of positional information make information interaction inefficient, which leads to indistinct feature extraction; and 3) insufficient consideration of geometric compatibility makes it difficult to identify incorrect correspondences unambiguously. To address these limitations, a novel full transformer network for point cloud registration is proposed, named the deep interaction transformer (DIT), which incorporates: 1) a point cloud structure extractor (PSE) to retrieve structural information and model global relations with the local feature integrator (LFI) and transformer encoders; 2) a deep-narrow point feature transformer (PFT) to facilitate deep information interaction across a pair of point clouds with positional information, such that the transformers establish comprehensive associations and directly learn the relative position between points; and 3) a geometric matching-based correspondence confidence evaluation (GMCCE) method to measure spatial consistency and estimate correspondence confidence using the designed triangulated descriptor. Extensive experiments on the ModelNet40, ScanObjectNN, and 3DMatch datasets demonstrate that our method precisely aligns point clouds and consequently achieves superior performance compared with state-of-the-art methods.
The code is publicly available at https://github.com/CGuangyan-BIT/DIT.
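To illustrate the idea of geometric compatibility mentioned in the abstract, the following is a minimal sketch of a generic pairwise-distance consistency check. It is not the GMCCE method or its triangulated descriptor, and it is not taken from the DIT repository. It only relies on the fact that a rigid transformation preserves distances between points. The function name `spatial_consistency_confidence`, the threshold `tau`, and the synthetic data in `demo` are all assumptions made for illustration.

```python
import numpy as np


def spatial_consistency_confidence(src, tgt, tau=0.1):
    """Score each putative correspondence (src[i] <-> tgt[i]).

    A rigid transform preserves pairwise distances, so for two correct
    correspondences i and j, |src_i - src_j| should equal |tgt_i - tgt_j|.
    The score of correspondence i is the fraction of other correspondences
    whose distance to i agrees within tau (a hypothetical threshold).
    """
    # Pairwise distance matrices within each point set, shape (N, N).
    d_src = np.linalg.norm(src[:, None, :] - src[None, :, :], axis=-1)
    d_tgt = np.linalg.norm(tgt[:, None, :] - tgt[None, :, :], axis=-1)

    # Two correspondences are compatible if their distances agree.
    compat = np.abs(d_src - d_tgt) < tau
    np.fill_diagonal(compat, False)  # ignore self-compatibility

    return compat.sum(axis=1) / max(len(src) - 1, 1)


def demo():
    """Synthetic example: 19 correct correspondences and 1 gross outlier."""
    rng = np.random.default_rng(0)
    src = rng.uniform(-1.0, 1.0, size=(20, 3))

    # Assumed ground-truth rigid motion: 30 degrees about z plus a shift.
    angle = np.pi / 6
    rot = np.array([
        [np.cos(angle), -np.sin(angle), 0.0],
        [np.sin(angle), np.cos(angle), 0.0],
        [0.0, 0.0, 1.0],
    ])
    tgt = src @ rot.T + np.array([0.5, -0.2, 0.1])

    # Corrupt the first correspondence so it becomes an outlier.
    tgt[0] += np.array([5.0, 5.0, 5.0])

    return spatial_consistency_confidence(src, tgt)


if __name__ == "__main__":
    print(demo())
```

In this synthetic setup the corrupted correspondence agrees with none of the others, while each correct correspondence agrees with every other correct one, so a simple threshold on the score separates them. Real data with noise and partial overlap would need a more careful choice of `tau` and a more robust descriptor.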

PointDKT: Dual-Key Transformer for Point Cloud

3DPCT: 3D Point Cloud Transformer with Dual Self-attention

Point Tree Transformer for Point Cloud Registration

APPT: Asymmetric Parallel Point Transformer for 3D Point Cloud Understanding

PointCAT: Cross-Attention Transformer for point cloud

Point Transformer V3: Simpler, Faster, Stronger

PCT: Point cloud transformer

EGCT: Enhanced Graph Convolutional Transformer for 3D Point Cloud Representation Learning

Deep Interactive Full Transformer Framework for Point Cloud Registration

Fast and Robust Point Cloud Registration with Tree-based Transformer

Adaptive Point Transformer

PointMT: Efficient Point Cloud Analysis with Hybrid MLP-Transformer Architecture

PointTr: Low-Overlap Point Cloud Registration with Transformer

3DCTN: 3D Convolution-Transformer Network for Point Cloud Classification

Full Transformer Framework for Robust Point Cloud Registration With Deep Information Interaction

Point Transformer

PVT: Point-Voxel Transformer for Point Cloud Learning

DcTr: Noise-robust Point Cloud Completion by Dual-Channel Transformer with Cross-Attention

Pyramid Point Cloud Transformer for Large-Scale Place Recognition

Hierarchical local global transformer for point clouds analysis

D2T-Net: A dual-domain transformer network exploiting spatial and channel dimensions for semantic segmentation of urban mobile laser scanning point clouds