Point Tree Transformer for Point Cloud Registration

Meiling Wang, Guangyan Chen, Yi Yang, Li Yuan, Yufeng Yue
2024-06-25
Abstract: Point cloud registration is a fundamental task in the fields of computer vision and robotics. Recent developments in transformer-based methods have demonstrated enhanced performance in this domain. However, the standard attention mechanism utilized in these methods often integrates many low-relevance points, thereby struggling to prioritize its attention weights on sparse yet meaningful points. This inefficiency leads to limited local structure modeling capabilities and quadratic computational complexity. To overcome these limitations, we propose the Point Tree Transformer (PTT), a novel transformer-based approach for point cloud registration that efficiently extracts comprehensive local and global features while maintaining linear computational complexity. The PTT constructs hierarchical feature trees from point clouds in a coarse-to-dense manner, and introduces a novel Point Tree Attention (PTA) mechanism, which follows the tree structure to facilitate the progressive convergence of attended regions towards salient points. Specifically, each tree layer selectively identifies a subset of key points with the highest attention scores. Subsequent layers focus attention on areas of significant relevance, derived from the child points of the selected point set. The feature extraction process additionally incorporates coarse point features that capture high-level semantic information, thus facilitating local structure modeling and the progressive integration of multiscale information. Consequently, PTA empowers the model to concentrate on crucial local structures and derive detailed local information while maintaining linear computational complexity. Extensive experiments conducted on the 3DMatch, ModelNet40, and KITTI datasets demonstrate that our method achieves superior performance over the state-of-the-art methods.
Computer Vision and Pattern Recognition, Robotics
What problem does this paper attempt to address?
The paper addresses point cloud registration, a fundamental task in computer vision and robotics. Specifically, it targets the following issues:

1. **Limitations of existing methods**: Traditional methods such as Iterative Closest Point (ICP) are prone to getting stuck in local optima. Learning-based methods extract features with neural networks to establish correspondences between point clouds, but they still struggle to model structure across the two point clouds. In addition, standard attention mechanisms integrate many low-relevance points and struggle to concentrate their weights on sparse but meaningful points, which results in limited local structure modeling and quadratic computational complexity.
2. **Proposing a novel transformer model**: To address these issues, the authors propose the Point Tree Transformer (PTT). PTT builds hierarchical feature trees from the point clouds in a coarse-to-dense manner and extracts local and global features while maintaining linear computational complexity. Its key component, the Point Tree Attention (PTA) mechanism, follows the tree structure to focus dynamically on important local structures, improving both the efficiency and the quality of feature extraction.
3. **Optimizing the attention mechanism**: Existing local attention mechanisms typically rely on predefined patterns, which limits their applicability and effectiveness in cross-point-cloud scenarios. PTA instead avoids integrating low-relevance points by progressively narrowing the attended region, so it can also be used as cross-attention between point clouds without predefined patterns.
4. **Experimental validation**: Extensive experiments on the 3DMatch, ModelNet40, and KITTI datasets show that PTT outperforms existing state-of-the-art (SOTA) methods. According to the results, PTT improves registration accuracy while remaining computationally efficient.

In summary, this paper aims to overcome the limitations of current point cloud registration methods, particularly in local structure modeling and computational efficiency, by introducing a new transformer architecture, the Point Tree Transformer.
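The coarse-to-fine selection that PTA is described as performing can be illustrated with a small two-layer NumPy sketch. This is not the authors' implementation: the function names (`tree_attention`, `pool_children`), the parameters `branch` and `top_k`, the grouping of consecutive points into one parent, and the mean pooling of child features are all assumptions made for illustration, and the point counts must be divisible by `branch`. Each coarse query scores all coarse keys, keeps the `top_k` highest, and its child points then attend only to the children of those selected keys.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def pool_children(feats, branch):
    """Coarse features: mean over each group of `branch` consecutive points.

    Assumption: consecutive rows share a parent. A real tree would group
    points spatially (for example by voxels).
    """
    n, d = feats.shape
    return feats.reshape(n // branch, branch, d).mean(axis=1)


def tree_attention(q_dense, k_dense, v_dense, branch=4, top_k=2):
    """Two-layer sketch of tree-guided attention (hypothetical helper).

    Returns the attended features and, per dense query, the indices of the
    dense keys it was allowed to attend to.
    """
    d = q_dense.shape[1]
    q_coarse = pool_children(q_dense, branch)
    k_coarse = pool_children(k_dense, branch)

    # Coarse layer: score every coarse query against every coarse key and
    # keep the top_k keys with the highest scores for each query.
    coarse_scores = q_coarse @ k_coarse.T / np.sqrt(d)
    top = np.argsort(-coarse_scores, axis=1)[:, :top_k]

    # Expand each selected coarse key into the indices of its child points.
    child_idx = (top[:, :, None] * branch + np.arange(branch)).reshape(top.shape[0], -1)

    # Dense layer: each dense query inherits the candidate set of its parent
    # and attends only to those top_k * branch keys.
    parent = np.arange(q_dense.shape[0]) // branch
    cand = child_idx[parent]
    k_sel = k_dense[cand]
    v_sel = v_dense[cand]
    scores = np.einsum("nd,ncd->nc", q_dense, k_sel) / np.sqrt(d)
    weights = softmax(scores, axis=1)
    out = np.einsum("nc,ncd->nd", weights, v_sel)
    return out, cand


rng = np.random.default_rng(0)
q = rng.normal(size=(16, 8))
k = rng.normal(size=(24, 8))
v = rng.normal(size=(24, 8))
out, cand = tree_attention(q, k, v, branch=4, top_k=2)
print(out.shape, cand.shape)
```

In this sketch each dense query compares against `top_k * branch` keys rather than all of them, which is the source of the savings. The coarse layer here is still a full comparison between coarse nodes; the linear complexity reported in the paper relies on its multi-layer tree, which this two-layer example does not reproduce. Setting `top_k` to the total number of coarse keys recovers ordinary full attention.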