RS-TNet: point cloud transformer with relation-shape awareness for fine-grained 3D visual processing

Xu Wang, Yuqiao Zeng, Yi Jin, Yigang Cen, Baifu Liu, Shaohua Wan
DOI: https://doi.org/10.1007/s00500-022-07543-5
IF: 3.732
2022-11-02
Soft Computing
Abstract: A central challenge in point cloud representation is extracting sufficient semantic information while keeping the spatial structure of the sparse point cloud complete. Benefiting from the Transformer network, recent studies have advanced point cloud representation by extracting refined attention features based on global context. However, undesired semantic information is still lost in the feature extraction stage. Hence, this paper proposes a novel architecture for 3D point cloud representation, the Relation-Shape Transformer Network (RS-TNet), which addresses this problem while retaining the merits of the relation-shape embedding mechanism so as to generate rich and robust local semantic features. Specifically, RS-TNet achieves coarse-to-fine semantic information coverage by integrating global multi-head self-attention with a local Relation-Feature extraction module. Moreover, theoretical analysis demonstrates that RS-TNet explicitly introduces the spatial relations of points by learning underlying shapes, so the extracted features are more shape-aware and robust. As a result, the proposed RS-TNet achieves 90.9% class accuracy on ModelNet40 and 85.6% Intersection-over-Union on ShapeNet. Further, ablation experiments verify the effectiveness of RS-TNet in point cloud classification and part segmentation tasks.
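The two reported metrics can be illustrated with a minimal sketch. It assumes that "class accuracy" means per-class accuracy averaged over classes and that Intersection-over-Union is computed per part label and averaged within a shape; the abstract does not state the exact evaluation protocol. The function names and the convention for absent parts are illustrative assumptions, not the authors' code.

```python
def mean_class_accuracy(y_true, y_pred):
    """Average of per-class accuracy over the classes present in y_true."""
    correct, total = {}, {}
    for t, p in zip(y_true, y_pred):
        total[t] = total.get(t, 0) + 1
        correct[t] = correct.get(t, 0) + (1 if t == p else 0)
    return sum(correct[c] / total[c] for c in total) / len(total)


def part_iou(y_true, y_pred, part):
    """IoU for one part label over the points of a single shape."""
    pairs = list(zip(y_true, y_pred))
    inter = sum(1 for t, p in pairs if t == part and p == part)
    union = sum(1 for t, p in pairs if t == part or p == part)
    # Assumed convention: a part absent from both labels and predictions
    # counts as a perfect match.
    return 1.0 if union == 0 else inter / union


def shape_miou(y_true, y_pred, parts):
    """Mean IoU over the given part labels of one shape."""
    return sum(part_iou(y_true, y_pred, k) for k in parts) / len(parts)


if __name__ == "__main__":
    # Toy labels, not data from the paper.
    labels = [0, 0, 1, 1, 1, 2]
    preds = [0, 1, 1, 1, 0, 2]
    print(mean_class_accuracy(labels, preds))
    print(shape_miou(labels, preds, parts=[0, 1, 2]))
```

Averaging per class rather than per sample keeps rare categories from being swamped by frequent ones, which is why this figure can differ from overall accuracy.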
computer science, artificial intelligence, interdisciplinary applications