Abstract: Recently, graph-based and Transformer-based deep learning networks have demonstrated excellent performance on various point cloud tasks. Most existing graph methods are based on static graphs, which take a fixed input to establish graph relations. Moreover, many graph methods aggregate neighboring features by maximization or averaging, so that either only a single neighboring point affects the feature of the centroid, or all neighboring points have the same influence on the centroid's feature; both cases ignore the correlations and differences between points. Most Transformer-based methods extract point cloud features based on global attention and lack feature learning on local neighbors. To solve the problems of these two types of models, we propose a new feature extraction block named Graph Transformer and construct a 3D point cloud learning network called GTNet to learn features of point clouds on local and global patterns. Graph Transformer integrates the advantages of graph-based and Transformer-based methods, and consists of Local Transformer and Global Transformer modules. Local Transformer uses a dynamic graph to calculate the weights of all neighboring points by intra-domain cross-attention with dynamically updated graph relations, so that every neighboring point can affect the features of the centroid with a different weight; Global Transformer enlarges the receptive field of Local Transformer by global self-attention. In addition, to avoid vanishing gradients caused by the increasing depth of the network, we apply residual connections to centroid features in GTNet; we also use the features of the centroid and its neighbors to generate local geometric descriptors in Local Transformer, which strengthens the local information learning capability of the model. Finally, we apply GTNet to shape classification, part segmentation and semantic segmentation tasks.
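The contrast the abstract draws between max or average aggregation and attention-weighted aggregation over a dynamic graph can be sketched in a few lines of NumPy. This is only an illustrative sketch, not GTNet's actual implementation: the function names (`knn_indices`, `local_attention`, `max_pool_aggregate`) and the projection matrices `w_q`, `w_k`, `w_v` are hypothetical, the attention is a plain scaled dot-product form chosen for brevity, and the local geometric descriptors and Global Transformer are omitted. The graph is "dynamic" in the sense that neighbors are recomputed from the current features each time the function is called.

```python
import numpy as np

def knn_indices(feats, k):
    """Return the k nearest neighbors of each point in the current feature space."""
    sq = np.sum(feats ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * feats @ feats.T  # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                             # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]                     # shape (N, k)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def local_attention(feats, k, w_q, w_k, w_v):
    """Cross-attention from each centroid (query) to its k neighbors (keys/values).

    w_q, w_k, w_v are hypothetical projection matrices (assumptions, not from the paper).
    """
    idx = knn_indices(feats, k)   # graph rebuilt from the current features
    neigh = feats[idx]            # (N, k, C) neighbor features
    q = feats @ w_q               # (N, D) centroid queries
    key = neigh @ w_k             # (N, k, D) neighbor keys
    val = neigh @ w_v             # (N, k, C) neighbor values
    scores = np.einsum("nd,nkd->nk", q, key) / np.sqrt(q.shape[1])
    weights = softmax(scores, axis=1)             # one weight per neighbor
    agg = np.einsum("nk,nkc->nc", weights, val)   # weighted sum over all neighbors
    return feats + agg, weights, idx              # residual connection on centroid features

def max_pool_aggregate(feats, k):
    """Baseline: per channel, only the largest neighbor value survives."""
    return feats[knn_indices(feats, k)].max(axis=1)

rng = np.random.default_rng(0)
feats = rng.normal(size=(16, 8))      # 16 points with 8-dimensional features
w_q = rng.normal(size=(8, 4))
w_k = rng.normal(size=(8, 4))
w_v = rng.normal(size=(8, 8))         # keeps the channel count so the residual add works
out, weights, idx = local_attention(feats, 4, w_q, w_k, w_v)
pooled = max_pool_aggregate(feats, 4)
```

Stacking several such layers, each calling `knn_indices` on its own input features, is what makes the graph relations update with depth; with max pooling, by contrast, the neighbors that do not hold the per-channel maximum contribute nothing to the output.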

TCPNet: A 3D Point Cloud Classification Model Based on the Advanced CNN and Transformer

3DCTN: 3D Convolution-Transformer Network for Point Cloud Classification

Learning point cloud context information based on 3D transformer for more accurate and efficient classification

Sewer defect detection from 3D point clouds using a transformer-based deep learning model

Point Cloud Classification Using Content-based Transformer via Clustering in Feature Space

UFO-Net: A Linear Attention-Based Network for Point Cloud Classification

VTPNet for 3D deep learning on point cloud

EGCT: Enhanced Graph Convolutional Transformer for 3D Point Cloud Representation Learning

Learning Point Cloud Shapes with Geometric and Topological Structures

PointCAT: Cross-Attention Transformer for point cloud

PU-CTG: A Point Cloud Upsampling Network Using Transformer Fusion and GRU Correction

GTNet: Graph Transformer Network for 3D Point Cloud Classification and Semantic Segmentation

Deep Neural Network for Point Sets Based on Local Feature Integration

3D Point Cloud Classification Segmentation Model Based on Improved PointNet++

RS-TNet: point cloud transformer with relation-shape awareness for fine-grained 3D visual processing

DGC-TnT: Enhancing Point Cloud Object Classification by Dynamic Graph Convolutions With Transformer in Transformer

MPCT: Multiscale Point Cloud Transformer with a Residual Network

TCNet: Multiscale Fusion of Transformer and CNN for Semantic Segmentation of Remote Sensing Images

Multi-view attention-convolution pooling network for 3D point cloud classification

3DPCTN: Two 3D Local-Object Point-Cloud-Completion Transformer Networks Based on Self-Attention and Multi-Resolution