EGCT: Enhanced Graph Convolutional Transformer for 3D Point Cloud Representation Learning

Gang Chen,Wenju Wang,Haoran Zhou,Xiaolin Wang
DOI: https://doi.org/10.1007/s00371-024-03600-2
IF: 2.835
2024-01-01
The Visual Computer
Abstract:It is an urgent problem of high-precision 3D environment perception to carry out representation learning on point cloud data, which complete the synchronous acquisition of local and global feature information. However, current representation learning methods either only focus on how to efficiently learn local features, or capture long-distance dependencies but lose the fine-grained features. Therefore, we explore transformer on topological structures of point cloud graphs, proposing an enhanced graph convolutional transformer (EGCT) method. EGCT construct graph topology for disordered and unstructured point cloud. Then it uses the enhanced point feature representation method to further aggregate the feature information of all neighborhood points, which can compactly represent the features of this local neighborhood graph. Subsequent process, the graph convolutional transformer simultaneously performs self-attention calculations and convolution operations on the point coordinates and features of the neighborhood graph. It efficiently utilizes the spatial geometric information of point cloud objects. Therefore, EGCT learns comprehensive geometric information of point cloud objects, which can help to improve segmentation and classification accuracy. On the ShapeNetPart and ModelNet40 datasets, our EGCT method achieves a mIoU of 86.8 https://github.com/shepherds001/EGCT .
What problem does this paper attempt to address?