Learning point cloud context information based on 3D transformer for more accurate and efficient classification

Yiping Chen, Shuai Zhang, Weisheng Lin, Shuhang Zhang, Wuming Zhang
DOI: https://doi.org/10.1111/phor.12469
2023-12-11
The Photogrammetric Record
Abstract: The figure shows the point cloud classification pipeline, which is similar to PointNet. T‐Net is used to eliminate the effect of point cloud rotation, and a 3D transformer module is used to learn point cloud context information. Finally, an MLP maps the features to the category dimension. Experiments show that our method is accurate and efficient. Point cloud semantic understanding has made remarkable progress along with the development of 3D deep learning. However, aggregating spatial information to improve the network's local feature learning remains a major challenge. Many methods have been proposed to improve local information learning, such as constructing a multi‐area structure to capture information from different areas. However, such methods lose some local information because the features of each point are learned independently. To solve this problem, a new network is proposed that considers the importance of the differences between points in a neighbourhood. Highlighting the differing importance of point features within the neighbourhood enhances the capture of local feature information. First, T‐Net is constructed to learn a point cloud transformation matrix that addresses point cloud disorder. Second, a transformer is used to mitigate the local information loss caused by treating each point in the neighbourhood independently. The experimental results show an overall accuracy of 92.2% on the ModelNet40 dataset and 93.8% on the ModelNet10 dataset.
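The three stages named in the abstract (an input transform, attention over each point's neighbourhood, and an MLP classifier) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, layer sizes, the k-nearest-neighbour grouping, the max-pooling step (borrowed from PointNet), and the random weights are all assumptions made for illustration, and the learned T‐Net is replaced by a fixed identity matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def knn_indices(points, k):
    # Indices of the k nearest points for every point (the point itself included).
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    return np.argsort(d2, axis=1)[:, :k]

def neighbourhood_attention(feats, idx, Wq, Wk, Wv):
    # Scaled dot-product attention restricted to each point's neighbourhood,
    # so neighbours receive different weights instead of being treated alike.
    q = feats @ Wq                      # (N, D)
    k = (feats @ Wk)[idx]               # (N, K, D)
    v = (feats @ Wv)[idx]               # (N, K, D)
    scores = np.einsum("nd,nkd->nk", q, k) / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # (N, K), one weight per neighbour
    return np.einsum("nk,nkd->nd", weights, v), weights

def init_params(dim=16, num_classes=40):
    # Hypothetical sizes and random weights; a real T-Net would predict "T" from the input.
    return {
        "T": np.eye(3),  # assumption: identity stands in for the learned 3x3 transform
        "W_in": rng.standard_normal((3, dim)) * 0.5,
        "Wq": rng.standard_normal((dim, dim)) * 0.1,
        "Wk": rng.standard_normal((dim, dim)) * 0.1,
        "Wv": rng.standard_normal((dim, dim)) * 0.1,
        "W_out": rng.standard_normal((dim, num_classes)) * 0.1,
    }

def classify(points, params, k=8):
    aligned = points @ params["T"]                     # stage 1: input transform
    feats = np.maximum(aligned @ params["W_in"], 0.0)  # per-point embedding (ReLU)
    idx = knn_indices(aligned, k)
    context, weights = neighbourhood_attention(
        feats, idx, params["Wq"], params["Wk"], params["Wv"]
    )                                                  # stage 2: context via attention
    global_feat = (feats + context).max(axis=0)        # assumption: PointNet-style max pooling
    probs = softmax(global_feat @ params["W_out"])     # stage 3: single linear layer in place of the MLP
    return probs, weights

points = rng.standard_normal((128, 3))  # toy cloud of 128 points
params = init_params()
probs, weights = classify(points, params)
```

In this sketch the per-neighbour attention weights play the role of the "different feature importance" of points in a neighbourhood, while the max pooling makes the output independent of the order in which points are listed.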
geosciences, multidisciplinary; geography, physical; remote sensing; imaging science & photographic technology
What problem does this paper attempt to address?