Flexible asymmetric convolutional attention network for LiDAR semantic

Jianwang Gan, Guoying Zhang, Kangkang Kou, Yijing Xiong
DOI: https://doi.org/10.1007/s10489-024-05525-8
IF: 5.3
2024-05-24
Applied Intelligence
Abstract: LiDAR semantic segmentation is an essential task in understanding 3D semantic information. Currently, the most efficient approach to LiDAR data segmentation is to project the point cloud onto a 2D plane and process it with 2D convolutions, and the results of this approach are encouraging. However, the elevation angle of LiDAR is larger than the azimuth angle, so each pixel of the range map captures a region of 3D space that is vertically elongated. If a square convolution kernel is used, the extracted features will be distorted. To address this limitation, we propose the flexible asymmetric convolutional attention network (FACANet), built from flexible asymmetric convolution and lightweight decoding modules. In the encoder, a meta-kernel accounts for the geometric information in 3D space, which helps encode the input range image features effectively. Moreover, a flexible asymmetric convolutional attention block (FACAB) is proposed to capture elongated features in the range image. To facilitate lightweight decoding, the channel uniform interpolation block (CUIB) uses convolutions to reduce channels and bilinear interpolation to upsample features at each resolution. Furthermore, the continuous multiscale feature fusion block (CMFB) is proposed to fuse features at different resolutions. Finally, a convolutional spatial propagation network (CSPN)-based segmentation head is introduced to improve the accuracy of the segmentation results. Quantitative and qualitative experiments are conducted on the public datasets SemanticKITTI and SemanticPOSS, and our approach achieves better accuracy than advanced models.
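The projection step and the per-pixel elongation that motivates asymmetric kernels can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, and the 64 x 2048 image size and the +3 to -25 degree vertical field of view are assumed values commonly used for 64-beam sensors, not figures taken from the paper.

```python
import math


def project_point(x, y, z, height=64, width=2048,
                  fov_up_deg=3.0, fov_down_deg=-25.0):
    """Map one 3D point to (row, col, range) in a spherical range image.

    Image size and field of view are assumed defaults, not values
    reported in the paper.
    """
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("point at the sensor origin has no direction")

    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down

    yaw = math.atan2(y, x)    # azimuth angle
    pitch = math.asin(z / r)  # elevation angle

    # Normalise both angles to [0, 1] and scale to pixel coordinates.
    u = 0.5 * (1.0 - yaw / math.pi) * width
    v = (1.0 - (pitch - fov_down) / fov) * height

    # Clamp to valid indices so points outside the FOV land on the border.
    col = min(width - 1, max(0, int(math.floor(u))))
    row = min(height - 1, max(0, int(math.floor(v))))
    return row, col, r


def pixel_footprint(range_m, height=64, width=2048,
                    fov_up_deg=3.0, fov_down_deg=-25.0):
    """Approximate (vertical, horizontal) arc length in metres covered
    by one pixel at the given range, using the small-angle relation
    arc = range * angle."""
    d_pitch = math.radians(fov_up_deg - fov_down_deg) / height
    d_yaw = 2.0 * math.pi / width
    return range_m * d_pitch, range_m * d_yaw


if __name__ == "__main__":
    print(project_point(10.0, 0.0, 0.0))
    vertical, horizontal = pixel_footprint(10.0)
    print(vertical / horizontal)
```

Under these assumed settings each pixel spans roughly 2.5 times more vertical than horizontal angle, which is the kind of anisotropy a square kernel ignores.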
computer science, artificial intelligence