A Transformer-based Real-time LiDAR Semantic Segmentation Method for Restricted Mobile Devices
Chang Liu, Jin Zhao, Nianyi Sun
DOI: https://doi.org/10.1016/j.jfranklin.2024.01.033
IF: 4.246
2024-01-29
Journal of the Franklin Institute
Abstract: In scene understanding, LiDAR-based semantic segmentation is a crucial task for describing object boundaries and sizes. At the same time, real-time autonomous driving relies heavily on 3D information for navigation. With these considerations in mind, we propose a novel real-time LiDAR semantic segmentation method that projects 3D point clouds into a spherical range image and performs segmentation using 2D convolution. Building upon the success of the Transformer on 2D images, we explore its potential on 3D point clouds. To leverage the advantages of both convolution and the Transformer, we introduce the Multi-Head Self-Attention (MHSA) mechanism into LiDAR semantic segmentation as a means of enhancing 2D convolution. This results in a lightweight model with three key insights: i) a parallel semantic segmentation architecture that combines Transformer and convolution; ii) a channel-splitting scheme that separates the Transformer and convolution branches; iii) an adaptive sliding window that strengthens edge dependencies when projecting 3D point clouds into range images. We evaluate our method incrementally, both qualitatively and quantitatively, on the SemanticKITTI and SemanticPOSS datasets. The experimental results demonstrate that the proposed method achieves superior performance in 3D semantic segmentation and LiDAR mapping compared with the state of the art.
automation & control systems, engineering, electrical & electronic, multidisciplinary, mathematics, interdisciplinary applications
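The abstract describes projecting 3D point clouds into a spherical range image before 2D segmentation. The following is a minimal sketch of that kind of projection, not the authors' implementation. The function name `spherical_projection`, the default image size (64 x 2048), and the vertical field of view (+3 to -25 degrees) are illustrative assumptions that do not come from the abstract. The adaptive sliding window and the Transformer branch are not modeled here.

```python
import math


def spherical_projection(points, height=64, width=2048,
                         fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project (x, y, z) points onto a height x width range image.

    Hypothetical helper: image size and field of view are assumed values.
    Returns a 2D list of ranges; pixels with no return hold -1.0.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down  # total vertical field of view

    image = [[-1.0] * width for _ in range(height)]
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # skip degenerate points at the sensor origin
        yaw = math.atan2(y, x)    # azimuth angle in [-pi, pi]
        pitch = math.asin(z / r)  # elevation angle
        u = 0.5 * (1.0 - yaw / math.pi) * width        # column coordinate
        v = (1.0 - (pitch - fov_down) / fov) * height  # row coordinate
        col = min(width - 1, max(0, int(math.floor(u))))
        row = min(height - 1, max(0, int(math.floor(v))))
        # Keep the closest return when several points share a pixel.
        if image[row][col] < 0.0 or r < image[row][col]:
            image[row][col] = r
    return image
```

Each pixel stores the range of the nearest point that falls into it, so the resulting image can be passed to an ordinary 2D network. Points at the left and right borders of the image are neighbors in 3D (the azimuth wraps around), which is the kind of edge relationship the abstract's adaptive sliding window is meant to address.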