LENet: Lightweight And Efficient LiDAR Semantic Segmentation Using Multi-Scale Convolution Attention

Ben Ding
2023-06-19
Abstract: LiDAR-based semantic segmentation is critical in the fields of robotics and autonomous driving as it provides a comprehensive understanding of the scene. This paper proposes a lightweight and efficient projection-based semantic segmentation network called LENet with an encoder-decoder structure for LiDAR-based semantic segmentation. The encoder is composed of a novel multi-scale convolutional attention (MSCA) module with varying receptive field sizes to capture features. The decoder employs an Interpolation And Convolution (IAC) mechanism utilizing bilinear interpolation for upsampling multi-resolution feature maps and integrating previous and current dimensional features through a single convolution layer. This approach significantly reduces the network's complexity while also improving its accuracy. Additionally, we introduce multiple auxiliary segmentation heads to further refine the network's accuracy. Extensive evaluations on publicly available datasets, including SemanticKITTI, SemanticPOSS, and nuScenes, show that our proposed method is lighter, more efficient, and robust compared to state-of-the-art semantic segmentation methods. Full implementation is available at https://github.com/fengluodb/LENet.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper addresses how to improve efficiency and reduce network complexity while maintaining high accuracy in LiDAR-based semantic segmentation for autonomous driving and robotics. It proposes LENet, a lightweight and efficient projection-based semantic segmentation network, and targets the following points:

1. **High computational complexity**: Traditional point cloud processing methods (such as point-based and voxel-based methods) are computationally expensive on large-scale point clouds and struggle to run in real time. The paper reduces the network's computational complexity with a new multi-scale convolutional attention (MSCA) module and an Interpolation And Convolution (IAC) mechanism.
2. **Accuracy improvement**: Existing methods achieve good results in some scenarios, but their performance in complex environments still leaves room for improvement. The paper further improves accuracy by introducing multiple auxiliary segmentation heads, achieving state-of-the-art performance on multiple public datasets.
3. **Real-time processing ability**: Autonomous driving and robotics applications must process sensor data in real time. LENet reaches a processing speed of 26 frames per second while maintaining high accuracy, which meets this requirement.
4. **Robustness**: Existing methods may perform poorly under varying lighting and weather conditions. Because LiDAR data is much less sensitive to these factors, the paper performs semantic segmentation on LiDAR point clouds, which improves the robustness of the system.
In summary, the main goal of this paper is to develop an efficient, lightweight, and accurate LiDAR semantic segmentation network that meets the real-time and accuracy requirements of autonomous driving and robotics applications.
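To make the IAC idea from the abstract more concrete, here is a minimal sketch of "bilinear upsampling followed by a single fusing convolution", written in plain Python on nested lists. It is an illustration only, not the authors' implementation: the function names (`bilinear_upsample`, `conv1x1`, `iac_fuse`), the 1x1 kernel size, and the corner-aligned sampling grid are all assumptions chosen to keep the example short.

```python
def bilinear_upsample(fmap, out_h, out_w):
    """Bilinearly resize one channel (a list of rows) to out_h x out_w.

    Assumption: corner-aligned sampling, i.e. the corner pixels of the
    input and output grids coincide.
    """
    in_h, in_w = len(fmap), len(fmap[0])
    out = []
    for i in range(out_h):
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = fmap[y0][x0] * (1 - wx) + fmap[y0][x1] * wx
            bottom = fmap[y1][x0] * (1 - wx) + fmap[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


def conv1x1(channels, weights):
    """Apply a 1x1 convolution (a per-pixel linear mix of channels).

    channels: list of C feature maps, each H x W.
    weights:  list of output-channel weight vectors, each of length C.
    """
    h, w = len(channels[0]), len(channels[0][0])
    return [
        [
            [sum(wc * ch[i][j] for wc, ch in zip(wvec, channels)) for j in range(w)]
            for i in range(h)
        ]
        for wvec in weights
    ]


def iac_fuse(coarse, fine, weights):
    """Upsample the coarse maps to the fine resolution, concatenate along
    the channel axis, and fuse with a single convolution (1x1 here, which
    is an assumption of this sketch)."""
    h, w = len(fine[0]), len(fine[0][0])
    upsampled = [bilinear_upsample(ch, h, w) for ch in coarse]
    return conv1x1(upsampled + fine, weights)


# Usage: fuse one 2x2 coarse channel with one 3x3 fine channel.
coarse = [[[0.0, 2.0], [4.0, 6.0]]]
fine = [[[1.0] * 3 for _ in range(3)]]
fused = iac_fuse(coarse, fine, weights=[[0.5, 0.5]])
```

The appeal of this kind of decoder step is that the upsampling itself has no learnable parameters, so the only weights are those of the single fusing convolution; a real network would use a tensor library and learned weights rather than hand-set values.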