Are We Ready for Real-Time LiDAR Semantic Segmentation in Autonomous Driving?

Samir Abou Haidar, Alexandre Chariot, Mehdi Darouich, Cyril Joly, Jean-Emmanuel Deschaud
2024-10-11
Abstract: Within a perception framework for autonomous mobile and robotic systems, semantic analysis of 3D point clouds typically generated by LiDARs is key to numerous applications, such as object detection and recognition, and scene reconstruction. Scene semantic segmentation can be achieved by directly integrating 3D spatial data with specialized deep neural networks. Although this type of data provides rich geometric information regarding the surrounding environment, it also presents numerous challenges: its unstructured and sparse nature, its unpredictable size, and its demanding computational requirements. These characteristics hinder the real-time semantic analysis, particularly on resource-constrained hardware architectures that constitute the main computational components of numerous robotic applications. Therefore, in this paper, we investigate various 3D semantic segmentation methodologies and analyze their performance and capabilities for resource-constrained inference on embedded NVIDIA Jetson platforms. We evaluate them for a fair comparison through a standardized training protocol and data augmentations, providing benchmark results on the Jetson AGX Orin and AGX Xavier series for two large-scale outdoor datasets: SemanticKITTI and nuScenes.
Robotics, Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper attempts to address the challenge of achieving real-time LiDAR semantic segmentation on resource-constrained embedded hardware architectures, such as the NVIDIA Jetson platform. Specifically, the paper focuses on the following aspects:

1. **Characteristics of 3D Point Cloud Data**: 3D point cloud data is unstructured, sparse, and of unpredictable size, making real-time semantic analysis on resource-constrained hardware very challenging.
2. **Limitations of Computational Resources**: Computational resources in many robotic applications are limited by system size, power consumption, and heat dissipation, further increasing the difficulty of achieving real-time semantic segmentation.
3. **Limitations of Existing Methods**: While existing 3D semantic segmentation methods perform well on high-performance computing platforms, their performance and efficiency on embedded devices are often suboptimal.

To address these issues, the paper evaluates and compares various 3D semantic segmentation methods, aiming to find solutions that can run efficiently on embedded platforms. Specifically, the paper:

- **Selected Various 3D Neural Networks**: Including projection methods (e.g., SalsaNext), point-based methods (e.g., WaffleIron), and sparse convolution-based methods (e.g., MinkowskiUNet42 and SPVCNN), and created lightweight variants of each method.
- **Designed Standardized Training Protocols and Data Augmentation**: Ensuring that all models are trained and tested under the same conditions to fairly compare their performance.
- **Conducted Detailed Performance Evaluations on Jetson AGX Orin and AGX Xavier Platforms**: Including metrics such as inference time, preprocessing time, memory consumption, and power consumption.

Through this research, the paper hopes to provide valuable references and guidance for achieving real-time 3D semantic segmentation on embedded platforms.
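To make the inference-time metric and the notion of "real-time" concrete, here is a minimal Python sketch of a latency harness. It is not the paper's benchmarking code: the helper names `measure_latency_ms` and `meets_realtime`, the warm-up and run counts, the dummy workload, and the 10 Hz scan rate are all assumptions chosen for illustration. The real-time criterion used here (process one scan before the next one arrives) is likewise one common working definition, not one taken from the paper.

```python
import math
import statistics
import time


def measure_latency_ms(fn, warmup=3, runs=20):
    """Time fn() over several runs and return summary statistics in milliseconds.

    Hypothetical helper. On a GPU, fn must block until the device has finished
    (in PyTorch, for example, by calling torch.cuda.synchronize() inside fn),
    otherwise only the kernel launch time is measured.
    """
    # Warm-up calls are discarded so one-time costs (caching, lazy init) do not skew results.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)

    samples.sort()
    # Nearest-rank 95th percentile of the sorted samples.
    p95_index = max(0, math.ceil(0.95 * len(samples)) - 1)
    return {
        "mean_ms": statistics.fmean(samples),
        "p95_ms": samples[p95_index],
        "max_ms": samples[-1],
    }


def meets_realtime(latency_ms, sensor_hz):
    """Return True if one scan is processed before the next scan arrives."""
    budget_ms = 1000.0 / sensor_hz
    return latency_ms <= budget_ms


def fake_inference():
    """Stand-in workload; a real benchmark would run preprocessing plus the network."""
    return sum(i * i for i in range(10_000))


if __name__ == "__main__":
    stats = measure_latency_ms(fake_inference)
    # Assumed scan rate of 10 Hz, which gives a 100 ms budget per scan.
    print(stats, meets_realtime(stats["p95_ms"], sensor_hz=10.0))
```

Checking a tail statistic such as the 95th percentile rather than the mean is a deliberate choice in this sketch: point clouds vary in size from scan to scan, so a model whose average latency fits the budget can still miss it on larger scans.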