EfficientLPS: Efficient LiDAR Panoptic Segmentation

Kshitij Sirohi, Rohit Mohan, Daniel Büscher, Wolfram Burgard, Abhinav Valada
DOI: https://doi.org/10.48550/arXiv.2102.08009
2021-11-04
Abstract: Panoptic segmentation of point clouds is a crucial task that enables autonomous vehicles to comprehend their vicinity using their highly accurate and reliable LiDAR sensors. Existing top-down approaches tackle this problem by either combining independent task-specific networks or translating methods from the image domain, ignoring the intricacies of LiDAR data and thus often resulting in sub-optimal performance. In this paper, we present the novel top-down Efficient LiDAR Panoptic Segmentation (EfficientLPS) architecture that addresses multiple challenges in segmenting LiDAR point clouds, including distance-dependent sparsity, severe occlusions, large scale-variations, and re-projection errors. EfficientLPS comprises a novel shared backbone that encodes with strengthened geometric transformation modeling capacity and aggregates semantically rich range-aware multi-scale features. It incorporates new scale-invariant semantic and instance segmentation heads along with the panoptic fusion module, which is supervised by our proposed panoptic periphery loss function. Additionally, we formulate a regularized pseudo labeling framework to further improve the performance of EfficientLPS by training on unlabelled data. We benchmark our proposed model on two large-scale LiDAR datasets: nuScenes, for which we also provide ground truth annotations, and SemanticKITTI. Notably, EfficientLPS sets the new state-of-the-art on both these datasets.
Computer Vision and Pattern Recognition, Machine Learning, Robotics
What problem does this paper attempt to address?
This paper addresses the challenges of panoptic segmentation of LiDAR point clouds for autonomous vehicles. Specifically, existing methods struggle with the following properties of LiDAR data:

1. **Distance-dependent Sparsity**: Point density decreases with distance from the sensor, so distant objects are represented by fewer points, which makes feature extraction difficult.
2. **Severe Occlusions**: Objects in LiDAR point clouds are often occluded by other objects, which complicates feature extraction and recognition.
3. **Large Scale Variations**: After projection onto a 2D plane, objects at different distances differ significantly in scale, which affects the model's ability to capture multi-scale features.
4. **Re-projection Errors**: Re-projecting from the 2D projection domain back to 3D space can introduce errors such as inaccurate boundaries, which degrade the final segmentation.

To address these challenges, the paper proposes a new top-down, efficient LiDAR panoptic segmentation architecture (EfficientLPS), which contains several key components:

- **Shared Backbone**: It includes a novel Proximity Convolution Module (PCM) that enhances the geometric transformation modeling ability and aggregates semantically rich multi-scale features.
- **Range-aware Feature Pyramid Network (RFPN)**: By fusing the output of the Range Encoder Network (REN), it enhances the ability to distinguish objects at different distances.
- **Scale-invariant Semantic and Instance Segmentation Heads**: They generate the semantic and instance segmentation results, respectively.
- **Panoptic Fusion Module**: It combines the outputs of the semantic and instance segmentation heads to generate the final panoptic segmentation result, and is supervised by the proposed Panoptic Periphery Loss Function to optimize the boundaries between foreground and background pixels.
- **Regularized Pseudo Labeling Framework**: It generates pseudo-labels for unlabeled data to further improve model performance.

Through these innovations, EfficientLPS achieves new state-of-the-art performance on two large-scale LiDAR datasets, nuScenes and SemanticKITTI.
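
The projection and re-projection steps behind challenges 3 and 4 can be illustrated with a generic spherical projection, the range-image mapping commonly used by projection-based LiDAR methods. This is a minimal sketch and not the paper's implementation: the function names (`pixel_of`, `project`, `backproject_labels`), the 64 x 2048 image size, and the +3°/-25° vertical field of view are all assumptions chosen for illustration.

```python
import math

# Assumed sensor geometry for illustration only (not taken from the paper).
HEIGHT, WIDTH = 64, 2048
FOV_UP_DEG, FOV_DOWN_DEG = 3.0, -25.0


def pixel_of(point, height=HEIGHT, width=WIDTH,
             fov_up_deg=FOV_UP_DEG, fov_down_deg=FOV_DOWN_DEG):
    """Map one (x, y, z) point to (row, col, range); None for a point at the origin."""
    x, y, z = point
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        return None
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    yaw = math.atan2(y, x)    # horizontal angle in [-pi, pi]
    pitch = math.asin(z / r)  # vertical angle
    u = 0.5 * (1.0 - yaw / math.pi) * width
    v = (1.0 - (pitch - fov_down) / (fov_up - fov_down)) * height
    # Clamp to the image so points outside the assumed field of view land on the border.
    col = min(width - 1, max(0, int(u)))
    row = min(height - 1, max(0, int(v)))
    return row, col, r


def project(points, height=HEIGHT, width=WIDTH):
    """Build a range image and an index image, keeping the closest point per pixel."""
    range_image = [[-1.0] * width for _ in range(height)]
    index_image = [[-1] * width for _ in range(height)]
    for i, p in enumerate(points):
        hit = pixel_of(p, height, width)
        if hit is None:
            continue
        row, col, r = hit
        if range_image[row][col] < 0.0 or r < range_image[row][col]:
            range_image[row][col] = r
            index_image[row][col] = i
    return range_image, index_image


def backproject_labels(points, label_image, height=HEIGHT, width=WIDTH):
    """Give every 3D point the label stored at its pixel (naive re-projection)."""
    labels = []
    for p in points:
        hit = pixel_of(p, height, width)
        labels.append(None if hit is None else label_image[hit[0]][hit[1]])
    return labels


# Two points on the same ray: only the nearer one survives in the 2D image.
points = [(10.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
range_image, index_image = project(points)
```

In this sketch, both example points fall into the same pixel, so any label predicted for that pixel is copied to both points during back-projection, even if the farther point belongs to a different object. That is one simple way re-projection errors of the kind described above can arise.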