Prototype-Voxel Contrastive Learning for LiDAR Point Cloud Panoptic Segmentation

Minzhe Liu, Qiang Zhou, Hengshuang Zhao, Jianing Li, Yuan Du, Kurt Keutzer, Li Du, Shanghang Zhang
DOI: https://doi.org/10.1109/icra46639.2022.9811638
2022-01-01
Abstract: LiDAR point cloud panoptic segmentation, which combines semantic and instance segmentation, plays a critical role in fine-grained scene understanding for autonomous driving. Existing 3D voxelized approaches either use 3D sparse convolution, which focuses only on local scene understanding, or add an extra, time-consuming PointNet branch to capture global feature structures. To address these limitations, we propose an end-to-end Prototype-Voxel Contrastive Learning (PVCL) framework for learning stable and discriminative semantic representations, which includes voxel-level and prototype-level contrastive learning (CL). The voxel-level CL decreases intra-class distance and increases inter-class distance among sample representations, while the prototype-level CL further reduces the dependence of CL on negative sampling and avoids the influence of outliers from the same class, making PVCL more effective for outdoor point cloud panoptic segmentation. Extensive experiments are conducted on the public point cloud panoptic segmentation datasets Semantic-KITTI and nuScenes, where evaluations and ablation studies demonstrate that PVCL achieves superior performance compared with the state of the art. Our approach ranks at the top of the public Semantic-KITTI leaderboard at the time of submission, and surpasses the published second-ranked method, EfficientLPS, by 1.7% in PQ.
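The abstract does not give the loss formulas, so the following is only an illustrative sketch of what a prototype-level contrastive objective can look like, not the authors' implementation. It assumes that a class prototype is the mean of the L2-normalized features of that class, and that each feature is pulled toward its own prototype with an InfoNCE-style softmax over prototype similarities. The function names, the `temperature` value, and the NumPy formulation are all hypothetical choices made for this example.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    # Scale each row to (approximately) unit length.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def class_prototypes(features, labels, num_classes):
    # Assumption: a prototype is the re-normalized mean of the
    # normalized features belonging to one class.
    feats = l2_normalize(features)
    protos = np.zeros((num_classes, feats.shape[1]))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            protos[c] = feats[mask].mean(axis=0)
    return l2_normalize(protos)


def prototype_contrastive_loss(features, labels, prototypes, temperature=0.1):
    # InfoNCE-style loss: the positive for each feature is its own
    # class prototype; all other prototypes act as negatives.
    feats = l2_normalize(features)
    logits = feats @ prototypes.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), labels].mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two toy "classes" of 2-D features clustered around different directions.
    feats = np.vstack([
        rng.normal([5.0, 0.0], 0.1, (10, 2)),
        rng.normal([0.0, 5.0], 0.1, (10, 2)),
    ])
    labels = np.array([0] * 10 + [1] * 10)
    protos = class_prototypes(feats, labels, num_classes=2)
    print(prototype_contrastive_loss(feats, labels, protos))
```

Because the negatives here are a fixed set of class prototypes rather than sampled voxels, the loss does not depend on which negatives happen to be drawn, and averaging over a class dampens the effect of individual outlier features. This mirrors the motivation stated in the abstract, under the assumptions above.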