Abstract: With the rapid advances in autonomous driving, it has become critical to equip the sensing system with more holistic 3D perception. However, existing works focus on parsing either the objects (e.g., cars and pedestrians) or the scenes (e.g., trees and buildings) from the LiDAR sensor. In this work, we address the task of LiDAR-based panoptic segmentation, which aims to parse both objects and scenes in a unified manner. As one of the first endeavors towards this new and challenging task, we propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm. In particular, DS-Net has three appealing properties: 1) Strong backbone design. DS-Net adopts the cylinder convolution, which is specifically designed for LiDAR point clouds. The extracted features are shared by the semantic branch and the instance branch, the latter of which operates in a bottom-up clustering style. 2) Dynamic shifting for complex point distributions. We observe that commonly used clustering algorithms such as BFS or DBSCAN are incapable of handling complex autonomous driving scenes with non-uniform point cloud distributions and varying instance sizes. Thus, we present an efficient learnable clustering module, dynamic shifting, which adapts kernel functions on the fly for different instances. 3) Consensus-driven fusion. Finally, consensus-driven fusion is used to resolve disagreements between semantic and instance predictions. To comprehensively evaluate the performance of LiDAR-based panoptic segmentation, we construct and curate benchmarks from two large-scale autonomous driving LiDAR datasets, SemanticKITTI and nuScenes. Extensive experiments demonstrate that our proposed DS-Net achieves superior accuracy over current state-of-the-art methods. Notably, we achieve 1st place on the public leaderboard of SemanticKITTI, outperforming 2nd place by 2.6% in terms of the PQ metric.
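
The abstract does not spell out how consensus-driven fusion resolves disagreements between the two branches. One plausible reading, given here only as a minimal sketch, is a per-instance majority vote: every point in a predicted instance takes the semantic label that is most common within that instance. The function name `consensus_fusion`, the `ignore_id` convention for points without an instance, and the toy labels are all assumptions made for illustration and are not taken from the source.

```python
from collections import Counter


def consensus_fusion(semantic_labels, instance_ids, ignore_id=0):
    """Hypothetical majority-vote fusion of per-point predictions.

    semantic_labels: per-point semantic class predictions.
    instance_ids: per-point instance cluster ids; `ignore_id` marks points
        that belong to no instance (an assumed convention).
    Returns a new list of semantic labels in which all points of one
    instance share that instance's most frequent semantic label.
    """
    # Count the semantic votes inside each instance.
    votes = {}
    for sem, inst in zip(semantic_labels, instance_ids):
        if inst == ignore_id:
            continue
        votes.setdefault(inst, Counter())[sem] += 1

    # Pick the winning label of each instance.
    winners = {inst: counts.most_common(1)[0][0] for inst, counts in votes.items()}

    # Points without an instance keep their original semantic label.
    return [winners.get(inst, sem) for sem, inst in zip(semantic_labels, instance_ids)]


# Toy example: the third point of instance 1 disagrees with its neighbours.
semantic = ["car", "car", "truck", "road", "person", "person"]
instance = [1, 1, 1, 0, 2, 2]
fused = consensus_fusion(semantic, instance)
```

In this toy input the single "truck" vote inside instance 1 is overruled by the two "car" votes, while the "road" point, which has no instance, is left unchanged.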
