Abstract: With the rapid advances in autonomous driving, it becomes critical to equip the vehicle's sensing system with more holistic 3D perception. However, widely explored tasks like 3D detection or point cloud semantic segmentation focus on parsing either objects (e.g. cars and pedestrians) or scenes (e.g. trees and buildings). In this work, we address the challenging task of LiDAR-based Panoptic Segmentation, which aims to parse both objects and scenes in a unified manner. In particular, we propose the Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm. DS-Net features a dynamic shifting module for complex LiDAR point cloud distributions. We observe that commonly used clustering algorithms like BFS or DBSCAN are incapable of handling complex autonomous driving scenes with non-uniform point cloud distributions and varying instance sizes. Thus, we present an efficient learnable clustering module, dynamic shifting, which adapts kernel functions on the fly for different instances. To further exploit temporal information, we extend the single-scan processing framework to its temporal version, namely 4D-DS-Net, for the task of 4D Panoptic Segmentation, where the same instance across multiple frames should be given the same ID prediction. Instead of naïvely appending a tracking module to DS-Net, we propose to solve 4D panoptic segmentation in a more unified way. Specifically, 4D-DS-Net first constructs a 4D data volume by aligning consecutive LiDAR scans, upon which temporally unified instance clustering is performed to obtain the final results. Extensive experiments on two large-scale autonomous driving LiDAR datasets, SemanticKITTI and Panoptic nuScenes, demonstrate the effectiveness and superior performance of the proposed solution. The code is publicly available at https://github.com/hongfz16/DS-Net.
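
The alignment step mentioned in the abstract can be illustrated with a minimal sketch. This is not the DS-Net implementation; it only assumes that each scan comes with a 4x4 sensor-to-world pose matrix, and the function name `align_scans` and the output layout (x, y, z, frame index) are hypothetical choices made for illustration.

```python
import numpy as np

def align_scans(scans, poses):
    """Stack consecutive scans in the coordinate frame of the last scan.

    scans: list of (N_i, 3) arrays, points in each scan's own sensor frame.
    poses: list of (4, 4) arrays, sensor-to-world transforms (assumed given).
    Returns an (sum N_i, 4) array: x, y, z in the last scan's frame,
    followed by the index of the scan each point came from.
    """
    # Hypothetical convention: the most recent scan is the reference frame.
    world_to_ref = np.linalg.inv(poses[-1])
    aligned = []
    for t, (pts, pose) in enumerate(zip(scans, poses)):
        # Homogeneous coordinates so one 4x4 matrix applies rotation and translation.
        homo = np.hstack([pts, np.ones((len(pts), 1))])
        # Sensor frame t -> world -> reference sensor frame.
        in_ref = (world_to_ref @ pose @ homo.T).T[:, :3]
        frame_idx = np.full((len(pts), 1), float(t))
        aligned.append(np.hstack([in_ref, frame_idx]))
    return np.vstack(aligned)

# Usage: a static point seen from two sensor positions one metre apart along x.
pose0 = np.eye(4)
pose1 = np.eye(4)
pose1[0, 3] = 1.0
scan0 = np.array([[2.0, 0.0, 0.0]])  # point as seen in frame 0
scan1 = np.array([[1.0, 0.0, 0.0]])  # the same point as seen in frame 1
volume = align_scans([scan0, scan1], [pose0, pose1])
```

After alignment, points belonging to a static object coincide spatially across frames, which is what allows a single clustering pass over the stacked points to assign one instance ID across time.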

Learning Temporal Variations for 4D Point Cloud Segmentation

Learning Spatial and Temporal Variations for 4D Point Cloud Segmentation

Temporal Feature Matching and Propagation for Semantic Segmentation on 3D Point Cloud Sequences

LiDAR Video Object Segmentation with Dynamic Kernel Refinement

Pass3d: Precise And Accelerated Semantic Segmentation For 3d Point Cloud

3D Object Segmentation Using Cross-Window Point Transformer with Latent Semantic Boundary Guidance

Human Segmentation with Dynamic LiDAR Data

LiDAR-based 4D Panoptic Segmentation via Dynamic Shifting Network

SVQNet: Sparse Voxel-Adjacent Query Network for 4D Spatio-Temporal LiDAR Semantic Segmentation

4D-Former: Multimodal 4D Panoptic Segmentation

A Spatiotemporal Correspondence Approach to Unsupervised LiDAR Segmentation with Traffic Applications

TASeg: Temporal Aggregation Network for LiDAR Semantic Segmentation

Unified 3D and 4D Panoptic Segmentation via Dynamic Shifting Networks

Receding Moving Object Segmentation in 3D LiDAR Data Using Sparse 4D Convolutions

LiDAR-Based Real-Time Panoptic Segmentation via Spatiotemporal Sequential Data Fusion

Future Does Matter: Boosting 3D Object Detection with Temporal Motion Estimation in Point Cloud Sequences

Pseudo-LiDAR Point Cloud Interpolation Based on 3D Motion Representation and Spatial Supervision

Multi-Scale Point-Wise Convolutional Neural Networks for 3D Object Segmentation From LiDAR Point Clouds in Large-Scale Environments

Fast Context-Awareness Encoder for LiDAR Point Semantic Segmentation

MemorySeg: Online LiDAR Semantic Segmentation with a Latent Memory

SegNet4D: Efficient Instance-Aware 4D Semantic Segmentation for LiDAR Point Cloud