Unified 3D and 4D Panoptic Segmentation via Dynamic Shifting Networks
Fangzhou Hong,Lingdong Kong,Hui Zhou,Xinge Zhu,Hongsheng Li,Ziwei Liu
DOI: https://doi.org/10.1109/tpami.2023.3349304
IF: 23.6
2024-01-01
IEEE Transactions on Pattern Analysis and Machine Intelligence
Abstract:With the rapid advances in autonomous driving, it becomes critical to equip its sensing system with more holistic 3D perception. However, widely explored tasks like 3D detection or point cloud semantic segmentation focus on parsing either the objects (e.g. cars and pedestrians) or scenes (e.g. trees and buildings). In this work, we propose to address the challenging task of LiDAR-based Panoptic Segmentation, which aims to parse both objects and scenes in a unified manner. In particular, we propose Dynamic Shifting Network (DS-Net), which serves as an effective panoptic segmentation framework in the point cloud realm. DS-Net features a dynamic shifting module for complex LiDAR point cloud distributions. We observe that commonly used clustering algorithms like BFS or DBSCAN are incapable of handling complex autonomous driving scenes with non-uniform point cloud distributions and varying instance sizes. Thus, we present an efficient learnable clustering module, dynamic shifting, which adapts kernel functions on the fly for different instances. To further explore the temporal information, we extend the single-scan processing framework to its temporal version, namely 4D-DS-Net, for the task of 4D Panoptic Segmentation, where the same instance across multiple frames should be given the same ID prediction. Instead of naïvely appending a tracking module to DS-Net, we propose to solve the 4D panoptic segmentation in a more unified way. Specifically, 4D-DS-Net first constructs 4D data volume by aligning consecutive LiDAR scans, upon which the temporally unified instance clustering is performed to obtain the final results. Extensive experiments on two large-scale autonomous driving LiDAR datasets, SemanticKITTI and Panoptic nuScenes, are conducted to demonstrate the effectiveness and superior performance of the proposed solution. The code is publicly available at https://github.com/hongfz16/DS-Net.
computer science, artificial intelligence,engineering, electrical & electronic