Learning Temporal Variations for 4D Point Cloud Segmentation

Hanyu Shi, Jiacheng Wei, Hao Wang, Fayao Liu, Guosheng Lin
DOI: https://doi.org/10.1007/s11263-024-02149-w
IF: 13.369
2024-06-23
International Journal of Computer Vision
Abstract: LiDAR-based 3D scene perception is a fundamental and important task for autonomous driving. Most state-of-the-art methods for LiDAR-based 3D recognition tasks focus on single-frame 3D point cloud data and ignore temporal information. We argue that the temporal information across frames provides crucial knowledge for 3D scene perception, especially in the driving scenario. In this paper, we focus on spatial and temporal variations to better explore temporal information across 3D frames. We design a temporal variation-aware interpolation module and a temporal voxel-point refinement module to capture the temporal variation in the 4D point cloud. The temporal variation-aware interpolation module generates local features from the previous and current frames by capturing spatial coherence and temporal variation information. The temporal voxel-point refinement module builds a temporal graph on the 3D point cloud sequences and captures the temporal variation with a graph convolution module, transforming coarse voxel-level predictions into fine point-level predictions. With our proposed modules, we achieve superior performance on SemanticKITTI, SemanticPOSS, and NuScenes.
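The idea of a temporal graph between consecutive frames can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `temporal_knn_edges` and `temporal_variation_features`, the k-nearest-neighbor edge rule, the feature-difference edge signal, and the max aggregation are all assumptions chosen only to show one plausible way that current-frame points could be linked to previous-frame points and a per-point "temporal variation" feature computed. The learned weights of an actual graph convolution are omitted.

```python
import numpy as np

def temporal_knn_edges(curr_xyz, prev_xyz, k=3):
    """Hypothetical helper: for each current-frame point, return the
    indices of its k nearest points in the previous frame."""
    # Pairwise squared distances, shape (N_curr, N_prev).
    diff = curr_xyz[:, None, :] - prev_xyz[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)
    # Keep the k closest previous-frame points per current-frame point.
    return np.argsort(dist2, axis=1)[:, :k]

def temporal_variation_features(curr_feat, prev_feat, neighbors):
    """Hypothetical helper: aggregate feature differences along the
    temporal edges (max over neighbors) and append them to the
    current-frame features."""
    # prev_feat[neighbors] has shape (N_curr, k, C).
    variation = prev_feat[neighbors] - curr_feat[:, None, :]
    return np.concatenate([curr_feat, variation.max(axis=1)], axis=1)

# Toy data (assumed shapes): 100 previous-frame and 80 current-frame points
# with 4-dimensional features.
rng = np.random.default_rng(0)
prev_xyz = rng.normal(size=(100, 3))
curr_xyz = rng.normal(size=(80, 3))
prev_feat = rng.normal(size=(100, 4))
curr_feat = rng.normal(size=(80, 4))

neighbors = temporal_knn_edges(curr_xyz, prev_xyz, k=3)
fused = temporal_variation_features(curr_feat, prev_feat, neighbors)
print(fused.shape)  # (80, 8): original features plus aggregated variation
```

The brute-force distance matrix keeps the sketch short; real LiDAR frames contain on the order of 100,000 points, so a spatial index or voxel hashing would be needed in practice.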
computer science, artificial intelligence