Hardness-Aware Scene Synthesis for Semi-Supervised 3D Object Detection

Shuai Zeng, Wenzhao Zheng, Jiwen Lu, Haibin Yan
DOI: https://doi.org/10.1109/tmm.2024.3396297
IF: 7.3
2024-01-01
IEEE Transactions on Multimedia
Abstract: 3D object detection aims to recover the 3D information of objects of interest and serves as a fundamental task in autonomous driving perception. Its performance depends greatly on the scale of labeled training data, yet high-quality annotations for point cloud data are costly to obtain. This motivates the use of semi-supervised learning, which can additionally exploit unlabeled data to further boost performance. While 2D semi-supervised learning methods focus on generating pseudo-labels for existing unlabeled samples as supplements for training, the structural nature of 3D point cloud data facilitates the composition of objects and backgrounds to synthesize realistic scenes. Motivated by this, we propose a hardness-aware scene synthesis (HASS) method that generates adaptive synthetic scenes to improve the generalization of detection models. We obtain pseudo-labels for unlabeled objects and generate diverse scenes with different compositions of objects and backgrounds. As scene synthesis is sensitive to the quality of pseudo-labels, we further propose a hardness-aware strategy to reduce the effect of low-quality pseudo-labels. In addition, we maintain a dynamic pseudo-database to ensure the diversity and quality of synthetic scenes. Extensive experimental results on the widely used KITTI and Waymo datasets demonstrate the superiority of the proposed HASS method, which outperforms existing semi-supervised learning methods on 3D object detection. We also conduct a series of experiments to analyze the effectiveness of our method, including pseudo-label quality analysis, the effect of different filtering and thresholding strategies, and ablations of each component.