CMT: Co-training Mean-Teacher for Unsupervised Domain Adaptation on 3D Object Detection

Shijie Chen, Junbao Zhuo, Xin Li, Haizhuang Liu, Rongquan Wang, Jiansheng Chen, Huimin Ma
DOI: https://doi.org/10.1145/3664647.3681558
2024-01-01
Abstract: LiDAR-based 3D detection, an essential technique in multimedia applications such as augmented reality and autonomous driving, has made great progress in recent years. However, the performance of a well-trained 3D detector degrades considerably when it is deployed in unseen environments, owing to the severe domain gap. Traditional unsupervised domain adaptation methods, including co-training and mean-teacher frameworks, do not bridge this gap effectively: they struggle with noisy, incomplete pseudo-labels and fail to capture domain-invariant features. In this work, we introduce a novel Co-training Mean-Teacher (CMT) framework for unsupervised domain adaptation in 3D object detection. Our framework enhances adaptation by leveraging both source and target domain data to construct a hybrid domain that aligns domain-specific features more effectively. We employ hard instance mining to enrich the target domain feature distribution and use class-aware contrastive learning to refine feature representations across domains. Additionally, we develop batch adaptive normalization to dynamically fine-tune the batch normalization parameters of the teacher model, promoting more stable and reliable learning. Extensive experiments on benchmarks including Waymo, nuScenes and KITTI demonstrate the superiority of CMT over existing state-of-the-art approaches in different adaptation scenarios. Code is available at https://github.com/csj777/CMT.
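For readers unfamiliar with the mean-teacher idea the abstract builds on, the sketch below shows the generic mechanism: the teacher's weights are kept as an exponential moving average (EMA) of the student's weights. It is an illustration of the standard technique only, not the CMT implementation. The function name `ema_update`, the dictionary-of-lists weight layout, and the momentum values are assumptions made for this example.

```python
# Generic mean-teacher EMA update (illustrative; not taken from the CMT code).
# Weights are modelled as a dict mapping a parameter name to a list of floats.

def ema_update(teacher, student, momentum=0.999):
    """Return new teacher weights: momentum * teacher + (1 - momentum) * student."""
    if teacher.keys() != student.keys():
        raise ValueError("teacher and student must share parameter names")
    return {
        name: [
            momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher[name], student[name])
        ]
        for name in teacher
    }


# Hypothetical toy weights; a low momentum is used so the change is visible.
teacher = {"w": [0.0, 1.0]}
student = {"w": [1.0, 1.0]}
teacher = ema_update(teacher, student, momentum=0.9)
```

In a typical mean-teacher setup the student is trained by gradient descent, and the teacher, updated only through this average, produces the pseudo-labels for unlabeled target data. A momentum close to 1 makes the teacher change slowly, which is usually what keeps its pseudo-labels stable.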