Revisiting Cross-Domain Problem for LiDAR-based 3D Object Detection

Ruixiao Zhang, Juheon Lee, Xiaohao Cai, Adam Prugel-Bennett
2024-08-23
Abstract: Deep learning models such as convolutional neural networks and transformers have been widely applied to solve 3D object detection problems in the domain of autonomous driving. While existing models have achieved outstanding performance on most open benchmarks, the generalization ability of these deep networks is still in doubt. To adapt models to other domains including different cities, countries, and weather, retraining with the target domain data is currently necessary, which hinders the wide application of autonomous driving. In this paper, we deeply analyze the cross-domain performance of the state-of-the-art models. We observe that most models will overfit the training domains and it is challenging to adapt them to other domains directly. Existing domain adaptation methods for 3D object detection problems are actually shifting the models' knowledge domain instead of improving their generalization ability. We then propose additional evaluation metrics -- the side-view and front-view AP -- to better analyze the core issues of the methods' heavy drops in accuracy levels. By using the proposed metrics and further evaluating the cross-domain performance in each dimension, we conclude that the overfitting problem happens more obviously on the front-view surface and the width dimension which usually faces the sensor and has more 3D points surrounding it. Meanwhile, our experiments indicate that the density of the point cloud data also significantly influences the models' cross-domain performance.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper examines why LiDAR-based 3D object detectors perform poorly in cross-domain tasks. Specifically, the authors ask how models can generalize across datasets (for example, different cities, countries, or weather conditions) without retraining. Existing models perform well on the datasets they were trained on, but they typically must be retrained for each new dataset, which limits the practical application of autonomous driving technology. Analyzing several state-of-the-art models, including CNN-based and Transformer-based architectures, the authors find that performance drops significantly in cross-domain tasks. Further analysis shows that models tend to overfit the source dataset, and that existing domain adaptation methods shift the model's knowledge domain rather than improve its generalization ability. The authors also propose additional evaluation metrics, the side-view and front-view Average Precision (AP), to better understand how model performance differs across dimensions. Experimental results show that prediction accuracy is higher in the length direction than in the width direction, but this is because length is usually greater than width, so the absolute errors in the two directions are similar. This indicates that the weakness of existing models in cross-domain detection is not directly caused by the incompleteness of point cloud data but is more related to model structure and training strategies. The authors therefore suggest that future research pay more attention to evaluation methods for cross-domain tasks in order to understand model performance more comprehensively.
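The text above does not give the exact definition of the side-view and front-view AP, so the following is only a minimal sketch of one plausible reading: project each 3D box onto a side plane (length by height) and a front plane (width by height) and compute a 2D IoU on each projection. The `Box` class, the function names, and the simplification that the predicted and ground-truth boxes share the same heading (so the projections are axis-aligned rectangles) are all assumptions made for illustration, not the authors' implementation. The example also illustrates the length-versus-width remark: the same 0.2 m absolute size error costs more IoU on the shorter dimension.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical 3D box in the object's own frame (assumption, not from the paper)."""
    x: float  # center along the length axis
    y: float  # center along the width axis
    z: float  # center along the height axis
    l: float  # length
    w: float  # width
    h: float  # height


def overlap_1d(c1, s1, c2, s2):
    """Overlap length of two intervals, each given by center and size."""
    lo = max(c1 - s1 / 2, c2 - s2 / 2)
    hi = min(c1 + s1 / 2, c2 + s2 / 2)
    return max(0.0, hi - lo)


def rect_iou(c1, s1, c2, s2):
    """IoU of two axis-aligned rectangles given as (center, size) 2-tuples."""
    inter = overlap_1d(c1[0], s1[0], c2[0], s2[0]) * overlap_1d(c1[1], s1[1], c2[1], s2[1])
    union = s1[0] * s1[1] + s2[0] * s2[1] - inter
    return inter / union if union > 0 else 0.0


def side_view_iou(pred, gt):
    """IoU on the side projection: length by height (assumed definition)."""
    return rect_iou((pred.x, pred.z), (pred.l, pred.h), (gt.x, gt.z), (gt.l, gt.h))


def front_view_iou(pred, gt):
    """IoU on the front projection: width by height (assumed definition)."""
    return rect_iou((pred.y, pred.z), (pred.w, pred.h), (gt.y, gt.z), (gt.w, gt.h))


# Illustrative car-sized box; the prediction overestimates length and width by 0.2 m each.
gt = Box(0.0, 0.0, 0.0, l=4.5, w=1.8, h=1.5)
pred = Box(0.0, 0.0, 0.0, l=4.7, w=2.0, h=1.5)

print(f"side-view IoU:  {side_view_iou(pred, gt):.3f}")
print(f"front-view IoU: {front_view_iou(pred, gt):.3f}")
```

With these illustrative numbers the side-view IoU comes out higher than the front-view IoU even though the absolute error is identical, because 0.2 m is a smaller fraction of 4.5 m than of 1.8 m. Turning such per-view IoUs into an AP would additionally require matching predictions to ground truth at a chosen IoU threshold, which this sketch leaves out.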