Open-CRB: Towards Open World Active Learning for 3D Object Detection

Zhuoxiao Chen, Yadan Luo, Zixin Wang, Zijian Wang, Xin Yu, Zi Huang
2024-09-23
Abstract: LiDAR-based 3D object detection has recently seen significant advancements through active learning (AL), attaining satisfactory performance by training on a small fraction of strategically selected point clouds. However, in real-world deployments where streaming point clouds may include unknown or novel objects, the ability of current AL methods to capture such objects remains unexplored. This paper investigates a more practical and challenging research task: Open World Active Learning for 3D Object Detection (OWAL-3D), aimed at acquiring informative point clouds with new concepts. To tackle this challenge, we propose a simple yet effective strategy called Open Label Conciseness (OLC), which mines novel 3D objects with minimal annotation costs. Our empirical results show that OLC successfully adapts the 3D detection model to the open world scenario with just a single round of selection. Any generic AL policy can then be integrated with the proposed OLC to efficiently address the OWAL-3D problem. Based on this, we introduce the Open-CRB framework, which seamlessly integrates OLC with our preliminary AL method, CRB, designed specifically for 3D object detection. We develop a comprehensive codebase for easy reproducing and future research, supporting 15 baseline methods (i.e., active learning, out-of-distribution detection and open world detection), 2 types of modern 3D detectors (i.e., one-stage SECOND and two-stage PV-RCNN) and 3 benchmark 3D datasets (i.e., KITTI, nuScenes and Waymo). Extensive experiments evidence that the proposed Open-CRB demonstrates superiority and flexibility in recognizing both novel and known classes with very limited labeling costs, compared to state-of-the-art baselines. Source code is available at https://github.com/Luoyadan/CRB-active-3Ddet/tree/Open-CRB.
Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning
What problem does this paper attempt to address?
### Problem the Paper Attempts to Solve

The paper addresses a more practical and challenging research task: Open World Active Learning for 3D Object Detection (OWAL-3D). The question is how active learning can efficiently identify and label point clouds that contain unknown or novel object categories in real-world deployments. Existing active learning methods typically assume that the unlabeled data shares the same category set as the training data. That assumption often fails in practice, because streaming point clouds may contain categories the model has never seen. The goal of OWAL-3D is therefore to adapt 3D detection models to open-world environments so that they recognize both known and novel categories at minimal annotation cost.

### Solution

To tackle this challenge, the paper proposes a simple yet effective strategy called Open Label Conciseness (OLC). OLC estimates the likelihood of unknown object labels in each point cloud by aggregating the uncertainties of all predicted bounding boxes. It then computes the entropy of the predicted label distribution, extended with the unknown label, and uses that entropy as the OLC score. The score balances the selection of known and unknown categories, so that the selected point clouds contain both known labels and potential new concepts.

Based on OLC, the paper introduces the Open-CRB framework, which integrates OLC with the authors' earlier active learning method, CRB, designed specifically for 3D object detection. According to the abstract, a single round of OLC selection is enough to adapt the detector to the open-world scenario, after which any generic AL policy (here, CRB) can continue the selection.
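
The scoring idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the summary does not give the exact formula, so the sketch assumes that each box's uncertainty is the normalized entropy of its class probabilities, that this uncertainty is counted as mass for an extra "unknown" label, and that the remaining mass goes to the box's predicted known class. The names `box_uncertainty` and `olc_style_score` are hypothetical.

```python
import math
from typing import List, Sequence


def box_uncertainty(probs: Sequence[float]) -> float:
    """Normalized Shannon entropy of one box's class probabilities, in [0, 1].

    Assumption: this stands in for whatever per-box uncertainty the paper uses.
    """
    k = len(probs)
    if k < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    # Clamp to guard against floating-point overshoot above 1.0.
    return min(1.0, h / math.log(k))


def olc_style_score(box_probs: List[Sequence[float]]) -> float:
    """Entropy of a (K+1)-way label distribution for one point cloud.

    The first K entries are known classes; the last entry is 'unknown'.
    Each predicted box contributes (1 - u) to its argmax class and u to
    'unknown', where u is the box uncertainty (an illustrative assumption).
    """
    if not box_probs:
        return 0.0
    k = len(box_probs[0])
    mass = [0.0] * (k + 1)
    for probs in box_probs:
        u = box_uncertainty(probs)
        pred = max(range(k), key=lambda i: probs[i])
        mass[pred] += 1.0 - u
        mass[k] += u
    total = sum(mass)  # equals the number of boxes
    dist = [m / total for m in mass]
    return -sum(p * math.log(p) for p in dist if p > 0.0)


if __name__ == "__main__":
    confident_scene = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
    mixed_scene = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1 / 3, 1 / 3, 1 / 3]]
    # Rank scenes so that the higher-scoring one would be sent for annotation.
    ranked = sorted(
        [("confident", confident_scene), ("mixed", mixed_scene)],
        key=lambda item: olc_style_score(item[1]),
        reverse=True,
    )
    print([name for name, _ in ranked])
```

Under these assumptions, a scene whose boxes are all confidently one known class scores zero, while a scene that mixes several known classes with a highly uncertain box scores higher and would be preferred for labeling.
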
### Experimental Results

The experiments show that Open-CRB identifies both known and unknown categories on multiple benchmark datasets at very limited annotation cost. On the nuScenes dataset in particular, using only 50k annotated 3D bounding boxes, Open-CRB achieves a 12.1% mAP improvement over the best baseline method.

### Conclusion

By introducing OLC and the Open-CRB framework, the paper addresses active learning for 3D object detection in open-world scenarios and offers an annotation-efficient way to handle novel objects in real-world environments.