ALPI: Auto-Labeller with Proxy Injection for 3D Object Detection using 2D Labels Only

Saad Lahlali, Nicolas Granger, Hervé Le Borgne, Quoc-Cuong Pham

2024-07-24

Abstract: 3D object detection plays a crucial role in various applications such as autonomous vehicles, robotics and augmented reality. However, training 3D detectors requires costly, precise annotation, which hinders scaling to large datasets. To address this challenge, we propose a weakly supervised 3D annotator that relies solely on 2D bounding box annotations from images, along with size priors. One major problem is that supervising a 3D detection model using only 2D boxes is not reliable due to ambiguities between different 3D poses and their identical 2D projection. We introduce a simple yet effective and generic solution: we build 3D proxy objects with annotations by construction and add them to the training dataset. Our method requires only size priors to adapt to new classes. To better align 2D supervision with 3D detection, our method ensures depth invariance with a novel expression of the 2D losses. Finally, to detect more challenging instances, our annotator follows an offline pseudo-labelling scheme which gradually improves its 3D pseudo-labels. Extensive experiments on the KITTI dataset demonstrate that our method not only performs on par with or above previous works on the Car category, but also achieves performance close to fully supervised methods on more challenging classes. We further demonstrate the effectiveness and robustness of our method by being the first to experiment on the more challenging nuScenes dataset. We additionally propose a setting where weak labels are obtained from a 2D detector pre-trained on MS-COCO instead of human annotations.
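The ambiguity mentioned in the abstract can be illustrated with a small pinhole-camera sketch. It is not taken from the paper: the intrinsics (`focal`, `cu`, `cv`), the box values and all function names are hypothetical, and it shows only one kind of ambiguity, namely that a box twice as large and twice as far away projects to the same 2D box. A class size prior is one natural way to rule out such a case.

```python
import math

def box_corners(center, size, yaw):
    """8 corners of a 3D box in camera coordinates (x right, y down, z forward)."""
    cx, cy, cz = center
    length, height, width = size
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (-length / 2, length / 2):
        for dy in (-height / 2, height / 2):
            for dz in (-width / 2, width / 2):
                # Rotate the offset about the vertical (y) axis, then translate.
                x = c * dx + s * dz
                z = -s * dx + c * dz
                corners.append((cx + x, cy + dy, cz + z))
    return corners

def project_to_2d_box(corners, focal=700.0, cu=600.0, cv=180.0):
    """Axis-aligned 2D box (u_min, v_min, u_max, v_max) of the projected corners.

    The intrinsics are illustrative values, not those of any real dataset.
    """
    us = [focal * x / z + cu for x, _, z in corners]
    vs = [focal * y / z + cv for _, y, z in corners]
    return (min(us), min(vs), max(us), max(vs))

def scale_box(center, size, k):
    """Scale both the position and the dimensions of a box by k."""
    return tuple(k * v for v in center), tuple(k * v for v in size)

# A box at 10 m, and the same box made twice as large and placed twice as far.
near_center, near_size = (2.0, 1.0, 10.0), (4.0, 1.5, 1.8)
far_center, far_size = scale_box(near_center, near_size, 2.0)

box_near = project_to_2d_box(box_corners(near_center, near_size, yaw=0.3))
box_far = project_to_2d_box(box_corners(far_center, far_size, yaw=0.3))
print(box_near)
print(box_far)
```

Both printed boxes coincide, so a loss computed only on the 2D box cannot tell the two 3D hypotheses apart.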

Computer Vision and Pattern Recognition, Artificial Intelligence

What problem does this paper attempt to address?

This paper addresses how to train a 3D object detector using only 2D bounding box annotations, thereby avoiding the high cost of 3D annotation. It proposes ALPI, a method that generates 3D pseudo-labels without any 3D annotations by injecting proxy objects into the training data, and then uses those pseudo-labels to train a 3D detector. ALPI targets two main limitations of existing weakly supervised 3D detection methods:

1. **Multi-class adaptability**: Most existing methods are either semi-weakly supervised, requiring a small amount of 3D annotation, or specific to one object class (such as cars) and difficult to extend to others.
2. **No 3D annotations at all**: ALPI needs only 2D bounding boxes and per-class size priors to train the model.

With these improvements, ALPI performs on par with or better than previous methods on the Car class of the KITTI dataset, and comes close to fully supervised methods on the Pedestrian and Cyclist classes. The method is also the first to be evaluated on the more challenging nuScenes dataset, which further supports its effectiveness and robustness.
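The idea of a proxy object whose 3D annotation is known by construction can be sketched as follows. The text here does not describe how the paper actually builds its proxies, so everything below is an assumption made for illustration: the function `make_proxy`, the values in `SIZE_PRIORS`, the box-shaped point sampling, and the LiDAR-style frame (x forward, y left, z up, ground at z = 0).

```python
import math
import random

# Hypothetical size priors (length, width, height) in metres; illustrative only.
SIZE_PRIORS = {"Car": (3.9, 1.6, 1.5), "Pedestrian": (0.8, 0.6, 1.7)}

def make_proxy(cls, rng, n_points=200, jitter=0.1):
    """Return (points, label) for a box-shaped proxy object of class `cls`.

    The 3D label is exact because the points are generated from it.
    """
    # Perturb the class prior so that proxies are not all the same size.
    dims = [d * (1.0 + rng.uniform(-jitter, jitter)) for d in SIZE_PRIORS[cls]]
    length, width, height = dims
    x, y = rng.uniform(5.0, 40.0), rng.uniform(-10.0, 10.0)  # position on the ground
    yaw = rng.uniform(-math.pi, math.pi)
    label = {"class": cls, "center": (x, y, height / 2),
             "size": (length, width, height), "yaw": yaw}

    c, s = math.cos(yaw), math.sin(yaw)
    points = []
    for _ in range(n_points):
        # Sample a point inside the box, then snap one coordinate to a face.
        p = [rng.uniform(-d / 2, d / 2) for d in dims]
        axis = rng.randrange(3)
        p[axis] = rng.choice((-1.0, 1.0)) * dims[axis] / 2
        # Rotate about the vertical axis and move to the proxy position.
        px = c * p[0] - s * p[1] + x
        py = s * p[0] + c * p[1] + y
        points.append((px, py, p[2] + height / 2))
    return points, label

rng = random.Random(0)
points, label = make_proxy("Car", rng)
print(len(points), label["class"], label["size"])
```

In a pipeline of this kind, the generated points would be appended to a real scene and the label added to that scene's training targets, so that the detector receives some full 3D supervision even though no real object has a 3D annotation.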

Towards A Weakly Supervised Framework for 3D Point Cloud Object Detection and Annotation

Semi-Supervised 3D Object Detection Via Adaptive Pseudo-Labeling

Autolabeling 3D Objects With Differentiable Rendering of SDF Shape Priors

An Empirical Study of Pseudo-Labeling for Image-based 3D Object Detection

View-to-Label: Multi-View Consistency for Self-Supervised 3D Object Detection

Are Dense Labels Always Necessary for 3D Object Detection from Point Cloud?

A weakly supervised method for 3D object detection with partially annotated samples

Segment, Lift and Fit: Automatic 3D Shape Labeling from 2D Prompts

ST3D++: Denoised Self-Training for Unsupervised Domain Adaptation on 3D Object Detection

Back to Reality: Learning Data-Efficient 3D Object Detector with Shape Guidance

Weakly Supervised 3D Object Detection via Multi-Level Visual Guidance

Shelf-Supervised Cross-Modal Pre-Training for 3D Object Detection

Enhancing Pseudo Label Quality for Pedestrian and Cyclist in Weakly Supervised 3D Object Detection

SS3D: Sparsely-Supervised 3D Object Detection from Point Cloud

General Geometry-aware Weakly Supervised 3D Object Detection

PE-MCAT: Leveraging Image Sensor Fusion and Adaptive Thresholds for Semi-Supervised 3D Object Detection

STAL3D: Unsupervised Domain Adaptation for 3D Object Detection via Collaborating Self-Training and Adversarial Learning

Eliminating Spatial Ambiguity for Weakly Supervised 3D Object Detection Without Spatial Labels

Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection

Towards 3D Object Detection with 2D Supervision