Abstract: Weakly supervised 3D object detection for autonomous driving has focused primarily on cars because of their distinct rectangular boundaries and abundant instances. However, detecting categories with ambiguous rectangular boundaries and fewer instances than cars, such as pedestrians and cyclists, remains challenging and has received limited research attention. Ambiguous rectangular boundaries make it difficult to generate accurate 3D pseudo labels, while the scarcity of instances often leads to convergence issues during detector training. Pedestrians and cyclists are dense inside their 3D bounding boxes but sparse at the corners and boundaries, so density is a practical cue for locating and discriminating them in point clouds. This paper proposes a density-based 3D pseudo-label generation module (DPL-3D) to address the challenge of ambiguous rectangular boundaries, which otherwise lead to poor pseudo-label quality. By leveraging the density information of 3D points, DPL-3D improves the accuracy and localization quality of the generated pseudo labels. It effectively segments out background points, improving the estimation of the pseudo labels’ location, dimension, and orientation. Having few training samples tends to drive the detector into local optima. Introducing multi-modal data into the detector network could strengthen the constraints on object features, but 2D images and 3D point clouds have a resolution gap. Our motivation for addressing this gap is that neighboring regions with similar colors and textures in a 2D image are likely to be spatially close in 3D space. Therefore, a multi-modal network driven by superpixel segmentation is introduced. This network enables effective discrimination between objects in 2D images and 3D point clouds, bridging the resolution gap and leveraging complementary features from both modalities. Experimental results on the KITTI dataset demonstrate the effectiveness of the proposed methods in addressing the challenges of weakly supervised 3D object detection, particularly for categories with ambiguous rectangular boundaries and few instances.
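
The density cue described in the abstract can be illustrated with a toy example. The sketch below is not DPL-3D, whose details the abstract does not give; it is a plain neighbor-count heuristic in NumPy, and the function names (`local_density`, `keep_dense_points`, `axis_aligned_box`), the radius, and the neighbor threshold are all assumptions chosen for illustration. It keeps only points with enough neighbors and fits an axis-aligned box to what remains, without estimating orientation.

```python
import numpy as np


def local_density(points: np.ndarray, radius: float) -> np.ndarray:
    """Count, for each point, how many other points lie within `radius`."""
    # Pairwise Euclidean distances; O(N^2) memory, so only suitable for
    # small point sets such as the points inside one 2D-box frustum.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Subtract 1 so that a point does not count itself.
    return (dist <= radius).sum(axis=1) - 1


def keep_dense_points(points: np.ndarray, radius: float = 0.3,
                      min_neighbors: int = 5) -> np.ndarray:
    """Drop sparse points (assumed background), keep dense ones."""
    density = local_density(points, radius)
    return points[density >= min_neighbors]


def axis_aligned_box(points: np.ndarray):
    """Return (center, size) of the tightest axis-aligned box."""
    lo, hi = points.min(axis=0), points.max(axis=0)
    return (lo + hi) / 2.0, hi - lo


# Synthetic scene (hypothetical data): one compact object plus sparse clutter.
rng = np.random.default_rng(0)
obj = rng.normal(loc=[10.0, 2.0, 0.9], scale=0.1, size=(200, 3))
clutter = rng.uniform(low=[20.0, -10.0, 0.0], high=[40.0, 10.0, 3.0],
                      size=(50, 3))
cloud = np.vstack([obj, clutter])

dense = keep_dense_points(cloud)
center, size = axis_aligned_box(dense)
```

Without the density filter, the box around `cloud` would stretch across the whole scene; after filtering, it should collapse onto the compact cluster. A real pipeline would additionally need orientation estimation and a neighbor search that scales better than the quadratic distance matrix used here.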

Eliminating Spatial Ambiguity for Weakly Supervised 3D Object Detection Without Spatial Labels

Spatial Likelihood Voting with Self-Knowledge Distillation for Weakly Supervised Object Detection

Weakly Supervised Monocular 3D Object Detection by Spatial-Temporal View Consistency

Weakly Supervised 3D Object Detection from Point Clouds

Weakly Supervised Monocular 3D Object Detection Using Multi-View Projection and Direction Consistency

Weakly Supervised 3D Object Detection via Multi-Level Visual Guidance

SPV-SSD: An Anchor-Free 3D Single-Stage Detector with Supervised-PointRendering and Visibility Representation

Towards A Weakly Supervised Framework for 3D Point Cloud Object Detection and Annotation

Point-DETR3D: Leveraging Imagery Data with Spatial Point Prior for Weakly Semi-supervised 3D Object Detection

General Geometry-aware Weakly Supervised 3D Object Detection

WeakM3D: Towards Weakly Supervised Monocular 3D Object Detection

Back to Reality: Learning Data-Efficient 3D Object Detector with Shape Guidance

Back to Reality: Weakly-supervised 3D Object Detection with Shape-guided Label Enhancement

SS3D: Sparsely-Supervised 3D Object Detection from Point Cloud

A weakly supervised method for 3D object detection with partially annotated samples

Enhancing Pseudo Label Quality for Pedestrian and Cyclist in Weakly Supervised 3D Object Detection

VSRD: Instance-Aware Volumetric Silhouette Rendering for Weakly Supervised 3D Object Detection

Weakly Supervised Monocular 3D Detection with a Single-View Image

Are Dense Labels Always Necessary for 3D Object Detection from Point Cloud?

Exploiting Label Uncertainty for Enhanced 3D Object Detection From Point Clouds

3D Guided Weakly Supervised Semantic Segmentation