Abstract: Training with sparse annotations is known to reduce the performance of object detectors. Previous methods have focused on proxies for missing ground truth annotations in the form of pseudo-labels for unlabeled boxes. We observe that existing methods suffer at higher levels of sparsity in the data due to noisy pseudo-labels. To prevent this, we propose an end-to-end system that learns to separate the proposals into labeled and unlabeled regions using Pseudo-positive mining. While the labeled regions are processed as usual, self-supervised learning is used to process the unlabeled regions thereby preventing the negative effects of noisy pseudo-labels. This novel approach has multiple advantages such as improved robustness to higher sparsity when compared to existing methods. We conduct exhaustive experiments on five splits on the PASCAL-VOC and COCO datasets achieving state-of-the-art performance. We also unify various splits used across literature for this task and present a standardized benchmark. On average, we improve by $2.6$, $3.9$ and $9.6$ mAP over previous state-of-the-art methods on three splits of increasing sparsity on COCO. Our project is publicly available at https://www.cs.umd.edu/~sakshams/SparseDet.

What problem does this paper attempt to address?

### Problems the Paper Attempts to Solve

The paper "SparseDet: Improving Sparsely Annotated Object Detection with Pseudo-positive Mining" addresses sparsely annotated object detection (SAOD). When the annotations in the training data are incomplete, the performance of object detectors degrades significantly. Existing methods typically generate pseudo-labels to compensate for the missing annotations, but these pseudo-labels become noisy when the data is highly sparse, which leads to suboptimal performance.

### Main Contributions

1. **Proposing the SparseDet Framework**:
   - SparseDet is an end-to-end SAOD framework that categorizes region proposals into annotated, unannotated, and background, and processes each group separately.
   - It handles unannotated regions through self-supervised learning, which avoids penalizing the classifier for incorrect negative samples.
2. **Achieving State-of-the-Art Performance on Multiple Benchmarks**:
   - Extensive experiments were conducted on five different splits of the COCO and PASCAL-VOC datasets, with average improvements of 2.6, 3.9, and 9.6 mAP on three COCO splits of increasing sparsity.
3. **Standardizing Evaluation Settings**:
   - The evaluation settings for SAOD methods were unified, and a new benchmark was proposed to assess semi-supervised learning capabilities, i.e., improving performance using unannotated data.
   - All SAOD data splits were made public to facilitate replication and comparison in future research.

### Method Overview

1. **Feature Extraction**:
   - A backbone network extracts features from the original image and its augmented version.
2. **Common Region Proposal Network (C-RPN)**:
   - Features from the two views are concatenated to generate region proposals.
   - C-RPN is trained using binary cross-entropy loss and smooth L1 loss.
3. **Pseudo-Positive Mining (PPM)**:
   - Region proposals are categorized into annotated, unannotated, and background by setting thresholds.
   - This avoids treating unannotated objects as background merely because their annotations are missing.
4. **Loss Function**:
   - Supervised loss is used for annotated and background regions.
   - Self-supervised loss is used for unannotated regions to ensure feature consistency.

### Experimental Results

- **COCO Dataset**:
  - SparseDet outperforms existing methods across different sparsity levels, especially in high-sparsity scenarios.
  - For example, at 70% sparsity in Split-1, SparseDet improves by 7.75 mAP over Co-mining.
- **PASCAL-VOC Dataset**:
  - SparseDet also performs strongly across different splits, particularly in high-sparsity scenarios.
  - For instance, in the extreme setting of Split-4, SparseDet improves by 5.99 percentage points over Co-mining.

### Conclusion

The paper addresses sparsely annotated object detection by proposing the SparseDet framework, which is most effective on highly sparse data. The authors also standardize the evaluation settings, providing a common reference for future research.
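The Pseudo-Positive Mining and loss steps in the Method Overview can be illustrated with a minimal sketch. The summary only says that proposals are split "by setting thresholds" and that unannotated regions are trained for "feature consistency", so everything concrete here is an assumption made for illustration: the use of IoU against the available ground truth to find annotated regions, the use of an objectness score to separate unannotated foreground from background, the threshold values `iou_thr=0.5` and `score_thr=0.9`, the cosine-based consistency measure, and the function names `partition_proposals` and `consistency_loss`. It is not the authors' implementation.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def partition_proposals(proposals, scores, gt_boxes, iou_thr=0.5, score_thr=0.9):
    """Split proposal indices into annotated / unannotated / background.

    Assumed rule (hypothetical thresholds):
      - overlaps an available ground-truth box  -> annotated
      - no overlap but high objectness score    -> unannotated (pseudo-positive)
      - otherwise                               -> background
    """
    groups = {"annotated": [], "unannotated": [], "background": []}
    for idx, (box, score) in enumerate(zip(proposals, scores)):
        best_iou = max((iou(box, gt) for gt in gt_boxes), default=0.0)
        if best_iou >= iou_thr:
            groups["annotated"].append(idx)
        elif score >= score_thr:
            groups["unannotated"].append(idx)
        else:
            groups["background"].append(idx)
    return groups


def consistency_loss(feat_a, feat_b):
    """One minus cosine similarity between the features of the same region
    in two views; 0 when the features point in the same direction."""
    dot = sum(x * y for x, y in zip(feat_a, feat_b))
    norm_a = math.sqrt(sum(x * x for x in feat_a))
    norm_b = math.sqrt(sum(y * y for y in feat_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat an all-zero feature as maximally inconsistent
    return 1.0 - dot / (norm_a * norm_b)


# Toy example: one annotated object, one object whose annotation is missing,
# and one low-score proposal.
proposals = [(10, 10, 50, 50), (100, 100, 160, 160), (200, 200, 220, 220)]
scores = [0.95, 0.97, 0.10]
gt_boxes = [(12, 12, 50, 50)]

groups = partition_proposals(proposals, scores, gt_boxes)
print(groups)  # {'annotated': [0], 'unannotated': [1], 'background': [2]}
```

In this sketch, only the indices in `annotated` and `background` would feed a supervised detection loss, while the indices in `unannotated` would contribute a term such as `consistency_loss` computed between the two views, so a missing annotation is never used as a negative label.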

SparseDet: Improving Sparsely Annotated Object Detection with Pseudo-positive Mining
