SparseDet: Improving Sparsely Annotated Object Detection with Pseudo-positive Mining

Saksham Suri, Sai Saketh Rambhatla, Rama Chellappa, Abhinav Shrivastava
2023-08-27
Abstract: Training with sparse annotations is known to reduce the performance of object detectors. Previous methods have focused on proxies for missing ground truth annotations in the form of pseudo-labels for unlabeled boxes. We observe that existing methods suffer at higher levels of sparsity in the data due to noisy pseudo-labels. To prevent this, we propose an end-to-end system that learns to separate the proposals into labeled and unlabeled regions using Pseudo-positive mining. While the labeled regions are processed as usual, self-supervised learning is used to process the unlabeled regions thereby preventing the negative effects of noisy pseudo-labels. This novel approach has multiple advantages such as improved robustness to higher sparsity when compared to existing methods. We conduct exhaustive experiments on five splits on the PASCAL-VOC and COCO datasets achieving state-of-the-art performance. We also unify various splits used across literature for this task and present a standardized benchmark. On average, we improve by $2.6$, $3.9$ and $9.6$ mAP over previous state-of-the-art methods on three splits of increasing sparsity on COCO. Our project is publicly available at https://www.cs.umd.edu/~sakshams/SparseDet.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems the Paper Attempts to Solve

The paper "SparseDet: Improving Sparsely Annotated Object Detection with Pseudo-positive Mining" addresses sparsely annotated object detection (SAOD). When the annotations in the training data are incomplete, the performance of object detectors degrades significantly. Existing methods typically generate pseudo-labels to compensate for the missing annotations, but these pseudo-labels become noisy when the data is highly sparse, which leads to suboptimal performance.

### Main Contributions

1. **Proposing the SparseDet Framework**:
   - SparseDet is an end-to-end SAOD framework that categorizes region proposals into annotated, unannotated, and background regions and processes each group separately.
   - It handles unannotated regions through self-supervised learning, so the classifier is not penalized for treating unannotated objects as negatives.
2. **Achieving State-of-the-Art Performance on Multiple Benchmarks**:
   - Extensive experiments were conducted on five different splits of the COCO and PASCAL-VOC datasets, with average improvements of 2.6, 3.9, and 9.6 mAP over previous state-of-the-art methods on three COCO splits of increasing sparsity.
3. **Standardizing Evaluation Settings**:
   - The evaluation settings for SAOD methods were unified, and a new benchmark was proposed to assess semi-supervised learning capabilities, i.e., improving performance using unannotated data.
   - All SAOD data splits were made public to facilitate replication and comparison in future research.

### Method Overview

1. **Feature Extraction**:
   - A backbone network extracts features from the original image and from an augmented version of it.
2. **Common Region Proposal Network (C-RPN)**:
   - Features from the two views are concatenated to generate region proposals.
   - C-RPN is trained using binary cross-entropy loss and smooth L1 loss.
3. **Pseudo-Positive Mining (PPM)**:
   - Region proposals are categorized into annotated, unannotated, and background regions by applying thresholds.
   - This prevents unannotated objects from being misjudged as background merely because their annotations are missing.
4. **Loss Function**:
   - Supervised loss is used for annotated and background regions.
   - Self-supervised loss is used for unannotated regions to enforce feature consistency.

### Experimental Results

- **COCO Dataset**:
  - SparseDet outperforms existing methods across different sparsity levels, especially in high-sparsity scenarios.
  - For example, at 70% sparsity in Split-1, SparseDet improves by 7.75 mAP over Co-mining.
- **PASCAL-VOC Dataset**:
  - SparseDet also performs well across the different splits, particularly in high-sparsity scenarios.
  - For instance, in the extreme setting of Split-4, SparseDet improves by 5.99 percentage points over Co-mining.

### Conclusion

The paper addresses sparsely annotated object detection by proposing the SparseDet framework, which remains robust on highly sparse data. The authors also standardize the evaluation settings, providing a common reference point for future research.
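
To make the Pseudo-Positive Mining and loss-routing steps from the Method Overview more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The summary only says that proposals are split "by setting thresholds", so the specific rule used here (IoU with a ground-truth box for annotated regions, a proposal objectness score for unannotated regions), the threshold values, and the names `iou`, `mine_pseudo_positives`, and `consistency_loss` are all assumptions made for illustration. The cosine-distance consistency term is likewise a stand-in for whatever self-supervised loss the paper uses.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mine_pseudo_positives(
    proposals: Sequence[Box],
    objectness: Sequence[float],
    gt_boxes: Sequence[Box],
    iou_thresh: float = 0.5,    # hypothetical value
    score_thresh: float = 0.9,  # hypothetical value
) -> List[str]:
    """Assign each proposal to 'annotated', 'unannotated', or 'background'.

    Assumed rule: a proposal that overlaps an available ground-truth box is
    annotated; one that does not, but still looks object-like, is treated as
    an unannotated pseudo-positive; everything else is background.
    """
    groups = []
    for box, score in zip(proposals, objectness):
        best_iou = max((iou(box, gt) for gt in gt_boxes), default=0.0)
        if best_iou >= iou_thresh:
            groups.append("annotated")
        elif score >= score_thresh:
            groups.append("unannotated")
        else:
            groups.append("background")
    return groups


def consistency_loss(feat_a: Sequence[float], feat_b: Sequence[float]) -> float:
    """Cosine distance between features of the same region in two views.

    Stand-in for the self-supervised loss applied to unannotated regions.
    """
    dot = sum(x * y for x, y in zip(feat_a, feat_b))
    norm_a = math.sqrt(sum(x * x for x in feat_a))
    norm_b = math.sqrt(sum(y * y for y in feat_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


# Toy example: one annotated object, one object whose annotation is missing,
# and one low-scoring background proposal.
gt_boxes = [(10.0, 10.0, 50.0, 50.0)]
proposals = [(12.0, 12.0, 50.0, 50.0), (100.0, 100.0, 140.0, 140.0), (200.0, 200.0, 210.0, 210.0)]
objectness = [0.95, 0.97, 0.10]

groups = mine_pseudo_positives(proposals, objectness, gt_boxes)
print(groups)  # ['annotated', 'unannotated', 'background']
```

In a training loop built on this sketch, proposals tagged `annotated` or `background` would feed the usual supervised classification and box-regression losses, while those tagged `unannotated` would contribute only the consistency term, so a missing annotation never pushes the classifier toward the background class.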