Semi-Supervised Object Detection: A Survey on Progress from CNN to Transformer

Tahira Shehzadi, Ifza, Didier Stricker, Muhammad Zeshan Afzal
2024-07-16
Abstract: The impressive advancements in semi-supervised learning have driven researchers to explore its potential in object detection tasks within the field of computer vision. Semi-Supervised Object Detection (SSOD) leverages a combination of a small labeled dataset and a larger, unlabeled dataset. This approach effectively reduces the dependence on large labeled datasets, which are often expensive and time-consuming to obtain. Initially, SSOD models encountered challenges in effectively leveraging unlabeled data and managing noise in generated pseudo-labels for unlabeled data. However, numerous recent advancements have addressed these issues, resulting in substantial improvements in SSOD performance. This paper presents a comprehensive review of 27 cutting-edge developments in SSOD methodologies, from Convolutional Neural Networks (CNNs) to Transformers. We delve into the core components of semi-supervised learning and its integration into object detection frameworks, covering data augmentation techniques, pseudo-labeling strategies, consistency regularization, and adversarial training methods. Furthermore, we conduct a comparative analysis of various SSOD models, evaluating their performance and architectural differences. We aim to ignite further research interest in overcoming existing challenges and exploring new directions in semi-supervised learning for object detection.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper is a survey of Semi-Supervised Object Detection (SSOD). Specifically, it addresses the following:

1. **Reducing dependence on large amounts of labeled data**: SSOD combines a small labeled dataset with a larger unlabeled one, which reduces the need for annotations that are expensive and time-consuming to obtain.
2. **Improving SSOD performance**: Early SSOD models struggled to use unlabeled data effectively and to manage noise in the pseudo-labels generated for it. The paper reviews the recent methods that address these challenges and have substantially improved SSOD performance.
3. **Reviewing the latest advancements**: The paper comprehensively reviews 27 cutting-edge SSOD methods, from Convolutional Neural Networks (CNNs) to Transformer architectures, covering their core components, technical strategies (data augmentation, pseudo-labeling, consistency regularization, and adversarial training), and architectural differences, along with a comparative analysis of their performance.
4. **Promoting further research**: It aims to inspire more research interest in overcoming existing challenges and exploring new directions in SSOD.

Through these efforts, the paper hopes to advance SSOD technology, enabling its broader application in fields such as autonomous driving, medical image analysis, agriculture, and manufacturing.