Enabling Near-Zero Cost Object Detection in Remote Sensing Imagery Via Progressive Self-Training
Xiang Zhang, Xiangteng Jiang, Qiyao Hu, Hangzai Luo, Sheng Zhong, Lei Tang, Jinye Peng, Jianping Fan
DOI: https://doi.org/10.1109/tgrs.2024.3415002
IF: 8.2
2024-01-01
IEEE Transactions on Geoscience and Remote Sensing
Abstract: Deep learning-based object detection models rely heavily on large-scale, precise annotations for training. However, manually drawing bounding boxes for such data is both time-consuming and costly, especially for high-resolution satellite imagery containing densely packed small objects. To alleviate the burden of manual annotation, we propose a simple yet effective approach, called Progressive Self-Training Object Detection (PSTDet), that enables accurate object detection in remote sensing imagery without relying on manual annotations. Our PSTDet framework consists of two main components: Initial Pseudo Label Generation (IPLG) and Progressive Self-training with Re-Labeling (PST-R). In IPLG, we leverage unsupervised image clustering, unsupervised instance detection, and geometric constraints to automatically generate high-quality bounding-box annotations for the initial training dataset. This approach significantly reduces the time and expense associated with data annotation, laying a solid foundation for the subsequent progressive self-training stage. The annotations produced by IPLG serve as the training data for PST-R, which improves both the detector and the pseudo labels through progressive self-training and our proposed Noisy Pseudo-Label Filtering strategy (NPLFilter). NPLFilter improves the quality of pseudo labels by integrating geometric constraints, prior knowledge, and category-adaptive thresholds. Experimental results demonstrate that our method achieves significant performance improvements on the challenging NWPU VHR-10.v2 and DIOR datasets. Notably, our method far outperforms state-of-the-art weakly-supervised methods and compares favorably with fully-supervised methods.
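The abstract does not specify how NPLFilter combines its criteria, so the following is only a minimal sketch of the general idea of filtering pseudo labels with geometric constraints and per-category thresholds. Every name (`PseudoLabel`, `adaptive_thresholds`, `filter_labels`), the thresholding rule (mean category score clipped to a fixed range), and all constants are illustrative assumptions rather than the authors' method.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PseudoLabel:
    category: str
    score: float   # detector confidence in [0, 1]
    box: tuple     # (x_min, y_min, x_max, y_max) in pixels


def adaptive_thresholds(labels, floor=0.3, ceil=0.8):
    """Assumed rule: one threshold per category, equal to the mean
    confidence of that category's labels, clipped to [floor, ceil]."""
    scores = defaultdict(list)
    for lab in labels:
        scores[lab.category].append(lab.score)
    return {
        cat: min(ceil, max(floor, sum(vals) / len(vals)))
        for cat, vals in scores.items()
    }


def filter_labels(labels, max_aspect=5.0, min_area=16.0):
    """Keep labels that pass assumed geometric checks (positive size,
    minimum area, bounded aspect ratio) and their category threshold."""
    thresholds = adaptive_thresholds(labels)
    kept = []
    for lab in labels:
        x0, y0, x1, y1 = lab.box
        w, h = x1 - x0, y1 - y0
        if w <= 0 or h <= 0:
            continue  # degenerate box
        if w * h < min_area:
            continue  # too small to be a plausible object
        if max(w / h, h / w) > max_aspect:
            continue  # implausibly elongated box
        if lab.score < thresholds[lab.category]:
            continue  # below the category-specific threshold
        kept.append(lab)
    return kept


# Toy data, invented for illustration only.
labels = [
    PseudoLabel("ship", 0.9, (0, 0, 10, 20)),
    PseudoLabel("ship", 0.5, (0, 0, 10, 20)),    # fails score threshold
    PseudoLabel("plane", 0.8, (0, 0, 20, 20)),
    PseudoLabel("plane", 0.4, (0, 0, 20, 20)),   # fails score threshold
    PseudoLabel("tank", 0.95, (0, 0, 100, 5)),   # fails aspect-ratio check
]
kept = filter_labels(labels)
```

In a progressive self-training loop, the surviving labels would presumably be used to retrain the detector, whose new predictions are then filtered again in the next round; that loop is not shown here.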