Focus-and-Detect: A Small Object Detection Framework for Aerial Images

Onur Can Koyun, Reyhan Kevser Keser, İbrahim Batuhan Akkaya, Behçet Uğur Töreyin
DOI: https://doi.org/10.1016/j.image.2022.116675
2022-03-24
Abstract: Despite recent advances, object detection in aerial images is still a challenging task. Specific problems in aerial images make detection harder, such as small objects, densely packed objects, and objects of different sizes and orientations. To address the small object detection problem, we propose a two-stage object detection framework called "Focus-and-Detect". The first stage, which consists of an object detector network supervised by a Gaussian Mixture Model, generates clusters of objects constituting the focused regions. The second stage, which is also an object detector network, predicts objects within the focal regions. An Incomplete Box Suppression (IBS) method is also proposed to overcome the truncation effect of the region search approach. Results indicate that the proposed two-stage framework achieves an AP score of 42.06 on the VisDrone validation dataset, surpassing all other state-of-the-art small object detection methods reported in the literature, to the best of the authors' knowledge.
Computer Vision and Pattern Recognition, Artificial Intelligence
What problem does this paper attempt to address?
### The Problem the Paper Attempts to Solve

This paper addresses small object detection in aerial images. Despite significant progress in object detection in recent years, detecting objects in aerial images remains challenging for the following reasons:

1. **Small Objects**: Objects in aerial images are usually small, making detection difficult.
2. **Densely Arranged Objects**: Objects in aerial images may be densely packed together, increasing the complexity of detection.
3. **Objects of Different Sizes and Orientations**: Objects in aerial images vary greatly in size and orientation, further increasing the difficulty of detection.

To tackle these challenges, the authors propose a two-stage object detection framework called "Focus-and-Detect" (F&D). The main contributions of this framework are as follows:

- **Focus-and-Detect Framework**: This framework is based on a region search method and is divided into two stages. The first stage generates focused regions containing objects through an object detection network supervised by a Gaussian Mixture Model. The second stage performs object detection within these focused regions.
- **Gaussian Mixture Model for Object Clustering**: A Gaussian Mixture Model is used to generate object clusters and normalize the scale of the generated clusters.
- **Incomplete Box Suppression (IBS) Method**: A new method is proposed to suppress incomplete bounding boxes caused by overlapping focused regions.

With these contributions, the F&D framework achieves an AP score of 42.06 on the VisDrone validation dataset and an AP@70 score of 54.16 on the UAVDT test dataset. According to the authors, this surpasses all other state-of-the-art small object detection methods reported in the literature.
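
To make the clustering step more concrete, the following is a minimal sketch of how focal-region targets might be derived from ground-truth boxes with a Gaussian Mixture Model. It is not the authors' implementation. The function names `fit_gmm` and `focal_regions`, the spherical covariance, the fixed number of clusters `k`, and the `margin` padding are all assumptions made for illustration, and the scale normalization mentioned above is omitted.

```python
import math


def fit_gmm(points, k, iters=50):
    """Fit a k-component spherical GMM to 2-D points with plain EM.

    Assumes len(points) >= k. Returns one list of k responsibilities per point.
    """
    n = len(points)
    # Deterministic initialisation: means at evenly spaced positions of the sorted points.
    ordered = sorted(points)
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    spread = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / (2 * n)
    variances = [max(spread, 1e-6)] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iters):
        # E-step, computed in log space to avoid underflow for distant points.
        resp = []
        for x, y in points:
            logs = []
            for (mx, my), v, w in zip(means, variances, weights):
                d2 = (x - mx) ** 2 + (y - my) ** 2
                logs.append(math.log(w) - math.log(2 * math.pi * v) - d2 / (2 * v))
            top = max(logs)
            exps = [math.exp(value - top) for value in logs]
            total = sum(exps)
            resp.append([e / total for e in exps])
        # M-step: re-estimate mean, variance, and weight of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-9:
                continue  # leave an empty component unchanged
            mx = sum(r[j] * x for r, (x, _) in zip(resp, points)) / nj
            my = sum(r[j] * y for r, (_, y) in zip(resp, points)) / nj
            var = sum(
                r[j] * ((x - mx) ** 2 + (y - my) ** 2)
                for r, (x, y) in zip(resp, points)
            ) / (2 * nj)
            means[j] = (mx, my)
            variances[j] = max(var, 1e-6)
            weights[j] = max(nj / n, 1e-12)
    return resp


def focal_regions(boxes, k, margin=10.0):
    """Group (x1, y1, x2, y2) boxes by GMM cluster and return one padded region per cluster.

    The padding rule is a hypothetical choice, not taken from the paper.
    """
    centers = [((x1 + x2) / 2, (y1 + y2) / 2) for x1, y1, x2, y2 in boxes]
    resp = fit_gmm(centers, k)
    # Hard-assign each box to its most responsible component.
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    regions = []
    for j in range(k):
        members = [b for b, label in zip(boxes, labels) if label == j]
        if not members:
            continue
        regions.append((
            min(b[0] for b in members) - margin,
            min(b[1] for b in members) - margin,
            max(b[2] for b in members) + margin,
            max(b[3] for b in members) + margin,
        ))
    return regions


# Two well-separated groups of small boxes should yield two focal regions.
example_boxes = [
    (10, 10, 20, 20), (30, 15, 40, 25), (20, 30, 30, 40),
    (500, 500, 510, 510), (520, 505, 530, 515), (510, 530, 520, 540),
]
example_regions = focal_regions(example_boxes, k=2)
```

In a pipeline of the kind described above, regions like these would serve as training targets for the first-stage detector, and the second-stage detector would then run on crops of the predicted regions. A real implementation would more likely use a library mixture model with full covariances and a data-driven choice of `k`.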