YOLC: You Only Look Clusters for Tiny Object Detection in Aerial Images

Chenguang Liu, Guangshuai Gao, Ziyue Huang, Zhenghui Hu, Qingjie Liu, Yunhong Wang
2024-06-17
Abstract: Detecting objects from aerial images poses significant challenges due to the following factors: 1) Aerial images typically have very large sizes, generally with millions or even hundreds of millions of pixels, while computational resources are limited. 2) Small object size leads to insufficient information for effective detection. 3) Non-uniform object distribution leads to computational resource wastage. To address these issues, we propose YOLC (You Only Look Clusters), an efficient and effective framework that builds on an anchor-free object detector, CenterNet. To overcome the challenges posed by large-scale images and non-uniform object distribution, we introduce a Local Scale Module (LSM) that adaptively searches cluster regions for zooming in for accurate detection. Additionally, we modify the regression loss using Gaussian Wasserstein distance (GWD) to obtain high-quality bounding boxes. Deformable convolution and refinement methods are employed in the detection head to enhance the detection of small objects. We perform extensive experiments on two aerial image datasets, including Visdrone2019 and UAVDT, to demonstrate the effectiveness and superiority of our proposed approach. Code is available at https://github.com/dawn-ech/YOLC.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
### Problems Addressed by the Paper

The paper addresses small object detection in aerial images and proposes an efficient and effective framework called YOLC (You Only Look Clusters). Specifically, it targets three problems:

1. **Large-scale Image Processing**:
   - Aerial images are usually very large (millions or even hundreds of millions of pixels), while the computational resources available for detection are limited.
   - As a result, images must be downscaled or split into smaller chunks before detection.
2. **Insufficient Information for Small Objects**:
   - Small objects are common in aerial images, but their low resolution and weak visual features give detectors too little information to recognize them reliably.
   - Small objects are typically defined as those with an area smaller than 32×32 pixels.
3. **Non-uniform Distribution**:
   - Objects in aerial images are unevenly distributed, so background areas consume a large share of computation while contributing very little to detection.
   - Dense object regions should receive more attention, while background areas should be skipped.

To address these problems, the paper builds YOLC on an anchor-free detector (CenterNet) and introduces a Local Scale Module (LSM) that adaptively searches for cluster regions and zooms in on them, improving both detection accuracy and efficiency. The paper also modifies the regression loss (GWD+L1 loss) to obtain higher-quality bounding boxes and uses deformable convolution in the detection head to improve small object detection.

Extensive experiments on two public aerial image datasets (Visdrone2019 and UAVDT) demonstrate the effectiveness and superiority of the YOLC method.
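
To make the GWD+L1 regression loss mentioned above more concrete, here is a minimal Python sketch. It assumes axis-aligned boxes given as `(cx, cy, w, h)` and models each box as a 2-D Gaussian with mean `(cx, cy)` and covariance `diag(w²/4, h²/4)`, which is the usual GWD construction. Under that assumption the squared Wasserstein distance has a simple closed form. The function names, the `log1p`-based normalization, and the `l1_weight` parameter are illustrative assumptions and are not taken from the paper or its code.

```python
import math
from typing import Tuple

# Axis-aligned box as (cx, cy, w, h). This layout is an assumption of the sketch.
Box = Tuple[float, float, float, float]


def gwd_squared(a: Box, b: Box) -> float:
    """Squared 2-Wasserstein distance between the Gaussians of two boxes.

    Each box is modelled as a Gaussian with mean (cx, cy) and covariance
    diag(w^2 / 4, h^2 / 4). For axis-aligned boxes the covariances commute,
    so the distance reduces to a centre term plus a shape term.
    """
    (ax, ay, aw, ah), (bx, by, bw, bh) = a, b
    center_term = (ax - bx) ** 2 + (ay - by) ** 2
    shape_term = ((aw - bw) ** 2 + (ah - bh) ** 2) / 4.0
    return center_term + shape_term


def gwd_l1_loss(pred: Box, target: Box, l1_weight: float = 1.0) -> float:
    """Hypothetical combination of a normalized GWD term and an L1 term."""
    d2 = gwd_squared(pred, target)
    # Assumed normalization: squash the unbounded distance into [0, 1).
    gwd_term = 1.0 - 1.0 / (1.0 + math.log1p(d2))
    # Plain L1 distance over the four box parameters.
    l1_term = sum(abs(p - t) for p, t in zip(pred, target))
    return gwd_term + l1_weight * l1_term


if __name__ == "__main__":
    predicted = (10.0, 10.0, 8.0, 6.0)
    ground_truth = (12.0, 11.0, 10.0, 6.0)
    print(gwd_squared(predicted, ground_truth))
    print(gwd_l1_loss(predicted, ground_truth))
```

One reason a distance of this kind suits tiny objects is that it varies smoothly with centre offset and size difference even when two boxes do not overlap, whereas an IoU-based loss is flat in that case.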