A Coarse to Fine Network for Fast and Accurate Object Detection in High‐resolution Images

Yaguang Guo, Qi Zou, Lu Jin
DOI: https://doi.org/10.1049/cvi2.12042
IF: 1.484
2021-01-01
IET Computer Vision
Abstract: With the popularisation of high-resolution images, detecting objects in them quickly and accurately has attracted increasing attention in recent studies. Current convolutional neural network (CNN)-based detection methods struggle to detect small objects because of the interference of scale variation. In this work, we propose an improved generic framework based on YOLOv3. Equipped with multiresolution supervision for training and multiresolution aggregation for inference, the method addresses the challenge of scale variation in high-resolution images. First, we move up the multiscale prediction positions and add a dilated convolution module to YOLOv3 to improve detection accuracy, especially for small objects. Then, we present a coarse-to-fine method to reduce detection time. Experiments on the COCO dataset show that our approach achieves 2.8% higher accuracy than the original YOLOv3. On DOTA (Dataset for Object deTection in Aerial images), a high-resolution remote sensing dataset, our approach outperforms YOLOv3 by nearly three percentage points in mean average precision. Moreover, it is up to three times faster and two times smaller than the original YOLOv3.
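The abstract does not describe the layout of the dilated convolution module, so the following is only a general illustration of why dilation is a common tool against scale variation, not the authors' design. It computes the receptive field of a stack of 3x3 convolutions with and without dilation; the function names and the dilation rates 1, 2, 4 are assumptions chosen for the example.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span, in input pixels, covered by one dilated convolution kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers):
    """Receptive field of a stack of (kernel_size, stride, dilation) layers."""
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for kernel_size, stride, dilation in layers:
        rf += (effective_kernel_size(kernel_size, dilation) - 1) * jump
        jump *= stride
    return rf


# Three ordinary 3x3 convolutions, stride 1.
plain = [(3, 1, 1)] * 3
# Same three layers with hypothetical dilation rates 1, 2, 4.
dilated = [(3, 1, 1), (3, 1, 2), (3, 1, 4)]

print(receptive_field(plain))    # 7
print(receptive_field(dilated))  # 15
```

Both stacks have the same number of weights and keep the feature map at full resolution, yet the dilated stack sees roughly twice as much context per output position, which is the usual motivation for adding dilation when small objects must be detected without further downsampling.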