Exploiting Cross-scale Consistency for Object Detection in Aerial Images

Jingtao Xu, Yali Li, Shengjin Wang
DOI: https://doi.org/10.1109/cvidliccea56201.2022.9824076
2022-01-01
Abstract: It is challenging for convolutional neural networks (CNNs) to handle aerial images with extremely small objects. At each layer of a CNN, the limited receptive field creates a conflict between learning detailed features of small objects and global features of large objects. Moreover, it is difficult to learn a feature representation for small objects that occupy only a few pixels. In this letter, we exploit cross-scale consistency to enhance the feature representation of objects at various scales. We devise a double-stream training pipeline to capture the correspondence between objects at different scales and to improve the adaptation of the receptive field to scale variation. To explicitly formulate the model's adaptability to object scale, we propose a novel scale consistency loss for learning scale-invariant feature representations for object detection. To verify the effectiveness of our method, we conduct extensive experiments on the VisDrone-DET dataset, which has large variance in object scale and contains very small objects. Our method achieves state-of-the-art performance, outperforming existing methods by about 2.50%.
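The abstract does not give the form of the scale consistency loss, so the following is only a minimal sketch of one way such a loss could be set up, not the authors' formulation. It assumes two streams that see the same image at full and half resolution, and it penalizes the mean squared difference between their spatially aligned feature maps. The names `avg_pool2x`, `toy_features`, and `scale_consistency_loss` are hypothetical, and the gradient-based "backbone" is a stand-in for a real CNN.

```python
import numpy as np

def avg_pool2x(x):
    """Downsample a (C, H, W) array by 2 using average pooling (H and W must be even)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def toy_features(image):
    """Hypothetical stand-in for a CNN backbone: horizontal and vertical gradients."""
    dx = np.diff(image, axis=2, append=image[:, :, -1:])
    dy = np.diff(image, axis=1, append=image[:, -1:, :])
    return np.concatenate([dx, dy], axis=0)

def scale_consistency_loss(image, backbone=toy_features):
    """Assumed loss: MSE between the two streams' features after spatial alignment."""
    feat_full = backbone(image)                 # stream 1: original scale
    feat_small = backbone(avg_pool2x(image))    # stream 2: half scale
    feat_full_aligned = avg_pool2x(feat_full)   # bring stream 1 to stream 2's size
    return float(np.mean((feat_full_aligned - feat_small) ** 2))

rng = np.random.default_rng(0)
image = rng.random((3, 64, 64))
loss = scale_consistency_loss(image)
```

In a training setup of this kind, such a term would typically be added to the usual detection loss with a weighting factor, so that the backbone is encouraged to produce similar features for the same object seen at different scales.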