A multi-scale pyramid feature fusion-based object detection method for remote sensing images
Panpan Huangfu, Lanxue Dang; (a) School of Information and Electronic Engineering, Shangqiu Institute of Technology, Shangqiu, China; (b) Henan Key Laboratory of Big Data Analysis and Processing, School of Computer and Information Engineering, Henan University, Kaifeng, China
DOI: https://doi.org/10.1080/01431161.2023.2288947
IF: 3.531
2023-12-12
International Journal of Remote Sensing
Abstract: Object detection is a fundamental and challenging task in remote sensing image analysis that has received extensive attention in recent years. Feature fusion is one of its key steps. Most existing feature fusion methods first perform a preliminary fusion of feature maps at different scales through 'add' or 'concat' operations, and then apply a single-scale convolution to refine the result. However, because multi-level features exhibit multi-scale representations, the fusion effect of these existing methods is limited. To improve the efficiency of feature fusion, we propose a multi-scale pyramid feature fusion network, which uses convolution kernels of multiple scales to perform multi-scale learning and fuse multi-level features more effectively. We also propose a lightweight decoupled head, which alleviates the conflict between the classification and localization tasks. We conducted experiments on the DOTA (Dataset for Object Detection in Aerial Images) and HRSC2016 datasets to verify the proposed methods. The results show that the proposed methods outperform other existing methods, achieving mAP values of 73.3%, 67.6%, 65.0%, and 96.7% on the DOTA1.0, DOTA1.5, DOTA2.0, and HRSC2016 datasets, respectively. Meanwhile, the proposed model has 10.3 M parameters and an inference time of 5.1 ms, which meets the lightweight requirement and keeps detection timely.
imaging science & photographic technology, remote sensing
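
The fusion idea summarized in the abstract (a preliminary 'add' fusion of feature maps at different scales, refined by convolutions at several kernel sizes rather than one) can be illustrated with a minimal pure-Python sketch. This is not the authors' network: the kernel sizes (1, 3, 5), the uniform box-filter weights standing in for learned weights, the nearest-neighbour upsampling, and the averaging of branches are all assumptions made for illustration, and the function names are hypothetical.

```python
# Illustrative sketch only. Kernel sizes, uniform weights, nearest-neighbour
# upsampling and branch averaging are assumptions, not details from the paper.

def upsample2x(fm):
    """Nearest-neighbour 2x upsampling of a 2D feature map (list of lists)."""
    out = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]  # repeat each value twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def add_maps(a, b):
    """Element-wise 'add' fusion of two equally sized feature maps."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def box_conv(fm, k):
    """Same-size k x k convolution with uniform weights 1/k^2 and zero padding."""
    h, w, r = len(fm), len(fm[0]), k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:  # outside the map counts as 0
                        total += fm[ii][jj]
            out[i][j] = total / (k * k)
    return out


def multi_scale_fuse(fine, coarse, kernel_sizes=(1, 3, 5)):
    """Add-fuse a fine map with an upsampled coarse map, then average the
    outputs of parallel convolutions, one per kernel size."""
    prelim = add_maps(fine, upsample2x(coarse))
    branches = [box_conv(prelim, k) for k in kernel_sizes]
    h, w, n = len(prelim), len(prelim[0]), len(branches)
    return [[sum(b[i][j] for b in branches) / n for j in range(w)]
            for i in range(h)]


fine = [[1.0] * 4 for _ in range(4)]   # 4x4 high-resolution feature map
coarse = [[2.0, 0.0], [0.0, 2.0]]      # 2x2 low-resolution feature map
fused = multi_scale_fuse(fine, coarse)
```

Passing `kernel_sizes=(3,)` reduces the sketch to the single-scale refinement that the abstract attributes to existing methods, which makes the two variants easy to compare on the same inputs.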