Cascaded Detection Framework Based on a Novel Backbone Network and Feature Fusion

Zhuangzhuang Tian,Wei Wang,Ronghui Zhan,Zhiqiang He,Jun Zhang,Zhaowen Zhuang
DOI: https://doi.org/10.1109/jstars.2019.2924086
IF: 4.715
2019-01-01
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Abstract:Due to the ability of powerful feature representation, deep-learning-based object detection has attracted considerable research attention, and many methods have been proposed for remote sensing images. However, there are still some problems that need to be addressed. In this paper, a novel and effective detection framework based on faster region-based convolutional neural network is designed. Specifically, first, in order to locate the boundaries of large objects and find the missing small objects, DetNet is incorporated into the detection framework as the backbone network. DetNet fixes the spatial resolution in deep layers and adopts dilated bottleneck with convolution projection to increase the divergence between input and output feature maps. Then, the proposed framework uses the backbone network to extract the scene features and region features simultaneously, which are both mapped to feature vectors and then fused together. The feature fusion operation can improve the feature representation of the generated region. Last, to improve the performance of localization, the cascade structure is adopted in the framework. The cascade structure has multiple phases and every phase has independent classifier and regressor. The results obtained from the previous phase are used as the regions of interest in the next phase. Therefore, the multiphase detector can increase the detection accuracy phase by phase. Comprehensive evaluations on a public ten-class object detection dataset demonstrate the effectiveness of the proposed framework. Moreover, ablation experiments are also implemented to show the respective influence of different parts of the framework on the performance improvement.
What problem does this paper attempt to address?