An Optimized Deep Neural Network Detecting Small and Narrow Rectangular Objects in Google Earth Images.

Shenlu Jiang, Wei Yao, Man Sing Wong, Gen Li, Zhonghua Hong, Tae-Yong Kuc, Xiaohua Tong
DOI: https://doi.org/10.1109/jstars.2020.2975606
IF: 4.715
2020-01-01
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Abstract: Object detection is an important task for rapidly localizing target objects using high-resolution satellite imagery (HRSI). Although deep learning has been shown to be an efficient means of detection, object detection in HRSI remains problematic due to variations in object scale and size. In this article, we present a novel deep neural network (DNN) that combines a double-shot neural network with a misplaced localization strategy that adapts to object detection tasks in satellite images. This architecture optimizes the localization of small and narrow rectangular objects, which frequently appear in HRSI, without accuracy loss on objects of other sizes and width/height ratios. This method outperforms other state-of-the-art methods. We evaluated the proposed method on the NWPU VHR-10 public dataset and on a new benchmark dataset of seven classes of small and narrow rectangular objects (SNRO-7). NWPU VHR-10 was built for multiclass object detection; however, most of its labels have normal sizes and width/height ratios. SNRO-7 focuses on multiscale and multisize object detection and includes many small and narrow rectangular objects. We also evaluated how accuracy differs when the DNN is trained and tested on grayscale versus RGB datasets. The experimental results show that the mean average precision (MaP) of our method is 82.6% on NWPU VHR-10 and 79.3% on SNRO-7, which exceeds the MaPs of other state-of-the-art object detection neural networks. The model trained on the RGB dataset achieves similar accuracy (around 79.0% MIoU) when tested on both RGB and grayscale datasets. When the model is trained on a mix of RGB and grayscale datasets in different ratios, accuracy on RGB images decreases significantly as the share of grayscale images increases, whereas accuracy on the grayscale dataset is unaffected.