Semantic Annotation of High-Resolution Satellite Images Via Weakly Supervised Learning.

Xiwen Yao,Junwei Han,Gong Cheng,Xueming Qian,Lei Guo
DOI: https://doi.org/10.1109/tgrs.2016.2523563
IF: 8.2
2016-01-01
IEEE Transactions on Geoscience and Remote Sensing
Abstract:In this paper, we focus on tackling the problem of automatic semantic annotation of high resolution (HR) optical satellite images, which aims to assign one or several predefined semantic concepts to an image according to its content. The main challenges arise from the difficulty of characterizing complex and ambiguous contents of the satellite images and the high human labor cost caused by preparing a large amount of training examples with high-quality pixel-level labels in fully supervised annotation methods. To address these challenges, we propose a unified annotation framework by combining discriminative high-level feature learning and weakly supervised feature transferring. Specifically, an efficient stacked discriminative sparse autoencoder (SDSAE) is first proposed to learn high-level features on an auxiliary satellite image data set for the land-use classification task. Inspired by the motivation that the encoder of the prelearned SDSAE can be regarded as a generic high-level feature extractor for HR optical satellite images, we then transfer the learned high-level features to semantic annotation. To compensate the difference between the auxiliary data set and the annotation data set, the transferred high-level features are further fine-tuned in a weakly supervised scheme by using the tile-level annotated training data. Finally, the fine-tuning process is formulated as an ultimate optimization problem, which can be solved efficiently with our proposed alternate iterative optimization method. Comprehensive experiments on a publicly available land-use classification data set and an annotation data set demonstrate the superiority of our SDSAE-based high-level feature learning method and the effectiveness of our weakly supervised semantic annotation framework compared with state-of-the-art fully supervised annotation methods.
What problem does this paper attempt to address?