Building Extraction in Very High Resolution Imagery by Dense-Attention Networks

Hui Yang, Penghai Wu, Xuedong Yao, Yanlan Wu, Biao Wang, Yongyang Xu
DOI: https://doi.org/10.3390/rs10111768
IF: 5
2018-11-08
Remote Sensing
Abstract: Building extraction from very high resolution (VHR) imagery plays an important role in urban planning, disaster management, navigation, updating geographic databases, and several other geospatial applications. Compared with traditional building extraction approaches, deep learning networks have recently shown outstanding performance in this task by using both high-level and low-level feature maps. However, it is difficult to utilize features of different levels rationally with present deep learning networks. To tackle this problem, a novel network based on DenseNets and the attention mechanism, called the dense-attention network (DAN), was proposed. The DAN contains an encoder part and a decoder part, which are composed of lightweight DenseNets and a spatial attention fusion module, respectively. The proposed encoder–decoder architecture can strengthen feature propagation and effectively use higher-level feature information to suppress low-level features and noise. Experimental results based on public International Society for Photogrammetry and Remote Sensing (ISPRS) datasets with only red–green–blue (RGB) images demonstrated that the proposed DAN achieved higher scores (96.16% overall accuracy (OA), 92.56% F1 score, and 90.56% mean intersection over union (MIOU)), required less training and response time, and obtained a higher quality value than other deep learning methods.
environmental sciences,imaging science & photographic technology,remote sensing,geosciences, multidisciplinary
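The abstract describes a spatial attention fusion module in which higher-level features suppress low-level features and noise. The sketch below illustrates one common form of that idea, assuming a sigmoid gate computed from the high-level map and applied element-wise to the low-level map before the two are summed. The function names `spatial_attention_gate` and `attention_fuse`, the single-channel 2-D list representation, and the choice of summation are all illustrative assumptions; the paper's actual module may use convolutions, upsampling, or concatenation instead.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def spatial_attention_gate(low, high):
    """Weight each low-level value by sigmoid(high-level value) at the same pixel.

    low, high: 2-D lists (rows x columns) of floats with identical shapes.
    This is a hypothetical single-channel simplification, not the paper's module.
    """
    if len(low) != len(high) or any(len(a) != len(b) for a, b in zip(low, high)):
        raise ValueError("low and high feature maps must have the same shape")
    return [
        [sigmoid(h) * l for l, h in zip(low_row, high_row)]
        for low_row, high_row in zip(low, high)
    ]


def attention_fuse(low, high):
    """Add the gated low-level map to the high-level map (assumed fusion rule)."""
    gated = spatial_attention_gate(low, high)
    return [
        [g + h for g, h in zip(gated_row, high_row)]
        for gated_row, high_row in zip(gated, high)
    ]


# Toy 1 x 2 feature maps: the low-level response is equally strong at both pixels,
# but the high-level map is strongly negative at the first and positive at the second.
low = [[3.0, 3.0]]
high = [[-10.0, 10.0]]
gated = spatial_attention_gate(low, high)
fused = attention_fuse(low, high)
```

In this toy input the first pixel's low-level response is almost entirely suppressed because the high-level map is strongly negative there, while the second pixel's response passes through nearly unchanged.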
What problem does this paper attempt to address?
The problem this paper attempts to solve is the extraction of buildings from very high-resolution (VHR) remote-sensing images. Specifically, traditional building extraction methods struggle with varying building scales, complex backgrounds (such as shadows, vegetation, water bodies, and non-building man-made features), roof heterogeneity, and rich topological appearances. These factors make the extraction of two-dimensional building contours from VHR images a complex task. In addition, although existing deep-learning methods have shown excellent performance, they still have difficulty making rational use of features at different levels, especially in effectively combining high-level and low-level features to improve the recognition accuracy of small-scale and complex buildings. To solve these problems, the paper proposes a new network based on DenseNets and the attention mechanism, the Dense-Attention Network (DAN). This network aims to improve the accuracy of building extraction by enhancing feature propagation and effectively using high-level features to suppress low-level features and noise. Experimental results show that, compared with other deep-learning methods, the proposed DAN achieves a higher overall accuracy (96.16% OA), F1 score (92.56%), and mean intersection over union (90.56% MIOU) on the public International Society for Photogrammetry and Remote Sensing (ISPRS) dataset, while having shorter training and response times and a higher quality value.
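The reported scores (OA, F1, MIOU) can be made concrete with a small sketch that computes them from a binary building mask. It uses the standard confusion-matrix definitions and assumes MIOU is the average of the building and background IoU. The paper's exact evaluation protocol may differ, and the function names and toy labels are illustrative only.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN for flat sequences of 0/1 labels (1 = building)."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred, truth):
    """Return overall accuracy, building-class F1, and two-class mean IoU."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    total = tp + fp + fn + tn
    oa = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    iou_building = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    iou_background = tn / (tn + fp + fn) if tn + fp + fn else 0.0
    miou = (iou_building + iou_background) / 2
    return {"OA": oa, "F1": f1, "MIOU": miou}


# Toy example: 8 pixels, 3 of which are building in the ground truth.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = segmentation_metrics(pred, truth)
```

On this toy input there is one missed building pixel and one false alarm, so OA is 0.75, F1 is 2/3, and MIOU is 7/12. The example shows how the three scores can diverge on the same prediction.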