Abstract: Building extraction from high-resolution images is a fundamental task in remote sensing. It supports natural disaster monitoring and urban development. Encoder–decoder convolutional neural networks (CNNs) have provided a paradigm for automatic building extraction. However, extracting building information remains difficult because of diverse scales, complex backgrounds, and the variety of building structures. Moreover, obtaining accurate boundary information is challenging because of the various impediments surrounding buildings. To address these challenges, we propose a dual-branch model. The first branch, the segmentation branch, uses an encoder–decoder framework (based on the Attention-ResUNet architecture) that combines residual units with an attention network to generate the segmentation mask. The residual units improve the ability to learn deep and complex building features, whereas the attention network focuses on informative spatial information. In addition, a dilated module is placed at the end of the Attention-ResUNet decoder to capture multiscale information. The second branch, the edge branch, consists of Canny edge extraction, a morphological operation, and a squeeze-excitation network, and improves the boundary information. The Canny edge detection method extracts the edges of the buildings, which are further enhanced by the morphological operation. A squeeze-excitation network is then added for fine adjustment of the generated feature maps. Finally, the proposed model integrates the segmentation mask from the segmentation branch with the boundary information from the edge branch to produce a refined segmentation mask. Experiments were performed on the Massachusetts building dataset and the WHU-I building dataset. The performance of the proposed model is compared with state-of-the-art models such as SegNet, DeepLabV3Plus, UNet, Attention-UNet, ResUNet, and Attention-ResUNet. The results demonstrate that the proposed approach improves performance on both datasets. Hence, we conclude that the proposed approach has great potential for extracting multiscale information and enhancing the boundary information of buildings.
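
The squeeze-excitation step mentioned for the edge branch can be illustrated with a minimal sketch. This is not the authors' implementation: it is the generic squeeze-and-excitation recipe (global average pooling, a small two-layer gate, channel-wise rescaling) written in plain Python on nested lists. The function name `squeeze_excite`, the toy feature maps, and the weight values `w1` and `w2` are hypothetical and chosen only for illustration; a real model would learn these weights and would normally also include bias terms.

```python
import math


def squeeze_excite(feature_maps, w1, w2):
    """Reweight the channels of a C x H x W feature map given as nested lists.

    w1: (C/r) x C weights of the reduction layer (hypothetical values).
    w2: C x (C/r) weights of the expansion layer (hypothetical values).
    Biases are omitted to keep the sketch short.
    """
    # Squeeze: global average pooling yields one descriptor per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
         for ch in feature_maps]
    # Excitation, step 1: reduce the channel dimension and apply ReLU.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    # Excitation, step 2: expand back to C values and apply a sigmoid gate.
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch]
              for ch, g in zip(feature_maps, gates)]
    return scaled, gates


# Toy input with two 2 x 2 channels (assumed values, for illustration only).
features = [
    [[1.0, 1.0], [1.0, 1.0]],  # channel 0: uniform response
    [[0.0, 0.0], [0.0, 0.0]],  # channel 1: no response
]
w1 = [[1.0, -1.0]]    # 1 x 2 reduction layer
w2 = [[2.0], [-2.0]]  # 2 x 1 expansion layer

scaled, gates = squeeze_excite(features, w1, w2)
print(gates)
```

With these toy weights, the gate for channel 0 is larger than the gate for channel 1, so the first channel is emphasized and the second is suppressed. This is the kind of channel-wise adjustment of feature maps that a squeeze-excitation network performs.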

EMAFF-Net: an enhanced multi-scale attentive feature fusion network for building extraction from VHR remote sensing images

ACMFNet: Attention-Based Cross-Modal Fusion Network for Building Extraction of Remote Sensing Images

Multi-Scale Feature Fusion Attention Network for Building Extraction in Remote Sensing Images

Building Multi-Feature Fusion Refined Network for Building Extraction from High-Resolution Remote Sensing Images

Building Extraction from Very High Resolution Aerial Imagery Using Joint Attention Deep Neural Network

A Hybrid Attention-Aware Fusion Network (HAFNet) for Building Extraction from High-Resolution Imagery and LiDAR Data

Effective Building Extraction From High-Resolution Remote Sensing Images With Multitask Driven Deep Neural Network

Extracting Buildings from Remote Sensing Images Using a Multitask Encoder-Decoder Network with Boundary Refinement

B-FGC-Net: A Building Extraction Network from High Resolution Remote Sensing Imagery

MSRF-Net: Multiscale Receptive Field Network for Building Detection From Remote Sensing Images

Building Extraction From High Spatial Resolution Remote Sensing Images of Complex Scenes by Combining Region-Line Feature Fusion and OCNN

EU-Net: an Efficient Fully Convolutional Network for Building Extraction from Optical Remote Sensing Images

CMGFNet: A deep cross-modal gated fusion network for building extraction from very high-resolution remote sensing images

Improved Building Extraction from Remotely Sensed Images by Integration of Encode–Decoder and Edge Enhancement Models

Multi-Scale Attention Network for Building Extraction from High-Resolution Remote Sensing Images

BRRNet: A Fully Convolutional Neural Network for Automatic Building Extraction From High-Resolution Remote Sensing Images

Hierarchical Disentangling Network for Building Extraction from Very High Resolution Optical Remote Sensing Imagery

A cross-stage features fusion network for building extraction from remote sensing images

ME-FCN: A Multi-Scale Feature-Enhanced Fully Convolutional Network for Building Footprint Extraction

NPSFF-Net: Enhanced Building Segmentation in Remote Sensing Images via Novel Pseudo-Siamese Feature Fusion

MAFF-HRNet: Multi-Attention Feature Fusion HRNet for Building Segmentation in Remote Sensing Images