Multi-Scale Feature Fusion Attention Network for Building Extraction in Remote Sensing Images

Jia Liu, Hang Gu, Zuhe Li, Hongyang Chen, Hao Chen

DOI: https://doi.org/10.3390/electronics13050923

IF: 2.9

2024-02-29

Electronics

Abstract: The efficient semantic segmentation of buildings in high spatial resolution remote sensing images is a technical prerequisite for land resource management, high-precision mapping, construction planning and other applications. Current building extraction methods based on deep learning can obtain high-level abstract features of images. However, the extraction of some occluded buildings is inaccurate, and as the network deepens, small-volume buildings are lost and edges are blurred. Therefore, we introduce a multi-resolution attention combination network, which employs a multiscale channel and spatial attention module (MCAM) to adaptively capture key features and eliminate irrelevant information, which improves the accuracy of building extraction. In addition, we present a layered residual connectivity module (LRCM) to enhance the expression of information at different scales through multi-level feature fusion, significantly improving the understanding of context and the capturing of fine edge details. Extensive experiments were conducted on the WHU aerial image dataset and the Massachusetts building dataset. Compared with state-of-the-art semantic segmentation methods, this network achieves better building extraction results in remote sensing images, proving the effectiveness of the method.

engineering, electrical & electronic; computer science, information systems; physics, applied

What problem does this paper attempt to address?

The paper addresses the semantic segmentation of buildings in high-resolution remote sensing images. Current deep learning-based building extraction methods can obtain high-level abstract features of images, but they extract some occluded buildings inaccurately, and as the network deepens, small buildings are lost and edges become blurred. The paper therefore proposes a Multi-Scale Feature Fusion Attention Network that introduces a Multi-Scale Channel and Spatial Attention Module (MCAM) and a Layered Residual Connectivity Module (LRCM) to improve the accuracy of building extraction, particularly for small targets and complex edge features. The main contributions of the paper are:

1. **Multi-Resolution Attention Network**: The network uses attention to focus on the most critical information and fuses multi-scale information to reduce boundary blurring during segmentation.
2. **Multi-Scale Channel and Spatial Attention Module (MCAM)**: Designed for the characteristics of remote sensing building images, it adaptively captures key features and eliminates irrelevant information, so the model selectively focuses on the critical parts of the image and extracts buildings more accurately.
3. **Layered Residual Connectivity Module (LRCM)**: By fusing features from multiple levels, it strengthens the expression of information at different scales, which improves the understanding of context and the capture of fine edge details alongside high-level abstract features.

With these contributions, the paper aims to overcome the limitations of existing methods in high-resolution building segmentation, in particular by maintaining high resolution while highlighting target buildings more effectively and improving segmentation performance.
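To make the two ideas above more concrete, the following is a minimal NumPy sketch of generic channel attention, spatial attention, and a residual fusion of two feature levels. It is not the paper's MCAM or LRCM: the function names, the CBAM-style pooling, the scalar mixing weights standing in for a learned convolution, the nearest-neighbour upsampling, and the random weights are all assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Rescale each channel of feat (C, H, W) by a weight in (0, 1).

    Hypothetical CBAM-style design: average- and max-pool over space,
    pass both through a shared two-layer MLP, then apply a sigmoid.
    """
    avg = feat.mean(axis=(1, 2))  # (C,)
    mx = feat.max(axis=(1, 2))    # (C,)

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # ReLU bottleneck

    weights = sigmoid(mlp(avg) + mlp(mx))    # (C,)
    return feat * weights[:, None, None]

def spatial_attention(feat, a=1.0, b=1.0):
    """Rescale each pixel of feat (C, H, W) by a weight in (0, 1).

    Assumption: scalars a and b stand in for the learned convolution
    that would normally mix the channel-pooled maps.
    """
    avg = feat.mean(axis=0)  # (H, W)
    mx = feat.max(axis=0)    # (H, W)
    mask = sigmoid(a * avg + b * mx)
    return feat * mask[None, :, :]

def fuse_with_residual(fine, coarse, w1, w2):
    """Fuse a coarse level (C, H/2, W/2) into a fine level (C, H, W)."""
    # Nearest-neighbour 2x upsampling of the coarse features.
    up = coarse.repeat(2, axis=1).repeat(2, axis=2)
    merged = fine + up
    attended = spatial_attention(channel_attention(merged, w1, w2))
    # The residual skip keeps the fine-scale detail intact.
    return fine + attended

rng = np.random.default_rng(0)
C, H, W = 8, 16, 16
fine = rng.standard_normal((C, H, W))
coarse = rng.standard_normal((C, H // 2, W // 2))
w1 = rng.standard_normal((C // 4, C)) * 0.1  # untrained, illustrative weights
w2 = rng.standard_normal((C, C // 4)) * 0.1
out = fuse_with_residual(fine, coarse, w1, w2)
print(out.shape)  # (8, 16, 16)
```

In a trained network the weights would be learned and the fusion would span more than two levels. The sketch only shows how attention reweights channels and pixels while the skip connection carries the fine-resolution features through unchanged.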

Multi-Scale Attention Network for Building Extraction from High-Resolution Remote Sensing Images

HCRB-MSAN: Horizontally Connected Residual Blocks-Based Multiscale Attention Network for Semantic Segmentation of Buildings in HSR Remote Sensing Images

ACMFNet: Attention-Based Cross-Modal Fusion Network for Building Extraction of Remote Sensing Images

Building Multi-Feature Fusion Refined Network for Building Extraction from High-Resolution Remote Sensing Images

Building Extraction from Very High Resolution Aerial Imagery Using Joint Attention Deep Neural Network

MAFF-HRNet: Multi-Attention Feature Fusion HRNet for Building Segmentation in Remote Sensing Images

A Multi-Scale Edge Constraint Network for the Fine Extraction of Buildings from Remote Sensing Images

Extracting Buildings from Remote Sensing Images Using a Multitask Encoder-Decoder Network with Boundary Refinement

Building Extraction From High Spatial Resolution Remote Sensing Images of Complex Scenes by Combining Region-Line Feature Fusion and OCNN

Multiregion Scale-Aware Network for Building Extraction From High-Resolution Remote Sensing Images

Multi-Level Perceptual Network for Urban Building Extraction from High-Resolution Remote Sensing Images

Multi-scale attention integrated hierarchical networks for high-resolution building footprint extraction

A Building Extraction Method for High-Resolution Remote Sensing Images with Multiple Attentions and Parallel Encoders Combining Enhanced Spectral Information

Cross-level and multiscale CNN-Transformer network for automatic building extraction from remote sensing imagery

A Multi-Branch Feature Fusion Network for Building Detection in Remote Sensing Images

EMAFF-Net: an enhanced multi-scale attentive feature fusion network for building extraction from VHR remote sensing images

CSA-Net: Complex Scenarios Adaptive Network for Building Extraction for Remote Sensing Images

MSFTrans: a multi-task frequency-spatial learning transformer for building extraction from high spatial resolution remote sensing images

Multiscale Attention Fusion Graph Network for Remote Sensing Building Change Detection

MSRF-Net: Multiscale Receptive Field Network for Building Detection From Remote Sensing Images