Multi-Scale Feature Fusion Attention Network for Building Extraction in Remote Sensing Images

Jia Liu, Hang Gu, Zuhe Li, Hongyang Chen, Hao Chen
DOI: https://doi.org/10.3390/electronics13050923
IF: 2.9
2024-02-29
Electronics
Abstract: The efficient semantic segmentation of buildings in high spatial resolution remote sensing images is a technical prerequisite for land resource management, high-precision mapping, construction planning, and other applications. Current building extraction methods based on deep learning can obtain high-level abstract features of images. However, the extraction of some occluded buildings is inaccurate, and as the network deepens, small-volume buildings are lost and edges are blurred. Therefore, we introduce a multi-resolution attention combination network that employs a multiscale channel and spatial attention module (MCAM) to adaptively capture key features and eliminate irrelevant information, improving the accuracy of building extraction. In addition, we present a layered residual connectivity module (LRCM) to enhance the expression of information at different scales through multi-level feature fusion, significantly improving the understanding of context and the capturing of fine edge details. Extensive experiments were conducted on the WHU aerial image dataset and the Massachusetts building dataset. Compared with state-of-the-art semantic segmentation methods, this network achieves better building extraction results in remote sensing images, demonstrating the effectiveness of the method.
engineering, electrical & electronic; computer science, information systems; physics, applied
What problem does this paper attempt to address?
The paper addresses the problem of efficient semantic segmentation of buildings in high-resolution remote sensing images. Current deep learning-based building extraction methods can obtain high-level abstract features of images, but they are inaccurate on some occluded buildings, and as the network deepens, small buildings are lost and edges become blurred. The paper therefore proposes a Multi-Scale Feature Fusion Attention Network that aims to improve the accuracy of building extraction by introducing a Multi-Scale Channel and Spatial Attention Module (MCAM) and a Layered Residual Connectivity Module (LRCM), with particular emphasis on capturing small targets and complex edge features. The main contributions of the paper are:
1. **Multi-Resolution Attention Network**: The network structure relies on the attention mechanism to focus on the most critical information and addresses boundary blurring during segmentation by fusing multi-scale information.
2. **Multi-Scale Channel and Spatial Attention Module (MCAM)**: Designed for the characteristics of remote sensing building images, it adaptively captures key features and eliminates irrelevant information, so the model selectively focuses on the critical parts of the image and extracts buildings more accurately.
3. **Layered Residual Connectivity Module (LRCM)**: By integrating features from multiple levels, it enhances the expression of information at different scales, improving the understanding of context, the capture of fine edge details, and the integration of high-level abstract features.
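The summary does not describe the internals of MCAM, so the following is only a minimal NumPy sketch of the general idea of channel attention followed by spatial attention, in the style of the well-known CBAM design. It is not the authors' module and it omits the multi-scale part entirely. The function names, the bottleneck weights `w1` and `w2`, and the scalar mixing weights `w_avg` and `w_max` are all hypothetical; in a real network the weights would be learned.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Reweight the channels of a (C, H, W) feature map.

    w1 with shape (C // r, C) and w2 with shape (C, C // r) are
    hypothetical weights of a small shared bottleneck MLP.
    """
    avg = feat.mean(axis=(1, 2))  # (C,) global average pooling
    mx = feat.max(axis=(1, 2))    # (C,) global max pooling

    def mlp(v):
        # Two-layer bottleneck with a ReLU in between.
        return w2 @ np.maximum(w1 @ v, 0.0)

    gate = sigmoid(mlp(avg) + mlp(mx))  # (C,) one gate per channel
    return feat * gate[:, None, None]


def spatial_attention(feat, w_avg=1.0, w_max=1.0):
    """Reweight the spatial positions of a (C, H, W) feature map.

    CBAM applies a 7x7 convolution to the pooled maps; this sketch
    assumes a plain weighted sum of the two maps to stay short.
    """
    avg = feat.mean(axis=0)  # (H, W) average over channels
    mx = feat.max(axis=0)    # (H, W) maximum over channels
    gate = sigmoid(w_avg * avg + w_max * mx)  # (H, W) one gate per pixel
    return feat * gate[None, :, :]


def attention_block(feat, w1, w2):
    """Channel attention first, then spatial attention."""
    return spatial_attention(channel_attention(feat, w1, w2))


# Example with random inputs and random (untrained) weights.
rng = np.random.default_rng(0)
C, H, W, r = 8, 16, 16, 4
feat = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
out = attention_block(feat, w1, w2)
print(out.shape)  # same shape as the input: (8, 16, 16)
```

Because both gates lie in (0, 1), the block can only attenuate activations. This is the sense in which attention "eliminates irrelevant information" while leaving salient channels and positions comparatively strong.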
Through these innovations, the paper aims to overcome the limitations of existing methods in handling high-resolution building segmentation tasks, particularly in maintaining high resolution while more effectively highlighting target buildings and improving segmentation performance.
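The multi-level feature fusion attributed to LRCM can be illustrated in a similarly hedged way. The sketch below is an assumption, not the paper's design: it fuses a pyramid of feature maps top-down by upsampling the coarser result and adding it to the next finer map, so each finer map is preserved through an identity (residual) path. The helper names `upsample_nearest` and `fuse_levels` are hypothetical, and all levels are assumed to share the same channel count with each level half the spatial size of the previous one.

```python
import numpy as np


def upsample_nearest(feat, factor):
    """Nearest-neighbour upsampling of a (C, H, W) map by an integer factor."""
    return feat.repeat(factor, axis=1).repeat(factor, axis=2)


def fuse_levels(levels):
    """Fuse feature maps ordered from fine to coarse.

    Starting from the coarsest map, each step upsamples the running
    result by 2 and adds it to the next finer map, so the fine map
    itself is carried through unchanged as a residual path.
    """
    fused = levels[-1]
    for finer in reversed(levels[:-1]):
        fused = finer + upsample_nearest(fused, 2)
    return fused


# Example: three levels at 32x32, 16x16, and 8x8 resolution.
levels = [np.ones((4, 32, 32)), np.ones((4, 16, 16)), np.ones((4, 8, 8))]
fused = fuse_levels(levels)
print(fused.shape)  # resolution of the finest level: (4, 32, 32)
```

The output keeps the resolution of the finest level, which is where edge detail lives, while every pixel also receives a contribution from the coarser, more context-rich levels.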