Abstract: Remote sensing image processing has advanced remarkably in recent years, and deep convolutional neural networks (CNNs) have become a prominent approach to building segmentation. However, traditional CNNs, which rely on convolution and pooling for feature extraction during encoding, often fail to capture global pixel interactions, which can lead to the loss of important semantic details. Moreover, conventional CNN-based segmentation models frequently neglect the semantic differences between shallow and deep features during decoding and fuse them by simple addition or concatenation, which can result in suboptimal feature integration. In addition, the distinctive boundary characteristics of buildings in remote sensing images, a rich source of prior information, have not been fully exploited by traditional CNNs. This paper introduces a prior-guided dual branch multi-feature fusion network (PDBMFN) for building segmentation in remote sensing images. The network comprises a prior-guided branch network (PBN) in the encoding process, a parallel dilated convolution module (PDCM) designed to incorporate prior information, and a multi-feature aggregation module (MAM) in the decoding process. The PBN uses prior region and edge information derived from superpixels and edge maps to improve edge detection accuracy during encoding. The PDCM integrates features from both branches and applies dilated convolutions at multiple scales to enlarge the receptive field and capture richer semantic context. During decoding, the MAM uses deep semantic information to guide feature fusion, thereby improving segmentation performance. Through a sequence of aggregations, the MAM gradually merges deep and shallow semantic information, yielding a richer and more complete feature representation. Extensive experiments on diverse datasets, such as WHU, Inria Aerial, and Massachusetts, show that PDBMFN outperforms other advanced methods in segmentation accuracy, with clear gains over contemporary techniques on the key segmentation metrics of mIoU, precision, recall, and F1 score. Ablation studies further confirm the performance improvements contributed by the PBN’s prior-information guidance and the effectiveness of the PDCM and MAM.
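The metrics named in the abstract (mIoU, precision, recall, and F1 score) can be illustrated with a minimal sketch for binary building masks. This is not the paper's evaluation code: the function names `confusion_counts`, `safe_div`, and `segmentation_metrics` are hypothetical, the toy masks are invented for illustration, and mIoU is assumed here to be the mean of the building and background IoU, which may differ from the protocol the authors used.

```python
from typing import Dict, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # 2D binary mask, 1 = building, 0 = background


def confusion_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN) for two equally sized binary masks."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def safe_div(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def segmentation_metrics(pred: Mask, truth: Mask) -> Dict[str, float]:
    """Pixel-wise precision, recall, F1, and two-class mIoU (assumed definition)."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    iou_building = safe_div(tp, tp + fp + fn)
    iou_background = safe_div(tn, tn + fp + fn)
    # Assumption: mIoU averages the building and background classes.
    miou = (iou_building + iou_background) / 2
    return {"precision": precision, "recall": recall, "f1": f1,
            "iou_building": iou_building, "miou": miou}


# Toy 3x4 masks (hypothetical data): TP=3, FP=1, FN=1, TN=7.
truth = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 0, 0]]
pred = [[1, 1, 1, 0],
        [1, 0, 0, 0],
        [0, 0, 0, 0]]

metrics = segmentation_metrics(pred, truth)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Computing every metric from a single set of confusion counts keeps the definitions consistent with one another. On real benchmarks the counts would typically be accumulated over all test images before the ratios are taken.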

SNNFD, Spiking Neural Segmentation Network in Frequency Domain Using High Spatial Resolution Images for Building Extraction

NPSFF-Net: Enhanced Building Segmentation in Remote Sensing Images via Novel Pseudo-Siamese Feature Fusion

MSFTrans: A Multi-Task Frequency-Spatial Learning Transformer for Building Extraction from High Spatial Resolution Remote Sensing Images

Semantic Segmentation of Urban Buildings from VHR Remote Sensing Imagery Using a Deep Convolutional Neural Network

Automatic Building Extraction on High-Resolution Remote Sensing Imagery Using Deep Convolutional Encoder-Decoder with Spatial Pyramid Pooling

Architecture of Deep Convolutional Encoder-Decoder Networks for Building Footprint Semantic Segmentation

A Deep Residual Learning Serial Segmentation Network for Extracting Buildings from Remote Sensing Imagery

B-FGC-Net: A Building Extraction Network from High Resolution Remote Sensing Imagery

Multi-scale Full Spike Pattern for Semantic Segmentation

Building NAS: Automatic Designation of Efficient Neural Architectures for Building Extraction in High-Resolution Aerial Images

SSDBN: A Single-Side Dual-Branch Network with Encoder–Decoder for Building Extraction

Web-Net: A Novel Nest Networks With Ultra-Hierarchical Sampling For Building Extraction From Aerial Imageries

New Building Extraction Method Based on Semantic Segmentation

SPNet: Dual-Branch Network with Spatial Supplementary Information for Building and Water Segmentation of Remote Sensing Images

Building Extraction From High Spatial Resolution Remote Sensing Images of Complex Scenes by Combining Region-Line Feature Fusion and OCNN

A Prior-Guided Dual Branch Multi-Feature Fusion Network for Building Segmentation in Remote Sensing Images

CFNet: An Eigenvalue Preserved Approach to Multiscale Building Segmentation in High-Resolution Remote Sensing Images

HCRB-MSAN: Horizontally Connected Residual Blocks-Based Multiscale Attention Network for Semantic Segmentation of Buildings in HSR Remote Sensing Images

Semantic Segmentation Network Combined with Edge Detection for Building Extraction in Remote Sensing Images

Decoupling Semantic and Edge Representations for Building Footprint Extraction from Remote Sensing Images

Extracting Buildings from Remote Sensing Images Using a Multitask Encoder-Decoder Network with Boundary Refinement