Abstract: Accurately extracting buildings from high-resolution remote sensing images is crucial for human productivity and livelihood in urban areas. Because buildings vary in scale and often have indistinct boundaries, building extraction must fully exploit both high- and low-frequency features. However, previous studies have relied solely on either low- or high-frequency features, leading to errors such as omissions or internal holes in the detected buildings at various scales. Although some studies have integrated high- and low-frequency features, they overlook that different network depths are suited to extracting different frequency features. To address these problems, this study proposes a novel network called the Cascaded Inception Conv-Former Network (CICF-Net). It combines a convolutional neural network and a Transformer in parallel to efficiently extract high- and low-frequency features for building extraction. In the encoder, as the network depth grows, we gradually reduce the contribution of the high-frequency branch and increase the focus on the low-frequency branch. Moreover, a cascaded fusion strategy is employed to extract and integrate multiscale high- and low-frequency features. As the decoder, we propose a gated convolution UperNet, which uses recursive gated convolution to facilitate multilevel spatial interactions and better restore fine-grained spatial details for building segmentation. The proposed CICF-Net achieves competitive accuracy on three public benchmarks, the Massachusetts Building Dataset, the WHU Aerial Building Dataset, and the Inria Aerial Image Labeling Dataset, with IoU of 75.17%, 91.45%, and 81.28%, respectively. These results provide strong evidence of its effectiveness in building extraction, as it can accurately capture both the spatial details and the context of buildings.
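
The depth-dependent balance between the two branches described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `split_channels`, the linear schedule, the start and end ratios (0.75 and 0.25), and the per-stage channel widths are all assumptions chosen for illustration, since the abstract gives no concrete values. The sketch only shows one plausible way to hand fewer channels to a high-frequency branch and more to a low-frequency branch as encoder depth grows.

```python
def split_channels(total_channels, stage, num_stages,
                   high_start=0.75, high_end=0.25):
    """Return (high, low) channel counts for one encoder stage.

    Hypothetical helper: the linear schedule and the default ratios are
    illustrative assumptions, not values taken from the paper.
    """
    if num_stages < 1 or not 0 <= stage < num_stages:
        raise ValueError("stage must satisfy 0 <= stage < num_stages")
    # Position of this stage in [0, 1]; 0 is the shallowest stage.
    t = stage / (num_stages - 1) if num_stages > 1 else 0.0
    # Linearly interpolate the share of channels given to the
    # high-frequency branch, from high_start down to high_end.
    ratio = high_start + t * (high_end - high_start)
    high = round(total_channels * ratio)
    low = total_channels - high  # remaining channels go to the low-frequency branch
    return high, low


# Assumed stage widths for a four-stage encoder (illustrative only).
widths = [64, 128, 256, 512]
schedule = [split_channels(c, s, len(widths)) for s, c in enumerate(widths)]

for stage, (high, low) in enumerate(schedule):
    print(f"stage {stage}: high-frequency={high}, low-frequency={low}")
```

Under these assumed settings, the high-frequency share falls from three quarters of the channels in the first stage to one quarter in the last, while the two counts always sum to the stage width. A learned or non-linear schedule would fit the same description equally well.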

CEDNet: A Cascade Encoder-Decoder Network for Dense Prediction

Deep Neural Network Acceleration with Sparse Prediction Layers

DiCENet: Dimension-wise Convolutions for Efficient Networks

Gated Recurrent Fusion UNet for Depth Completion

IDK Cascades: Fast Deep Learning by Learning not to Overthink

Predictive Coding Based Multiscale Network with Encoder-Decoder LSTM for Video Prediction

CasCIFF: A Cross-Domain Information Fusion Framework Tailored for Cascade Prediction in Social Networks

Hierarchical Information Enhancement Network for Cascade Prediction in Social Networks

Bio-inspired feature cascade network for edge detection

DDCNet: Deep Dilated Convolutional Neural Network for Dense Prediction

Channel Exchanging Networks for Multimodal and Multitask Dense Image Prediction

Information Cascade Prediction of complex networks based on Physics-informed Graph Convolutional Network

CFIDNet: cascaded feature interaction decoder for RGB-D salient object detection

DHFNet: Decoupled Hierarchical Fusion Network for RGB-T dense prediction tasks

CASCADE: A Framework for CNN Accelerator Synthesis with Concatenation and Refreshing Dataflow

Tripartite Feature Enhanced Pyramid Network for Dense Prediction

PredCNN: Predictive Learning with Cascade Convolutions

A Cascaded Network With Coupled High-Low Frequency Features for Building Extraction

CasNet: a cascade coarse-to-fine network for semantic segmentation

FDNet: A Deep Learning Approach with Two Parallel Cross Encoding Pathways for Precipitation Nowcasting