CMPF-UNet: a ConvNeXt multi-scale pyramid fusion U-shaped network for multi-category segmentation of remote sensing images

Ning Li, Xiaopeng Yu, Miao Yu

DOI: https://doi.org/10.1080/10106049.2024.2311217

IF: 3.45

2024-02-16

Geocarto International

Abstract: Most U-shaped convolutional neural network (CNN) methods suffer from insufficient feature extraction and fail to fully utilize global and multi-scale context information, which makes it difficult to distinguish similar objects and shadow-occluded objects in remote sensing images. This article proposes a ConvNeXt multi-scale pyramid fusion U-shaped network (CMPF-UNet). In this work, we first propose a novel backbone network based on ConvNeXt to enhance image feature extraction, and use ConvNeXt bottleneck blocks to reconstruct the decoder. Furthermore, a Scale-Aware Pyramid Fusion (SAPF) module and a Residual Atrous Spatial Pyramid Pooling (RASPP) module are proposed to dynamically fuse the rich multi-scale context information in advanced features. Finally, multiple Global Pyramid Guidance (GPG) modules are embedded in the network, aiming to provide different levels of global context information for the decoder by reconstructing skip-connections. Experiments on the Vaihingen and Potsdam datasets indicate that the proposed CMPF-UNet achieves more accurate segmentation results.

geosciences, multidisciplinary; environmental sciences; remote sensing; imaging science & photographic technology

What problem does this paper attempt to address?

### Problems Addressed by the Paper

This paper aims to address several key issues in semantic segmentation of high-resolution remote sensing images, particularly the challenges faced in distinguishing ground object categories with similar textures or those obscured by shadows. Specifically:

1. **Insufficient Feature Extraction**:
   - Most methods based on U-shaped Convolutional Neural Networks (CNNs) lack sufficient capability in feature extraction and fail to fully utilize global/multi-scale contextual information, making it difficult to distinguish similar objects and shadowed objects in remote sensing images.
2. **Improving Segmentation Performance**:
   - A new ConvNeXt Multi-scale Pyramid Fusion U-shaped Network (CMPF-UNet) is proposed to improve the segmentation accuracy of high-resolution remote sensing images.
3. **Multi-scale Contextual Information Fusion**:
   - By introducing the Global Pyramid Guidance (GPG) module, the improved Scale-Aware Pyramid Fusion (SAPF) module, and the Residual Atrous Spatial Pyramid Pooling (RASPP) module, multi-scale contextual information is effectively fused.
4. **Addressing Limitations of Traditional Methods**:
   - Traditional image segmentation methods rely on manually designed features and cannot achieve high precision and full automation. While CNN-based deep learning methods have advantages in this regard, they still have some shortcomings, such as the loss of edge and contour details.

Through these improvements, CMPF-UNet aims to enhance the segmentation performance of high-resolution remote sensing images, performing better in scenarios with similar materials and shadow occlusions. Experimental results show that this method achieves highly competitive performance on the Vaihingen and Potsdam datasets.
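To make the idea behind atrous (dilated) pyramid pooling with a residual connection more concrete, here is a minimal 1D sketch in plain Python. It is not the paper's RASPP module, whose exact layout is not given on this page. The function names `atrous_conv1d` and `residual_aspp1d`, the fixed kernel, the dilation rates, and the averaging of branches (real ASPP variants typically concatenate branches and apply a learned 1x1 convolution) are all illustrative assumptions.

```python
def atrous_conv1d(signal, kernel, rate):
    """Zero-padded, same-length 1D atrous (dilated) convolution.

    Kernel taps are spaced `rate` samples apart, so a larger rate
    widens the receptive field without adding parameters.
    """
    n, k = len(signal), len(kernel)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - k // 2) * rate  # dilated tap position
            if 0 <= idx < n:               # out-of-range taps read as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def residual_aspp1d(signal, kernel, rates=(1, 2, 4)):
    """Hypothetical residual multi-rate block (illustration only).

    Runs one atrous branch per rate, averages the branches (an
    assumption made for simplicity), and adds the input back as a
    residual connection.
    """
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    fused = [sum(vals) / len(rates) for vals in zip(*branches)]
    return [x + f for x, f in zip(signal, fused)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
    # Each rate spreads the impulse over a different spatial extent.
    for r in (1, 2, 4):
        print(r, atrous_conv1d(impulse, [1, 1, 1], r))
    print(residual_aspp1d(impulse, [1, 1, 1]))
```

Running the script on a unit impulse shows how each dilation rate responds at a different distance from the centre, which is the multi-scale context that pyramid-style modules aim to fuse. The residual term keeps the original signal in the output.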

Multi-Scale Cross-Attention Fusion Network Based on Image Super-Resolution

FCPFNet: Feature Complementation Network with Pyramid Fusion for Semantic Segmentation

CTMFNet: CNN and Transformer Multiscale Fusion Network of Remote Sensing Urban Scene Imagery

Enhanced Multi-Scale Feature Adaptive Fusion Sparse Convolutional Network for Large-Scale Scenes Semantic Segmentation

Multi-Modal Image Fusion Via Deep Laplacian Pyramid Hybrid Network

Multi-Field Context Fusion Network for Semantic Segmentation of High-Spatial-Resolution Remote Sensing Images

Context Multi-scale Fusion Network for Magnetic Resonance Image

CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation

MM-FPN: Multi-path and Multi-scale Feature Pyramid Network for Object Detection

Alzheimer disease: A quantitative trait approach to GWAS pays dividends

CIMFNet: Cross-layer Interaction and Multiscale Fusion Network for Semantic Segmentation of High-Resolution Remote Sensing Images

Adaptive Pyramid Context Fusion for Point Cloud Perception

A Crossmodal Multiscale Fusion Network for Semantic Segmentation of Remote Sensing Data

EFCNet: Ensemble Full Convolutional Network for Semantic Segmentation of High-Resolution Remote Sensing Images

Multi-scale Feature Extraction and Fusion Net: Research on UAVs Image Semantic Segmentation Technology

Encoder- and Decoder-Based Networks Using Multiscale Feature Fusion and Nonlocal Block for Remote Sensing Image Semantic Segmentation

Unsupervised Multi-Scale Hybrid Feature Extraction Network for Semantic Segmentation of High-Resolution Remote Sensing Images

MGFN: A Multi-Granularity Fusion Convolutional Neural Network for Remote Sensing Scene Classification

CTFuseNet: A Multi-Scale CNN-Transformer Feature Fused Network for Crop Type Segmentation on UAV Remote Sensing Imagery

Remote Sensing Image Segmentation Using Vision Mamba and Multi-Scale Multi-Frequency Feature Fusion