Abstract: Deep learning (DL)-based approaches are notable for their ability to establish feature associations without relying on physical constraints, unlike traditional strategies that are complex and dependent on expert experience. However, three main challenges hinder the versatility of semantic segmentation models for remote sensing images. First, the targets in these images are dense and vary in spatial scale, which places higher demands on the model for accurate segmentation across scales. Second, the segmentation of small targets is often overlooked, leading to a compromise between fine segmentation and model efficiency. Third, the data-intensive nature of remote sensing images and the resource-intensive operations of large-scale networks impose significant communication and computation burdens on edge devices, which may lack the resources to handle them. To address these challenges, this paper proposes a lightweight semantic segmentation method for remote sensing images that achieves high-precision segmentation of multi-scale targets while maintaining low computational complexity. The main components are: (1) embedding the inverted residual block structure to reduce the number of model parameters and the computational cost; (2) introducing the parallel irregular space pyramid pooling module to efficiently aggregate multi-scale contextual information for fine-grained recognition of small targets; and (3) embedding transfer learning into the encoder-decoder structure to speed up convergence and improve multi-scale feature fusion, thereby reducing semantic information loss. The proposed method was tested extensively on real-world high-resolution remote sensing datasets. It achieved pixel accuracy (PA), mean pixel accuracy (MPA), mean intersection over union (MIoU), and frequency-weighted intersection over union (FWIoU) scores of 87.90%, 75.76%, 66.29%, and 78.81% on the Vaihingen dataset; 87.03%, 85.31%, 74.85%, and 77.54% on the Potsdam dataset; and 95.37%, 83.33%, 75.70%, and 91.31% on the Aeroscapes dataset. Compared to other popular semantic segmentation models, the proposed method achieved the highest values on all four evaluation metrics, demonstrating its effectiveness and superiority.
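
For readers unfamiliar with the four metrics reported in the abstract, the following is a minimal sketch of how PA, MPA, MIoU, and FWIoU are commonly computed from a confusion matrix. It uses the standard textbook definitions, not necessarily the paper's exact evaluation code. The function names `confusion_matrix` and `segmentation_metrics` are hypothetical, and averaging only over classes that occur in the ground truth is an assumption.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are true classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm):
    """Return PA, MPA, MIoU and FWIoU for a square confusion matrix.

    Assumption: classes with no ground-truth pixels are skipped when
    averaging, which is one common convention among several.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    tp = [cm[i][i] for i in range(n)]                            # correctly labelled pixels
    gt = [sum(cm[i]) for i in range(n)]                          # pixels per true class
    pred = [sum(cm[i][j] for i in range(n)) for j in range(n)]   # pixels per predicted class

    present = [i for i in range(n) if gt[i] > 0]
    # IoU per class: intersection / (true + predicted - intersection)
    iou = {i: tp[i] / (gt[i] + pred[i] - tp[i]) for i in present}

    return {
        "PA": sum(tp) / total,
        "MPA": sum(tp[i] / gt[i] for i in present) / len(present),
        "MIoU": sum(iou.values()) / len(present),
        "FWIoU": sum(gt[i] / total * iou[i] for i in present),
    }


if __name__ == "__main__":
    # Toy example: 8 flattened pixels, 3 classes (illustrative data only).
    y_true = [0, 0, 0, 0, 1, 1, 1, 2]
    y_pred = [0, 0, 0, 1, 1, 1, 2, 2]
    metrics = segmentation_metrics(confusion_matrix(y_true, y_pred, 3))
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")
```

FWIoU weights each class IoU by that class's share of ground-truth pixels, so it can differ from MIoU when a few classes dominate the image. The abstract reports such gaps in both directions: FWIoU exceeds MIoU by about 12.5 points on Vaihingen and 15.6 points on Aeroscapes, but by only about 2.7 points on Potsdam.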
