Abstract: Semantic segmentation of remote sensing images is a fundamental task in geospatial research. However, the widely used Convolutional Neural Networks (CNNs) and Transformers have notable drawbacks: CNNs may be limited by insufficient capability to model remote sensing scenes, while Transformers face challenges due to computational complexity. In this paper, we propose a remote-sensing image semantic segmentation network named LKASeg, which combines Large Kernel Attention (LKA) and Full-Scale Skip Connections (FSC). Specifically, we propose a decoder based on LKA, which extracts global features while avoiding the computational overhead of self-attention and providing channel adaptability. To achieve full-scale feature learning and fusion, we apply FSC between the encoder and decoder. We conducted experiments by combining the LKA-based decoder with FSC. On the ISPRS Vaihingen dataset, the mF1 and mIoU scores reached 90.33% and 82.77%, respectively.
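For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how mean F1 (mF1) and mean intersection-over-union (mIoU) are commonly derived from a per-class confusion matrix. The function names and the toy 2-class matrix are illustrative assumptions, not taken from the paper, and the abstract does not state which Vaihingen classes are included in the average.

```python
def per_class_scores(confusion):
    """Return (ious, f1s) for a square confusion matrix.

    Rows are ground-truth classes, columns are predicted classes.
    """
    ious, f1s = [], []
    for c in range(len(confusion)):
        tp = confusion[c][c]                          # pixels of class c predicted as c
        fn = sum(confusion[c]) - tp                   # pixels of class c predicted as something else
        fp = sum(row[c] for row in confusion) - tp    # other pixels predicted as c
        iou_denominator = tp + fp + fn
        f1_denominator = 2 * tp + fp + fn
        ious.append(tp / iou_denominator if iou_denominator else 0.0)
        f1s.append(2 * tp / f1_denominator if f1_denominator else 0.0)
    return ious, f1s


def mean(values):
    return sum(values) / len(values)


# Toy confusion matrix (hypothetical numbers, for illustration only).
toy_confusion = [
    [8, 2],
    [1, 9],
]
ious, f1s = per_class_scores(toy_confusion)
print(f"mIoU = {mean(ious):.4f}, mF1 = {mean(f1s):.4f}")
```

Per class, the two scores are tied by F1 = 2·IoU / (1 + IoU), which is why mF1 is always at least as high as mIoU on the same predictions.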

What problem does this paper attempt to address?

This paper addresses several key challenges in the semantic segmentation of remote sensing images:

1. **Limitations of traditional Convolutional Neural Networks (CNNs)**: Widely used convolutional networks such as FCN and UNet have a limited ability to model complex objects in remote sensing images. They tend to lose spatial and boundary information during down-sampling, which leads to inaccurate segmentation results.
2. **Computational complexity of Transformers**: Although a Transformer can capture global information through the self-attention mechanism, its computational complexity is high. When processing high-resolution remote sensing images in particular, the computational and memory costs are very large. In addition, self-attention mainly provides spatial adaptability and ignores channel adaptability.
3. **Scale variation and loss of spatial information**: Objects in remote sensing images vary greatly in scale, and many similar objects are densely arranged, which makes accurate semantic segmentation difficult. Traditional CNN methods have a limited ability to extract features at different scales and tend to lose spatial detail.

To solve these problems, the authors propose a new semantic segmentation network for remote sensing images, **LKASeg**. Its main innovations are:

- **A decoder based on Large Kernel Attention (LKA)**: LKA combines the advantages of convolution and self-attention. It extracts global features at a lower computational cost than self-attention and provides channel adaptability.
- **Full-Scale Skip Connections (FSC)**: FSC establishes dense connections between the encoder and the decoder so that multi-scale features are learned and fused, which addresses scale variation and the loss of spatial information.

With these changes, LKASeg retains the advantages of convolution in its network design while avoiding the drawbacks of the self-attention mechanism. Experimental results show that on the ISPRS Vaihingen dataset, LKASeg reaches an mF1 of 90.33% and an mIoU of 82.77%, an improvement over the baseline model UNetFormer. In summary, the paper introduces LKA and FSC to address insufficient feature extraction, high computational complexity, scale variation, and spatial information loss in existing semantic segmentation methods for remote sensing images, with the goal of more accurate segmentation.
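The claim that LKA is cheaper than self-attention can be illustrated with a rough multiply-accumulate (MAC) count. The sketch below assumes the decomposition used in the original Visual Attention Network formulation of LKA (a 5×5 depthwise convolution, a 7×7 depthwise convolution with dilation 3, a 1×1 convolution, and an element-wise gating multiply); the summary above does not state which kernel sizes LKASeg uses, so these values, the function names, and the 64-channel setting are assumptions for illustration only.

```python
def self_attention_macs(height, width, channels):
    """Approximate MACs for one single-head self-attention layer with every pixel as a token."""
    n = height * width
    projections = 4 * n * channels * channels  # Q, K, V and output projections
    attention = 2 * n * n * channels           # Q @ K^T, then attention weights @ V
    return projections + attention


def lka_macs(height, width, channels, dw_kernel=5, dilated_kernel=7):
    """Approximate MACs for an LKA block under the assumed VAN-style decomposition."""
    n = height * width
    depthwise = n * channels * dw_kernel ** 2               # depthwise conv
    dilated_depthwise = n * channels * dilated_kernel ** 2  # depthwise dilated conv
    pointwise = n * channels * channels                     # 1x1 conv
    gating = n * channels                                   # attention map * input
    return depthwise + dilated_depthwise + pointwise + gating


for side in (32, 64, 128):
    sa = self_attention_macs(side, side, 64)
    lka = lka_macs(side, side, 64)
    print(f"{side}x{side}: self-attention {sa:,} MACs, LKA {lka:,} MACs, ratio {sa / lka:.1f}x")
```

The point of the comparison is the growth rate: the LKA count is linear in the number of pixels, while the attention term grows with its square, so the gap widens as image resolution increases.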

LKASeg: Remote-Sensing Image Semantic Segmentation with Large Kernel Attention and Full-Scale Skip Connections

LSKSANet: A Novel Architecture for Remote Sensing Image Semantic Segmentation Leveraging Large Selective Kernel and Sparse Attention Mechanism

Transformer and CNN Hybrid Deep Neural Network for Semantic Segmentation of Very-High-Resolution Remote Sensing Imagery

An Attention-Fused Network for Semantic Segmentation of Very-High-Resolution Remote Sensing Imagery

Hierarchical Self-Attention Embedded Neural Network With Dense Connection for Remote-Sensing Image Semantic Segmentation

A New Multi-Channel Deep Convolutional Neural Network for Semantic Segmentation of Remote Sensing Image

Multi-scale Spatial Aggregation Network for Remote Sensing Image Segmentation

A Novel Semantic Segmentation Method for High-Resolution Remote Sensing Images Based on Visual Attention Network

Accurate Semantic Segmentation in Remote Sensing Image

A Stage-Adaptive Selective Network with Position Awareness for Semantic Segmentation of LULC Remote Sensing Images

Semantic Segmentation of High-Resolution Remote Sensing Images Using Multiscale Skip Connection Network

Semantic Segmentation With Attention Mechanism for Remote Sensing Images

Dual attention deep fusion semantic segmentation networks of large-scale satellite remote-sensing images

Semantic Segmentation for High-Resolution Aerial Imagery Using Multi-Skip Network and Markov Random Fields

Semi-Supervised Adversarial Semantic Segmentation Network Using Transformer and Multiscale Convolution for High-Resolution Remote Sensing Imagery

High-Resolution Remote Sensing Image Semantic Segmentation via Multiscale Context and Linear Self-Attention

Remote Sensing Image Semantic Segmentation Method Based on a Deep Convolutional Neural Network and Multiscale Feature Fusion

CNN and Transformer Fusion for Remote Sensing Image Semantic Segmentation

An Improved Large Kernel-Based Remote Sensing Land Cover Segmentation Algorithm

MCAFNet: A Multiscale Channel Attention Fusion Network for Semantic Segmentation of Remote Sensing Images

Learnable Gated Convolutional Neural Network for Semantic Segmentation in Remote-Sensing Images