SACANet: scene-aware class attention network for semantic segmentation of remote sensing images

Xiaowen Ma, Rui Che, Tingfeng Hong, Mengting Ma, Ziyan Zhao, Tian Feng, Wei Zhang

2023-04-22

Abstract: The spatial attention mechanism has been widely used in semantic segmentation of remote sensing images given its capability to model long-range dependencies. Many methods that adopt spatial attention aggregate contextual information using direct relationships between pixels within an image, while ignoring the scene awareness of pixels (i.e., being aware of the global context of the scene where the pixels are located and perceiving their relative positions). Given the observation that scene awareness benefits context modeling with spatial correlations of ground objects, we design a scene-aware attention module based on a refined spatial attention mechanism embedding scene awareness. In addition, we present a local-global class attention mechanism to address the problem that general attention mechanisms introduce excessive background noise while hardly considering the large intra-class variance in remote sensing images. In this paper, we integrate both scene-aware and class attention to propose a scene-aware class attention network (SACANet) for semantic segmentation of remote sensing images. Experimental results on three datasets show that SACANet outperforms other state-of-the-art methods and validate its effectiveness. Code is available at https://github.com/xwmaxwma/rssegmentation.

Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper addresses two main issues in semantic segmentation of remote sensing images:

1. **Insufficient scene awareness**: Traditional spatial attention mechanisms capture direct relationships between pixels but overlook each pixel's scene awareness, i.e., the global context of the scene in which the pixel lies and its relative position within that scene. To address this, the authors propose a scene-aware attention (SAA) module, built on a refined spatial attention mechanism that embeds scene awareness and thereby exploits the spatial correlation of ground objects.
2. **Large intra-class variance and background noise**: Remote sensing images typically have complex backgrounds and significant intra-class differences. Because conventional attention mechanisms compute dense pixel-to-pixel similarities, they tend to introduce excessive background noise and struggle to handle intra-class variability. To address this, the authors introduce a local-global class attention mechanism, which associates pixels with global class representations by using local class representations as intermediaries, giving efficient and accurate class-level context modeling.

Combining the two, the authors propose SACANet, a network that integrates scene-aware attention and class attention to improve semantic segmentation of remote sensing images. Experimental results show that SACANet outperforms existing state-of-the-art methods on three benchmark datasets and achieves a good balance between accuracy and efficiency.
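The idea of reaching global class representations through local ones can be illustrated with a small NumPy sketch. This is only one plausible reading of the description above, not the authors' implementation: the function names, the probability-weighted averaging used to build class representations, the non-overlapping `patch` grid, and the scaled dot-product affinity are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def class_centers(feats, probs, eps=1e-6):
    """Probability-weighted mean feature per class (assumed construction).

    feats: (N, C) pixel features; probs: (N, K) coarse class probabilities.
    Returns a (K, C) array with one representation per class.
    """
    weights = probs / (probs.sum(axis=0, keepdims=True) + eps)  # (N, K)
    return weights.T @ feats  # (K, C)


def local_global_class_attention(feats, probs, patch):
    """Hypothetical local-global class attention.

    feats: (H, W, C) float features; probs: (H, W, K) coarse predictions;
    patch: side length of the non-overlapping local windows.
    """
    H, W, C = feats.shape
    K = probs.shape[-1]
    # Global class representations come from the whole image.
    global_c = class_centers(feats.reshape(-1, C), probs.reshape(-1, K))
    out = np.empty((H, W, C), dtype=float)
    for i in range(0, H, patch):
        for j in range(0, W, patch):
            block = feats[i:i + patch, j:j + patch]
            h, w = block.shape[:2]
            f = block.reshape(-1, C)
            p = probs[i:i + patch, j:j + patch].reshape(-1, K)
            # Local class representations come from this window only.
            local_c = class_centers(f, p)  # (K, C)
            # Each pixel is compared with K local class representations,
            # not with every other pixel, so the affinity is (n, K).
            affinity = softmax(f @ local_c.T / np.sqrt(C), axis=-1)
            # The affinity then weights the global class representations.
            out[i:i + patch, j:j + patch] = (affinity @ global_c).reshape(h, w, C)
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.standard_normal((8, 8, 16))
    probs = softmax(rng.standard_normal((8, 8, 5)), axis=-1)
    context = local_global_class_attention(feats, probs, patch=4)
    print(context.shape)  # (8, 8, 16)
```

In this sketch each pixel is scored against only K class representations instead of all H×W pixels, which is one way to read the summary's claim that class-level attention avoids the background noise of dense similarity computation.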
