Abstract: Fully convolutional neural networks are powerful end-to-end models that are widely used in semantic segmentation and have achieved great success, and researchers have proposed a series of methods based on them. However, repeated subsampling through convolution and pooling loses image contextual information, which degrades pixel-level classification. To address this context loss, a pixel-based attention method is proposed. It computes the relationships between pixels of the high-level feature map to obtain global information and strengthen the correlation between pixels, and it is combined with atrous spatial pyramid pooling to further extract image feature information. To address pixel loss in the high-level feature map, an attention method based on different levels of the image is proposed. This method uses the information in the high-level feature map as a guide to mine hidden information in the low-level feature map, then fuses the result with the high-level feature map so that the information at both levels is fully used. In the experiments, the effectiveness of the proposed method is verified by comparing the effects of different modules on the segmentation results of a fully convolutional neural network. Experiments are also carried out on Cityscapes, a widely recognized semantic segmentation dataset, and the results are compared with current advanced networks. The results show that the proposed method has advantages in both objective evaluation metrics and subjective visual quality, achieving 69.3% accuracy on the official Cityscapes test set, which is 3 to 5 percentage points higher than several recent advanced networks.
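The pixel-based attention idea can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption for illustration: the function name `pixel_attention`, the use of plain dot products as the pixel-to-pixel relationship, the softmax normalization, and the residual addition. A real network would normally apply learned projections before computing the affinities, and would operate on tensors rather than Python lists.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def pixel_attention(features):
    """Hypothetical pixel-to-pixel attention over a flattened feature map.

    features: list of N pixel vectors (each a list of C floats), i.e. an
    H*W x C high-level feature map. Returns (attended features, weights).
    """
    # Affinity between every pair of pixels (dot product of feature vectors).
    scores = [[sum(a * b for a, b in zip(p, q)) for q in features]
              for p in features]
    # Normalize each row so a pixel's weights over all pixels sum to 1.
    weights = [softmax(row) for row in scores]

    attended = []
    for p, w in zip(features, weights):
        # Global context: attention-weighted sum over all pixel features.
        context = [sum(wj * q[c] for wj, q in zip(w, features))
                   for c in range(len(p))]
        # Residual addition keeps the original local feature (an assumption).
        attended.append([x + y for x, y in zip(p, context)])
    return attended, weights


# Toy 2x2 feature map with 2 channels, flattened to 4 pixels.
feature_map = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
attended, weights = pixel_attention(feature_map)
```

In this toy input, pixels with similar feature vectors receive larger mutual weights, so each output pixel carries context from the whole map rather than only its local neighborhood.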

AMNet: Convolutional Neural Network embeded with Attention Mechanism for Semantic Segmentation

EHANet: Efficient Hybrid Attention Network Towards Real-time Semantic Segmentation

ACNET: Attention Based Network to Exploit Complementary Features for RGBD Semantic Segmentation

Semantic segmentation of remote sensing images combined with attention mechanism and feature enhancement U-Net

MEDANet: More Efficient Dual Attention Network for Scene Segmentation

Improved U-NET Semantic Segmentation Network

Embedded Attention Network for Semantic Segmentation

Attention-Guided Network for Semantic Video Segmentation

Multi-level Spatial Attention Network for Image Data Segmentation

MSANet: an Improved Semantic Segmentation Method Using Multi-Scale Attention for Remote Sensing Images

EANET: Efficient Attention-Augmented Network for Real-Time Semantic Segmentation

AANet: Adaptive Attention Networks for Semantic Segmentation of High-Resolution Remote Sensing Imagery

DSANet: Dilated Spatial Attention for Real-Time Semantic Segmentation in Urban Street Scenes

Adaptive multi-scale dual attention network for semantic segmentation

Dual Path Attention Net For Remote Sensing Semantic Image Segmentation

PPNet: Pooling Position Attention Network for Semantic Segmentation

Fully Convolutional Neural Network with Attention Module for Semantic Segmentation

ELANet: Effective Lightweight Attention-Guided Network for Real-Time Semantic Segmentation

Semantic Segmentation Based on Deeplabv3+ and Attention Mechanism

AEDN: Encoder-Decoder Network with Attention for Semantic Image Segmentation

Cross-CBAM: A Lightweight network for Scene Segmentation