Abstract: Image semantic segmentation is a key technology that allows intelligent systems to understand natural scenes. As an important research direction in visual intelligence, it has a wide range of applications in mobile robots, drones, and intelligent driving. In practice, however, problems arise such as inaccurate prediction of semantic labels and loss of edge information between segmented objects and the background. This paper proposes an improved semantic segmentation network that combines a self-attention module with neural architecture search (NAS). The method first uses NAS to find a semantic segmentation network with multiple resolution branches. During the search, the candidate network structure is adjusted by incorporating the self-attention module, and the networks found for the different branches are then combined into two semantic segmentation models of different complexity. Finally, the two models are integrated following the commonly used teacher–student framework. The input image first passes through the high-complexity model to obtain more accurate parameters, which influence the training weights of the student network; the image is then passed through the low-complexity model to obtain the final prediction. Experiments on the Cityscapes dataset show that the algorithm reaches an accuracy of 69.8 % and an inference speed of 166.4 FPS, with an actual image segmentation speed of 48/s. The method improves edge segmentation, performs better in complex scenes, and achieves a good balance between real-time performance and accuracy in practical applications.
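The teacher–student step described in the abstract can be illustrated with a small sketch. The abstract does not state which loss ties the two models together, so the code below assumes a standard temperature-softened distillation loss (a KL term between teacher and student class distributions plus a cross-entropy term against the ground-truth label), computed per pixel. The function names and the `temperature` and `alpha` parameters are hypothetical and chosen for illustration only; this is not the paper's implementation.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over one pixel's class logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_entropy(probs, label, eps=1e-12):
    """Negative log-likelihood of the ground-truth class."""
    return -math.log(probs[label] + eps)


def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Average per-pixel loss (assumed form, not taken from the paper):
    alpha * T^2 * KL(teacher || student) + (1 - alpha) * CE(student, label).

    Each argument is a list with one entry per pixel; logits entries are
    lists of per-class scores, labels entries are class indices.
    """
    total = 0.0
    for s, t, y in zip(student_logits, teacher_logits, labels):
        soft_teacher = softmax(t, temperature)
        soft_student = softmax(s, temperature)
        kd = kl_divergence(soft_teacher, soft_student) * temperature ** 2
        ce = cross_entropy(softmax(s), y)
        total += alpha * kd + (1 - alpha) * ce
    return total / len(labels)


if __name__ == "__main__":
    # Two pixels, three classes; values are made up for illustration.
    teacher = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
    student = [[1.0, 0.8, -0.5], [0.5, 0.1, 2.0]]
    print(distillation_loss(student, teacher, labels=[0, 2]))
```

Under this assumed formulation only the low-complexity student is needed at inference time, which is consistent with the abstract's emphasis on real-time speed; the teacher contributes solely through the distillation term during training.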

Semantic Segmentation Network with Multi-Path Structure, Attention Reweighting and Multi-Scale Encoding

Deep Dual-Stream Network with Scale Context Selection Attention Module for Semantic Segmentation

Adaptive multi-scale dual attention network for semantic segmentation

LMANet: A Lightweight Asymmetric Semantic Segmentation Network Based on Multi-Scale Feature Extraction

A Multi-Step Fusion Network for Semantic Segmentation of High-Resolution Aerial Images

M-FasterSeg: An efficient semantic segmentation network based on neural architecture search

Semantic Segmentation Network Based on Adaptive Attention and Deep Fusion Utilizing a Multi-Scale Dilated Convolutional Pyramid

An Attention-Fused Network for Semantic Segmentation of Very-High-Resolution Remote Sensing Imagery

Multi-Level Aggregation and Recursive Alignment Architecture for Efficient Parallel Inference Segmentation Network

Semantic segmentation using cross-stage feature reweighting and efficient self-attention

Cross-Scale Feature Propagation Network for Semantic Segmentation of High-Resolution Remote Sensing Images

Semantic Image Segmentation with Improved Position Attention and Feature Fusion

Based on cross-scale fusion attention mechanism network for semantic segmentation for street scenes

A Deep Semantic Segmentation Network with Semantic and Contextual Refinements

Multi-scale Network with Attentional Multi-resolution Fusion for Point Cloud Semantic Segmentation

BMSeNet: Multiscale Context Pyramid Pooling and Spatial Detail Enhancement Network for Real-Time Semantic Segmentation

DARSegNet: A Real-Time Semantic Segmentation Method Based on Dual Attention Fusion Module and Encoder-Decoder Network

Real-Time Semantic Segmentation via an Efficient Multi-Column Network

Dual-Path Geometry-Aware Network for Semantic Segmentation of High-Resolution Aerial Images

A Fast Attention-Guided Hierarchical Decoding Network for Real-Time Semantic Segmentation

Attention Guided Global Enhancement and Local Refinement Network for Semantic Segmentation