Abstract: Semantic segmentation is a core computer vision task for applications such as autonomous driving and intelligent surveillance, yet balancing real-time performance against segmentation accuracy remains a significant challenge. Fast-SCNN is favored for its efficiency and low computational complexity, but it still struggles with complex street scene images. To address this, this paper presents an improved Fast-SCNN that aims to raise segmentation accuracy while preserving efficiency by incorporating a novel attention mechanism and an enhanced feature extraction module. First, an integrated SimAM (Simple, Parameter-Free Attention Module) increases the network's sensitivity to critical regions of the image and reweights features across channels without adding parameters. Second, a refined pyramid pooling module in the global feature extraction module uses finer pooling levels to capture a broader range of contextual information. Third, in the feature fusion stage, an enhanced DAB (Depthwise Asymmetric Bottleneck) block and SE (Squeeze-and-Excitation) attention improve the network's ability to process multi-scale information. Finally, the classifier module is extended with deeper and more complex convolutional structures, which further improves model performance. Together, these changes improve both the model's ability to capture details and its overall segmentation performance. Experimental results on complex street scene images show a mean Intersection over Union (mIoU) of 71.7% on Cityscapes and 69.4% on CamVid, at inference speeds of 81.4 fps and 113.6 fps, respectively. These results indicate that the proposed model improves segmentation quality in complex street scenes while retaining real-time processing capability.
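The abstract does not say how SimAM is wired into the network, so the following is only a minimal sketch of the parameter-free weighting as it is usually formulated for SimAM, applied to a single feature-map channel in pure Python. The function names `simam_weights` and `simam_channel` are hypothetical, and the regularizer `lam=1e-4` is an assumed default rather than a value taken from this paper.

```python
import math


def simam_weights(x, lam=1e-4):
    """Return SimAM-style attention weights for one channel (a 2D list of floats).

    Assumption: lam is a small regularizer; 1e-4 is an illustrative default.
    """
    flat = [v for row in x for v in row]
    if len(flat) < 2:
        raise ValueError("channel needs at least two positions")
    n = len(flat) - 1                      # number of "other" positions
    mean = sum(flat) / len(flat)
    sq_dev = [(v - mean) ** 2 for v in flat]
    var = sum(sq_dev) / n                  # channel variance estimate
    width = len(x[0])
    weights = []
    for i, d in enumerate(sq_dev):
        # Inverse energy: positions far from the channel mean score higher.
        inv_energy = d / (4.0 * (var + lam)) + 0.5
        w = 1.0 / (1.0 + math.exp(-inv_energy))   # sigmoid
        if i % width == 0:
            weights.append([])
        weights[-1].append(w)
    return weights


def simam_channel(x, lam=1e-4):
    """Scale each activation by its attention weight; no learned parameters."""
    w = simam_weights(x, lam)
    return [[v * wv for v, wv in zip(row, wrow)] for row, wrow in zip(x, w)]


if __name__ == "__main__":
    channel = [[0.0, 0.0, 0.0],
               [0.0, 8.0, 0.0],
               [0.0, 0.0, 0.0]]
    print(simam_weights(channel))
    print(simam_channel(channel))
```

In a full network the same computation would run independently on every channel of a feature tensor. Because the weights depend only on per-channel statistics, the module adds no trainable parameters.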

EHANet: Efficient Hybrid Attention Network Towards Real-time Semantic Segmentation

Research on Efficient Asymmetric Attention Module for Real-Time Semantic Segmentation Networks in Urban Scenes

Real-Time Semantic Segmentation With Fast Attention

ELANet: Effective Lightweight Attention-Guided Network for Real-Time Semantic Segmentation

Hybrid Dilated Convolution Network Using Attentive Kernels for Real-Time Semantic Segmentation

A Fast Attention-Guided Hierarchical Decoding Network for Real-Time Semantic Segmentation

DARSegNet: A Real-Time Semantic Segmentation Method Based on Dual Attention Fusion Module and Encoder-Decoder Network

Embedded Attention Network for Semantic Segmentation

EMFANet: a lightweight network with efficient multi-scale feature aggregation for real-time semantic segmentation

Using Channel-Wise Attention for Deep CNN Based Real-Time Semantic Segmentation With Class-Aware Edge Information

ELANet: an efficiently lightweight asymmetrical network for real-time semantic segmentation

AM‐MulFSNet: A Fast Semantic Segmentation Network Combining Attention Mechanism and Multi‐branch

MEDANet: More Efficient Dual Attention Network for Scene Segmentation

HSNet: an Intelligent Hierarchical Semantic-Aware Network System for Real-Time Semantic Segmentation

Real-Time Semantic Segmentation Algorithm for Street Scenes Based on Attention Mechanism and Feature Fusion

Efficient Dense Modules of Asymmetric Convolution for Real-Time Semantic Segmentation

AttaNet: Attention-Augmented Network for Fast and Accurate Scene Parsing

Real-Time High-Performance Semantic Image Segmentation of Urban Street Scenes

Attention based lightweight asymmetric network for real-time semantic segmentation

M-FasterSeg: An efficient semantic segmentation network based on neural architecture search

Hierarchical Shared Architecture Search for Real-Time Semantic Segmentation of Remote Sensing Images