Abstract: Semantic segmentation is a core computer vision task for applications such as autonomous driving and intelligent surveillance, yet balancing real-time performance against segmentation accuracy remains a significant challenge. Fast-SCNN is favored for its efficiency and low computational complexity, but it still struggles with complex street scene images. To address this issue, this paper presents an improved Fast-SCNN that aims to enhance the accuracy and efficiency of semantic segmentation by incorporating a novel attention mechanism and an enhanced feature extraction module. First, the integrated SimAM (Simple, Parameter-Free Attention Module) increases the network's sensitivity to critical regions of the image and adjusts feature weights across channels. Second, a refined pyramid pooling module in the global feature extraction module captures a broader range of contextual information through its adjusted pooling levels. Third, in the feature fusion stage, an enhanced DAB (Depthwise Asymmetric Bottleneck) block and SE (Squeeze-and-Excitation) attention improve the network's ability to process multi-scale information. Finally, the classifier module is extended with deeper and more complex convolutional structures, which further improves model performance. Together, these enhancements improve both the model's ability to capture details and its overall segmentation performance. Experimental results on complex street scene images show that the proposed method achieves a mean Intersection over Union (mIoU) of 71.7% on Cityscapes and 69.4% on CamVid, at inference speeds of 81.4 fps and 113.6 fps, respectively. These results indicate that the proposed model improves segmentation quality in complex street scenes while retaining real-time processing capability.
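To make the parameter-free nature of SimAM concrete, the following is a minimal sketch of the reweighting step as it is commonly formulated: each activation is scaled by a sigmoid of its inverse energy, computed from the per-channel mean and variance. This is an illustration only, not the authors' implementation; the function names `simam_channel` and `simam`, the plain nested-list representation of a feature map, and the regularization value `lam=1e-4` are assumptions made for this example.

```python
import math


def simam_channel(x, lam=1e-4):
    """Apply SimAM-style reweighting to one channel (a 2D list of floats).

    `lam` is an assumed regularization constant; no learnable parameters are used.
    """
    h, w = len(x), len(x[0])
    if h * w < 2:
        raise ValueError("channel needs at least two activations")
    n = h * w - 1  # number of other activations in the channel
    mu = sum(sum(row) for row in x) / (h * w)
    # Squared deviation of every activation from the channel mean.
    d = [[(v - mu) ** 2 for v in row] for row in x]
    var = sum(sum(row) for row in d) / n
    out = []
    for x_row, d_row in zip(x, d):
        out_row = []
        for v, dv in zip(x_row, d_row):
            # Inverse energy: larger for activations that stand out from the mean.
            inv_energy = dv / (4.0 * (var + lam)) + 0.5
            gate = 1.0 / (1.0 + math.exp(-inv_energy))  # sigmoid
            out_row.append(v * gate)
        out.append(out_row)
    return out


def simam(feature_map, lam=1e-4):
    """Apply the reweighting independently to each channel of a C x H x W list."""
    return [simam_channel(channel, lam) for channel in feature_map]
```

In this sketch, an activation that differs strongly from the rest of its channel receives a gate closer to 1, while activations near the channel mean are damped more, which is one way to read the abstract's "sensitivity to critical regions".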
