An Enhanced Encoder-Decoder Network Architecture for Reducing Information Loss in Image Semantic Segmentation

Zijun Gao, Qi Wang, Taiyuan Mei, Xiaohan Cheng, Yun Zi, Haowei Yang

2024-05-26

Abstract: The traditional SegNet architecture commonly encounters significant information loss during the sampling process, which detrimentally affects its accuracy in image semantic segmentation tasks. To counter this challenge, we introduce an innovative encoder-decoder network structure enhanced with residual connections. Our approach employs a multi-residual connection strategy designed to preserve the intricate details across various image scales more effectively, thus minimizing the information loss inherent to down-sampling procedures. Additionally, to enhance the convergence rate of network training and mitigate sample imbalance issues, we have devised a modified cross-entropy loss function incorporating a balancing factor. This modification optimizes the distribution between positive and negative samples, thus improving the efficiency of model training. Experimental evaluations of our model demonstrate a substantial reduction in information loss and improved accuracy in semantic segmentation. Notably, our proposed network architecture demonstrates a substantial improvement in the finely annotated mean Intersection over Union (mIoU) on the dataset compared to the conventional SegNet. The proposed network structure not only reduces operational costs by decreasing manual inspection needs but also scales up the deployment of AI-driven image analysis across different sectors.
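The abstract does not give the exact form of the modified loss, so the following is only a minimal sketch of one common way to add a balancing factor to binary cross-entropy. It assumes a per-pixel weight `beta` on the positive term and `1 - beta` on the negative term, with `beta` defaulting to the fraction of negative pixels so that the rarer class is weighted up. The function name, the parameter names, and this default are illustrative assumptions, not the authors' definition.

```python
import math


def balanced_cross_entropy(probs, labels, beta=None, eps=1e-7):
    """Mean class-balanced binary cross-entropy over a flat list of pixels.

    probs  -- predicted foreground probabilities, each in [0, 1]
    labels -- ground-truth labels, each 0 (negative) or 1 (positive)
    beta   -- weight on the positive term (assumption: defaults to the
              fraction of negative pixels, so rare positives count more)
    eps    -- clamp to keep log() away from 0
    """
    if not probs or len(probs) != len(labels):
        raise ValueError("probs and labels must be non-empty and equal length")

    if beta is None:
        # Hypothetical default: weight positives by how common negatives are.
        beta = 1.0 - sum(labels) / len(labels)

    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        positive_term = beta * y * math.log(p)
        negative_term = (1.0 - beta) * (1 - y) * math.log(1.0 - p)
        total -= positive_term + negative_term
    return total / len(probs)


if __name__ == "__main__":
    # One positive pixel among four: beta becomes 0.75.
    print(balanced_cross_entropy([0.5, 0.5, 0.5, 0.5], [1, 0, 0, 0]))
```

With `beta = 0.5` this reduces to half of the ordinary binary cross-entropy, which makes the effect of the balancing factor easy to compare against the unweighted loss.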

Image and Video Processing, Computer Vision and Pattern Recognition

What problem does this paper attempt to address?

The paper aims to address the significant information loss in traditional SegNet architectures for image semantic segmentation. Specifically, it points out that traditional SegNet loses a substantial amount of information during downsampling in the encoding stage, which reduces segmentation accuracy. To tackle this, the authors propose an encoder-decoder network structure that introduces residual connections to reduce information loss and improve segmentation accuracy. The main improvements are:

1. **Multi-Residual Connection Strategy**: This strategy helps retain more detailed information at different image scales, thereby enhancing the network's ability to perform accurate segmentation.
2. **Improved Cross-Entropy Loss Function**: To speed up convergence during training and address sample imbalance, the authors designed a modified cross-entropy loss function that includes a balancing factor. It adjusts the relative weighting of positive and negative samples, improving the efficiency of model training.

Experimental results show that the proposed network architecture both reduces information loss and improves the accuracy of semantic segmentation. In particular, it outperforms traditional SegNet on the mean Intersection over Union (mIoU) metric. The method also supports wider deployment of AI-driven image analysis across various fields by reducing the need for manual inspection.
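For readers unfamiliar with the mIoU metric mentioned above, here is a minimal sketch of how it is typically computed: for each class, the number of pixels where prediction and ground truth agree is divided by the number of pixels assigned to that class by either, and the per-class ratios are averaged. The function name and the choice to skip classes that appear in neither input are assumptions for illustration; the paper's exact evaluation protocol is not described in this summary.

```python
def mean_iou(pred, target, num_classes):
    """Mean Intersection over Union for flat lists of integer class labels.

    pred, target -- equal-length sequences of class ids in [0, num_classes)
    Classes absent from both pred and target are skipped (an assumption;
    some evaluation protocols handle this case differently).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Pixel is in both the predicted and true region of class p.
            intersection[p] += 1
            union[p] += 1
        else:
            # Pixel is in the predicted region of p and the true region of t.
            union[p] += 1
            union[t] += 1

    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)


if __name__ == "__main__":
    # Class 0: IoU = 1/2, class 1: IoU = 2/3, so mIoU = 7/12.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

Because every class contributes equally to the average, a model that ignores small or rare classes is penalized more heavily under mIoU than under plain pixel accuracy.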

High-Resolution Remote Sensing Image Semantic Segmentation Method Based on Improved Encoder-Decoder Convolutional Neural Network

IIE-SegNet: Deep Semantic Segmentation Network With Enhanced Boundary Based on Image Information Entropy

SegNet Network Architecture for Deep Learning Image Segmentation and Its Integrated Applications and Prospects

Semantic Image Segmentation with Improved Position Attention and Feature Fusion

Attention Guided Global Enhancement and Local Refinement Network for Semantic Segmentation

DSNet: Multi-resolution Dense Encoder and Stack Decoder Network for Aerial Image Segmentation

Image Semantic Segmentation Using Improved ENet Network

DESENet: a bilateral network with detail-enhanced semantic encoder for real-time semantic segmentation

Semantic segmentation for remote sensing images via dense feature extraction and companion loss neural network

LEDNet: A Lightweight Encoder-Decoder Network for Real-Time Semantic Segmentation

Image semantic segmentation based on improved DeepLabv3+ network and superpixel edge optimization

LMANet: A Lightweight Asymmetric Semantic Segmentation Network Based on Multi-Scale Feature Extraction

Narrowing the semantic gaps in U-Net with learnable skip connections: The case of medical image segmentation

SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation

Lightweight and Progressively-Scalable Networks for Semantic Segmentation

An Improved Encoder–Decoder Network for Ore Image Segmentation

Spatial-Assistant Encoder-Decoder Network for Real Time Semantic Segmentation

Edge-Enhanced GCIFFNet: A Multiclass Semantic Segmentation Network Based on Edge Enhancement and Multiscale Attention Mechanism