EMFANet: a lightweight network with efficient multi-scale feature aggregation for real-time semantic segmentation

Xuegang Hu, Yan Ke
DOI: https://doi.org/10.1007/s11554-024-01421-z
IF: 2.293
2024-02-29
Journal of Real-Time Image Processing
Abstract: In recent years, real-time semantic segmentation has become a research focus for latency-sensitive applications such as autonomous driving. Large deep models segment accurately, but their slow inference and high complexity make them difficult to deploy in practice. To address these problems, this paper proposes EMFANet, a lightweight network with efficient multi-scale feature aggregation for real-time semantic segmentation, built on an encoder–decoder framework with an efficient channel attention mechanism. EMFANet introduces an effective symmetric attention residual unit (SARU) to rapidly capture rich multi-scale contextual information, and a lightweight multi-scale information aggregation unit (MIAU) to fuse multi-scale features efficiently. On the Cityscapes test set, EMFANet achieves 72.1% mean intersection over union (mIoU) at 143 FPS with only 1.03 M parameters. It is also competitive on the low-resolution CamVid test set, where it runs at 357 FPS. EMFANet thus achieves a strong balance between segmentation accuracy, inference speed, and model size.
Computer Science, Artificial Intelligence; Engineering, Electrical & Electronic; Imaging Science & Photographic Technology
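
The abstract reports accuracy as mean intersection over union (mIoU). As a rough illustration of what that metric measures, the sketch below computes mIoU from flat sequences of predicted and ground-truth class labels. It is a minimal sketch, not the authors' evaluation code: the function name `mean_iou`, the `ignore_index=255` default, and the choice to average only over classes that actually occur are all assumptions made for this example.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over flat label sequences (illustrative sketch only).

    pred, target: equal-length sequences of integer class labels.
    Pixels whose ground-truth label equals ignore_index are skipped
    (an assumed convention, not taken from the paper).
    """
    intersection = [0] * num_classes  # true positives per class
    union = [0] * num_classes         # TP + FP + FN per class

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            # Correct pixel: counts toward both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: false positive for p, false negative for t.
            union[p] += 1
            union[t] += 1

    # Average only over classes that appear in pred or target.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with three classes and one ignored pixel.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
```

In this toy example the per-class IoUs are 1/2, 2/3, and 1, so `score` is their average. Benchmark evaluations typically accumulate the intersection and union counts over the whole test set before dividing, rather than averaging per-image scores.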