DARSegNet: A Real-Time Semantic Segmentation Method Based on Dual Attention Fusion Module and Encoder-Decoder Network

Yongfeng Xing, Luo Zhong, Xian Zhong
DOI: https://doi.org/10.1155/2022/6195148
IF: 1.43
2022-06-08
Mathematical Problems in Engineering
Abstract: Convolutional neural networks achieve excellent semantic segmentation results on manually annotated datasets with complex scenes. However, semantic segmentation methods still suffer from several problems, such as low feature utilization, high computational complexity, and a gap from practical real-time application, which make image semantic segmentation challenging. Two factors are critical to the semantic segmentation task: global context and multilevel semantics. However, obtaining both typically incurs high complexity. To address this, we propose a novel structure, the dual attention fusion module (DAFM), which eliminates structural redundancy. Unlike most existing algorithms, we combine the attention mechanism with the depth pyramid pool module (DPPM) to extract accurate dense features for pixel labeling, rather than relying on complex expansion convolution. Specifically, we introduce a DPPM that applies a spatial pyramid structure to the output and combines it with global pooling. The DAFM is introduced in each decoder layer. Finally, the low-level and high-level features are fused to obtain the semantic segmentation result. Experiments and visualization results on the Cityscapes and CamVid datasets show that, for real-time semantic segmentation, the method achieves a satisfactory balance between accuracy and speed, which demonstrates the effectiveness of the proposed algorithm. In particular, on a single 1080 Ti GPU, the method with ResNet-18 achieves 75.53% mIoU at 70 FPS on Cityscapes and 73.96% mIoU at 109 FPS on CamVid.
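The accuracy figures in the abstract are reported as mean intersection over union (mIoU). As a minimal sketch of how that metric is commonly computed, and not the authors' evaluation code, the function below averages per-class IoU over flattened label lists. The function name `mean_iou`, its parameters, and the toy labels are illustrative assumptions; benchmarks such as Cityscapes usually accumulate the counts over the whole test set rather than over a single image.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are flat sequences of integer class labels of equal
    length (hypothetical inputs for illustration). Pixels whose target
    label equals ignore_index are skipped.
    """
    intersection = [0] * num_classes  # true positives per class
    union = [0] * num_classes         # TP + FP + FN per class
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            union[p] += 1  # false positive for the predicted class
            union[t] += 1  # false negative for the true class
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with three classes and six pixels (made-up labels).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.4f}")
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. Classes absent from both prediction and ground truth are left out of the average, which is one common convention; other implementations may treat them differently.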
Engineering, Multidisciplinary; Mathematics, Interdisciplinary Applications