A Lightweight Network with Attention Decoder for Real-Time Semantic Segmentation

Wang Kang,Yang Jinfu,Yuan Shuai,Li Mingai
DOI: https://doi.org/10.1007/s00371-021-02115-4
IF: 2.835
2021-01-01
The Visual Computer
Abstract:As an important task in scene understanding, semantic segmentation requires a large amount of computation to achieve high performance. In recent years, with the rise of autonomous systems, it is crucial to make a trade-off in terms of accuracy and speed. In this paper, we propose a novel asymmetric encoder–decoder network structure to address this problem. In the encoder, we design a Separable Asymmetric Module, which combines depth-wise separable asymmetric convolution with dilated convolution to greatly reduce computation cost while maintaining accuracy. On the other hand, an attention mechanism is also used in the decoder to further improve segmentation performance. Experimental results on CityScapes and CamVid datasets show that the proposed method can achieve a better balance between segmentation precision and speed compared with state-of-the-art semantic segmentation methods. Specifically, our model obtains mean IoU of 72.5% and 66.3% on CityScapes and CamVid test dataset, respectively, with less than 1M parameters.
What problem does this paper attempt to address?