Cross Guided and Pyramid Aggregation Networks for Real-time Semantic Segmentation

Yuhang Liao, Lianghua He, Yingjun Deng, Lei Gu, Longsheng Wei
DOI: https://doi.org/10.1109/cac57257.2022.10055986
2022-01-01
Abstract: Deep neural networks have recently driven a significant leap forward in image semantic segmentation. Still, high accuracy tends to rely on rich contextual information and high-resolution spatial detail, both of which incur high computational costs. Balancing accuracy against efficiency is a challenging problem, particularly in road scenes that require low latency. This paper presents a lightweight model for real-time semantic segmentation, which we refer to as CPANet. Specifically, to extract multi-scale information from low-resolution features, we use a cascaded Deep Aggregation Pyramid Pooling Module (DAPPM) structure to enrich the feature representation. In addition, to further improve efficiency, we gradually reduce the number of channels during decoding and use a Cross Guided Module (CGM) to aggregate feature information from different levels. Finally, we adopt an auxiliary training strategy, which improves the network’s performance without additional inference overhead. Extensive evaluations show that our method achieves higher accuracy than other methods. On the Cityscapes test set, the proposed models achieve 72.5% mIoU at 234.5 FPS and 78.0% mIoU at 84.55 FPS on an NVIDIA GTX 1080Ti.
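The accuracy figures in the abstract are reported as mIoU (mean intersection over union). As a reading aid, the following is a minimal sketch of how that metric is commonly computed from per-pixel labels. It is not the authors' evaluation code: the function name `mean_iou`, the flat-list input format, and the `ignore_index=255` default (a common convention for unlabeled Cityscapes pixels) are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes, given flat sequences of integer class labels.

    Hypothetical helper for illustration; ignore_index=255 is an assumed
    convention for pixels excluded from evaluation.
    """
    # confusion[t][p] counts pixels whose ground-truth class is t
    # and whose predicted class is p.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip pixels excluded from evaluation
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp  # class c pixels predicted as another class
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp  # other pixels predicted as c
        denom = tp + fp + fn
        if denom > 0:  # leave out classes absent from both prediction and target
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    # Toy example with two classes and four pixels.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. Benchmark evaluations usually accumulate the confusion matrix over the whole test set before taking the per-class ratios, rather than averaging per-image scores.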