Multifeature Selective Fusion Network for Real-Time Driving Scene Parsing

Yu Pei, Bin Sun, Shutao Li
DOI: https://doi.org/10.1109/TIM.2021.3070611
IF: 5.6
2021-01-01
IEEE Transactions on Instrumentation and Measurement
Abstract: Real-time driving scene parsing using semantic segmentation is an essential yet challenging task for an autonomous driving system, where both efficiency and accuracy need to be considered simultaneously. In this article, we propose an efficient and high-performance deep neural network called feature selective fusion network (FSFnet) for robust semantic segmentation of road scenes. Since parsing complex driving scenes usually requires fusing features at different levels or scales, we propose a feature selective fusion module (FSFM) that adaptively merges these features by generating correlated weight maps along both the spatial and channel dimensions. Furthermore, a multiscale context enhancement module is designed based on an asymmetric nonlocal neural network to aggregate both multiscale and global context information. The proposed FSFnet obtains precise segmentation results in real time on the Cityscapes and CamVid data sets. Specifically, the architecture achieves 77.1% mean pixel intersection-over-union (mIoU) on the Cityscapes test set at a speed of 53 frames per second (FPS) for a 1024 × 2048 input and 75.1% mIoU on the CamVid test set at a speed of 123 FPS for a 960 × 720 input on a single NVIDIA 2080 Ti GPU.
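The accuracy figures above are reported as mIoU, so a small illustration of how that metric is commonly computed may help. The following is a minimal sketch of the generic definition, not the authors' evaluation code: the function name `mean_iou`, the flattened-list input, and the `ignore_index=255` default are assumptions made for this example. Benchmarks typically accumulate the per-class counts over the whole test set rather than over a single image.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int,
             ignore_index: int = 255) -> float:
    """Mean per-class intersection-over-union for flattened label sequences.

    `ignore_index` (an assumed convention here) marks unlabeled pixels that
    are excluded from the counts.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # pixels where prediction == ground truth
    pred_count = [0] * num_classes     # pixels predicted as each class
    target_count = [0] * num_classes   # pixels labeled as each class

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not count toward any class
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[t] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(intersection[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: a 2 x 3 image flattened row by row, with 3 classes.
prediction = [0, 0, 1, 1, 2, 2]
ground_truth = [0, 1, 1, 1, 2, 0]
miou = mean_iou(prediction, ground_truth, num_classes=3)
print(f"mIoU = {miou:.3f}")
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, so their mean is 0.5.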