Small Object Augmentation of Urban Scenes for Real-Time Semantic Segmentation

Zhengeng Yang, Hongshan Yu, Mingtao Feng, Wei Sun, Xuefei Lin, Mingui Sun, Zhi-Hong Mao, Ajmal Mian
DOI: https://doi.org/10.1109/TIP.2020.2976856
IF: 10.6
2020-01-01
IEEE Transactions on Image Processing
Abstract: Semantic segmentation is a key step in scene understanding for autonomous driving. Although deep learning has significantly improved the segmentation accuracy, current high-quality models such as PSPNet and DeepLabV3 are inefficient given their complex architectures and reliance on multi-scale inputs. Thus, it is difficult to apply them to real-time or practical applications. On the other hand, existing real-time methods cannot yet produce satisfactory results on small objects such as traffic lights, which are imperative to safe autonomous driving. In this paper, we improve the performance of real-time semantic segmentation from two perspectives, methodology and data. Specifically, we propose a real-time segmentation model coined Narrow Deep Network (NDNet) and build a synthetic dataset by inserting additional small objects into the training images. The proposed method achieves 65.7% mean intersection over union (mIoU) on the Cityscapes test set with only 8.4G floating-point operations (FLOPs) on $1024\times 2048$ inputs. Furthermore, by re-training the existing PSPNet and DeepLabV3 models on our synthetic dataset, we obtained an average 2% mIoU improvement on small objects.
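The headline numbers in the abstract are reported as mean intersection over union (mIoU). As a reading aid, here is a minimal sketch of how that metric is commonly computed from predicted and ground-truth class ids. It is not the authors' evaluation code: the function name `mean_iou`, the flat-list input format, and the default ignore label of 255 are assumptions made for illustration. Benchmark scores are normally accumulated over the whole test set rather than averaged per image.

```python
def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean IoU over two flat, equal-length sequences of class ids.

    Hypothetical helper for illustration; `ignore_label` marks ground-truth
    pixels that are excluded from scoring (255 is an assumed default).
    """
    intersection = [0] * num_classes  # pixels where pred == target == c
    union = [0] * num_classes         # pixels where pred == c or target == c

    for p, t in zip(pred, target):
        if t == ignore_label:
            continue  # unlabeled ground truth does not count for any class
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            union[t] += 1
            if 0 <= p < num_classes:
                union[p] += 1

    # Average only over classes that occur in the prediction or ground truth.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example: three classes, last ground-truth pixel is ignored.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
```

Because each class contributes equally to the average regardless of its pixel count, small-object classes such as traffic lights weigh as much as large ones like road, which is why gains on small objects are visible in mIoU.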