Deep Integration: A Multi-Label Architecture for Road Scene Recognition

Long Chen, Wujing Zhan, Wei Tian, Yuhang He, Qin Zou
DOI: https://doi.org/10.1109/tip.2019.2913079
IF: 10.6
2019-01-01
IEEE Transactions on Image Processing
Abstract: Deep convolutional neural networks have been applied by automobile industries, Internet giants, and academic institutes to boost autonomous driving technologies. While progress has been made in environmental perception tasks such as object detection and driver state recognition, scene-centric understanding and identification remain largely unexplored. Two key issues account for this: 1) the lack of shared large datasets with comprehensively annotated road scene information and 2) the difficulty of finding effective ways to train networks given the bias in category samples, image resolutions, scene dynamics, and capturing conditions. In this paper, we make two contributions: 1) we introduce a large-scale dataset of over 110k images, dubbed DrivingScene, covering traffic scenarios under different weather conditions, road structures, environmental instances, and driving places, which is the first large-scale dataset for multi-class traffic scene classification, and 2) we propose a multi-label neural network for road scene recognition, which incorporates both single- and multi-class classification modes into a multi-level cost function for training with imbalanced categories and uses a deep data integration strategy to improve classification ability on hard samples. Experimental results on DrivingScene and PASCAL VOC demonstrate the effectiveness of the proposed approach in handling the challenge of data imbalance.