A method of cross-layer fusion multi-object detection and recognition based on improved faster R-CNN model in complex traffic environment

Cui-jin Li, Zhong Qu, Sheng-ye Wang, Ling Liu
DOI: https://doi.org/10.1016/j.patrec.2021.02.003
IF: 4.757
2021-05-01
Pattern Recognition Letters
Abstract: Improving detection accuracy and speed is a prerequisite for multi-object recognition in complex traffic environments. Although object detection based on deep neural networks has made significant advances, detecting small and occluded objects remains a challenge. We address this challenge with multiscale fusion. We introduce a cross-layer fusion multi-object detection and recognition algorithm based on Faster R-CNN, in which the five-layer structure of VGG16 (Visual Geometry Group) is used to obtain more feature information. We implement this idea by laterally embedding 1×1 convolution kernels, max pooling, and deconvolution, in conjunction with a weighted balanced multi-class cross-entropy loss function and Soft-NMS to control the imbalance between difficult and easy samples. To reflect actual conditions in a complex traffic environment, we manually label a mixed dataset. On the Cityscapes and KITTI datasets, experimental results show that the proposed model outperforms current mainstream object detection models.
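For readers unfamiliar with the Soft-NMS step named in the abstract, the following is a minimal sketch of the general technique in its commonly used Gaussian form. It is not the authors' implementation: the abstract does not say which Soft-NMS variant or parameters the paper uses, so the function names (`iou`, `soft_nms`), the `(x1, y1, x2, y2)` box format, and the defaults `sigma=0.5` and `score_threshold=0.001` are all assumptions made for illustration.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS (illustrative; parameters are assumed, not from the paper).

    Returns a list of (box_index, rescored_confidence) in selection order.
    """
    scores = list(scores)  # work on a copy so the caller's list is untouched
    remaining = list(range(len(boxes)))
    kept = []
    while remaining:
        # Select the highest-scoring box that has not been processed yet.
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        if scores[best] < score_threshold:
            break  # every other remaining score is lower still
        kept.append((best, scores[best]))
        # Decay, rather than discard, the scores of overlapping boxes.
        for i in remaining:
            overlap = iou(boxes[best], boxes[i])
            scores[i] *= math.exp(-(overlap ** 2) / sigma)
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    for index, score in soft_nms(boxes, scores):
        print(index, round(score, 4))
```

Unlike hard NMS, which would delete the second box outright because it overlaps the first, this sketch keeps it with a reduced score. That behavior is the usual motivation for Soft-NMS in scenes with heavily occluded objects.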
computer science, artificial intelligence