Multi-Objective Detection of Traffic Scenes Based on Improved SSD

Hua Xia, Wang Xinqing, Wang Dong, Ma Zhaoye, Shao Faming
DOI: https://doi.org/10.3788/aos201838.1215003
2018-01-01
Acta Optica Sinica
Abstract: Existing target detection algorithms struggle to balance accuracy and real-time performance when detecting multiple targets in complex, large scenes. To address this problem, we imitate the human visual mechanism, inspired by the shape of the convolution kernels in deep neural networks. We improve the deep-learning-based single shot multi-box detection (SSD) framework and propose a multi-target detection framework, adaptive perceive SSD, designed specifically for multi-target detection in complex, large traffic scenes. A feature convolution kernel library composed of multi-form Gabor and color Gabor kernels is designed. The optimal group of feature-extraction convolution kernels is trained and screened to replace the low-level convolution kernel group of the original network, which effectively improves the detection accuracy. The single-image detection framework is combined with a convolutional long short-term memory (LSTM) network: a bottleneck LSTM layer extracts the feature maps propagated between frames, which realizes the temporal association of frame-level information in the network, reduces the computational cost, and enables the tracking and identification of targets affected by strong interference in video. An adaptive threshold strategy is added to reduce the missed-detection and false-alarm rates. The experimental results show that, compared with other deep-learning-based target detection frameworks, the average accuracy of various target recognition is increased by 9% to 16%, the average accuracy is increased by 14% to 21%, the multi-target detection rate is increased by 21% to 36%, and the detection frame rate reaches 32 frames/s. The proposed framework thus balances accuracy and real-time performance and achieves better detection and recognition results.
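As an illustration of the kind of kernel library described in the abstract, the following minimal sketch builds a small bank of grayscale Gabor kernels over several orientations and wavelengths. It is not the authors' implementation: the function names `gabor_kernel` and `gabor_bank`, the kernel size, the number of orientations, the wavelengths, and the ratio between `sigma` and the wavelength are all assumptions chosen for illustration, and the color Gabor kernels and the training and screening step are not covered.

```python
import numpy as np


def gabor_kernel(size, theta, wavelength, sigma, gamma=0.5, psi=0.0):
    """Real part of a 2-D Gabor filter on a size x size grid (size should be odd)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    # Rotate the coordinate system by the orientation angle theta.
    x_r = x * np.cos(theta) + y * np.sin(theta)
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    # Gaussian envelope (gamma is the spatial aspect ratio) times a cosine carrier.
    envelope = np.exp(-(x_r ** 2 + (gamma * y_r) ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_r / wavelength + psi)
    return envelope * carrier


def gabor_bank(size=7, n_orientations=4, wavelengths=(3.0, 5.0)):
    """Stack Gabor kernels for every (wavelength, orientation) pair.

    All default values are illustrative assumptions, not values from the paper.
    """
    kernels = []
    for wavelength in wavelengths:
        for k in range(n_orientations):
            theta = k * np.pi / n_orientations
            kernels.append(
                gabor_kernel(size, theta, wavelength, sigma=0.56 * wavelength)
            )
    # Resulting shape: (len(wavelengths) * n_orientations, size, size)
    return np.stack(kernels)


bank = gabor_bank()
print(bank.shape)
```

A bank of this shape could, for example, be used to initialize the weights of a first convolution layer with one input channel; how the kernels are selected and inserted into the SSD backbone is specific to the paper and is not reproduced here.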