Fine-grained Traffic Video Vehicle Recognition Based Orientation Estimation and Temporal Information

Anqi Hu, Zhengxing Sun, Qian Li, Yechao Xu, Yihuan Zhu, Sheng Zhang
DOI: https://doi.org/10.1007/s11042-022-13811-1
IF: 2.577
2022-01-01
Multimedia Tools and Applications
Abstract: In this paper, we propose a method for fine-grained vehicle recognition in traffic surveillance video. In contrast to general approaches to single-image fine-grained recognition, this method focuses on combining information from multiple frames and on the viewpoint changes across videos. First, we detect vehicle instances and their local frames in the input traffic video by vehicle tracking. For each vehicle instance, pose estimation is used to extract the 3D orientation in the corresponding frame. We encode the 3D orientation as an extra supervisory cue and merge it with the CNN feature to represent the vehicle's appearance and how it changes as the vehicle moves. In addition, a recurrent neural network (RNN) is used to select the rich information available across the traffic video and to fuse the CNN features of each vehicle's frames into a comprehensive feature that includes both spatial and temporal information for fine-grained recognition. We conduct our experiments on our own CarVideo dataset, which was collected by surveillance cameras, and on the open dataset BoxCar116k for performance evaluation. The experiments show that our method outperforms state-of-the-art methods for fine-grained recognition in traffic video applications.
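The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the orientation encoding or the RNN variant, so the sin/cos encoding of three angles, the plain tanh recurrence, the random untrained weights, and all names (`encode_orientation`, `SimpleRNN`, `fuse_track`) are assumptions made for illustration only. The per-frame CNN features are stand-in random vectors.

```python
import math
import random


def encode_orientation(yaw, pitch, roll):
    """Assumed encoding: each angle (radians) becomes a (sin, cos) pair,
    so angles near the wrap-around point stay close to each other."""
    encoded = []
    for angle in (yaw, pitch, roll):
        encoded.extend((math.sin(angle), math.cos(angle)))
    return encoded


def make_matrix(rows, cols, rng, scale=0.1):
    """Small random weight matrix; a real model would learn these weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class SimpleRNN:
    """Plain tanh recurrence, used here as a stand-in for the paper's RNN."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.w_in = make_matrix(hidden_dim, input_dim, rng)
        self.w_h = make_matrix(hidden_dim, hidden_dim, rng)
        self.hidden_dim = hidden_dim

    def fuse(self, frames):
        """Run over the frame sequence and return the final hidden state."""
        h = [0.0] * self.hidden_dim
        for x in frames:
            from_input = matvec(self.w_in, x)
            from_state = matvec(self.w_h, h)
            h = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
        return h


def fuse_track(cnn_features, orientations, rnn):
    """Concatenate each frame's CNN feature with its encoded orientation,
    then fuse the whole track into one feature vector."""
    frames = [
        feature + encode_orientation(*orientation)
        for feature, orientation in zip(cnn_features, orientations)
    ]
    return rnn.fuse(frames)


# Hypothetical track: 5 frames, 8-dimensional CNN features, slowly changing yaw.
rng = random.Random(1)
cnn_features = [[rng.random() for _ in range(8)] for _ in range(5)]
orientations = [(0.1 * t, 0.0, 0.0) for t in range(5)]

rnn = SimpleRNN(input_dim=8 + 6, hidden_dim=4)
track_feature = fuse_track(cnn_features, orientations, rnn)
print(len(track_feature))
```

In a full pipeline, the resulting track-level feature would be passed to a classifier over fine-grained vehicle categories, and the weights would be trained end to end rather than drawn at random.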