Exploiting Context Based on CNN and Coding Representations for Pedestrian Co-Detection

Linfeng Jiang, Jinsheng Ji, Weilin Zhong, Tao Zhang, Huilin Xiong
DOI: https://doi.org/10.1007/s11042-018-6806-7
IF: 2.577
2018-01-01
Multimedia Tools and Applications
Abstract: Object co-detection methods have shown that exploiting contextual information across multiple images improves detection performance. In this paper, we propose a pedestrian co-detection method that combines the strengths of convolutional neural networks (CNNs) and locality-constrained linear coding (LLC) in a unified conditional random field (CRF) model. First, we obtain object candidates using the region proposal network (RPN) of Faster R-CNN. Second, we build a fully connected CRF that consists of unary potentials on individual object candidates and two types of pairwise potentials on pairs of object candidates. The unary potential is computed independently for each object candidate using the baseline method. The pairwise potentials are based on multiscale CNN and LLC representations, and they capture relationships among object candidates across all the test images. Finally, we jointly predict the category labels of all the object candidates through mean field inference in the CRF. We evaluated the proposed method on the ETH, Caltech, and INRIA pedestrian datasets. The experimental results demonstrate the effectiveness of the proposed method compared with the baseline method.
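The joint labeling step described in the abstract can be illustrated with a minimal sketch of mean field inference in a fully connected CRF. This is not the authors' implementation: the function names, the Potts-style pairwise penalty, the single `similarity` matrix (standing in for the paper's multiscale CNN and LLC based potentials), and the `weight` parameter are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_field(unary, similarity, weight=1.0, iterations=10):
    """Approximate per-candidate label marginals in a fully connected CRF.

    unary[i][l]      cost of giving candidate i the label l (lower is better)
    similarity[i][j] non-negative, symmetric affinity between candidates
                     (hypothetical placeholder for feature-based potentials)
    weight           strength of the pairwise term (assumed, not from the paper)

    Assumes a Potts model: a pair (i, j) pays weight * similarity[i][j]
    whenever the two candidates take different labels.
    """
    n = len(unary)
    num_labels = len(unary[0])
    # Initialise each marginal from the unary costs alone.
    q = [softmax([-cost for cost in row]) for row in unary]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            scores = []
            for label in range(num_labels):
                # Expected penalty: probability mass of the other candidates
                # that disagrees with assigning `label` to candidate i.
                penalty = sum(
                    weight * similarity[i][j] * (1.0 - q[j][label])
                    for j in range(n)
                    if j != i
                )
                scores.append(-unary[i][label] - penalty)
            updated.append(softmax(scores))
        q = updated  # parallel update of all marginals
    return q


def costs_from_probability(p_pedestrian):
    """Turn a detector confidence into [background, pedestrian] costs."""
    return [-math.log(1.0 - p_pedestrian), -math.log(p_pedestrian)]
```

Under these assumptions, an ambiguous candidate that closely resembles two confident pedestrian detections from other images is pulled toward the pedestrian label, which is the intuition behind co-detection. Setting `weight=0` removes the pairwise term and leaves each candidate with its independent unary score.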