Pedestrian Detection Using Multi-Channel Visual Feature Fusion by Learning Deep Quality Model.

Peijia Yu, Yong Zhao, Jing Zhang, Xiaoyao Xie
DOI: https://doi.org/10.1016/j.jvcir.2019.102579
IF: 2.887
2019-01-01
Journal of Visual Communication and Image Representation
Abstract: Object detection has been widely applied in modern intelligent systems, especially using convolutional neural networks (CNNs). Pedestrian detection is a key technique in video surveillance, where it can automatically locate specific pedestrians. However, conventional CNN-based methods such as Fast/Faster R-CNN cannot handle pedestrian detection effectively because positives and hard negatives are extremely similar. In this paper, to solve the hard negative problem in pedestrian detection, we combine classifier enhancement with the representational ability of CNNs. More specifically, we first fuse multi-channel visual features (color, texture, semantic) for quality assessment. Then, we propose a “Reduction-adjustment” (RA) block, which enhances feature extraction and can be flexibly embedded into CNNs. In our implementation, we embed RA blocks into a base model such as VGG-16. Afterwards, we apply Faster R-CNN as the detection system to classify and locate pedestrians. Extensive experiments on the Caltech, ETH and CityPersons datasets demonstrate that our deep model is feasible and effective for pedestrian detection.