Mixed Frame-/Event-Driven Fast Pedestrian Detection

Zhuangyi Jiang, Pengfei Xia, Kai Huang, Walter Stechele, Guang Chen, Zhenshan Bing, Alois Knoll
DOI: https://doi.org/10.1109/icra.2019.8793924
2019-01-01
Abstract: Pedestrian detection has attracted enormous research attention in the field of Intelligent Transportation Systems (ITS) because pedestrians are the most vulnerable traffic participants. So far, almost all pedestrian detection solutions are based on conventional frame-based cameras. However, these solutions perform poorly in scenarios with poor lighting conditions and high-speed motion. In this work, a Dynamic and Active Pixel Sensor (DAVIS), whose two channels concurrently output conventional gray-scale frames and asynchronous, low-latency temporal contrast events of light intensity, was used for the first time to detect pedestrians in a traffic monitoring scenario. Data from the two camera channels were fed into Convolutional Neural Networks (CNNs), including three YOLOv3 models and three YOLO-tiny models, to obtain bounding boxes of pedestrians with their respective confidence maps. Furthermore, a confidence map fusion method combining the CNN-based detection results from both DAVIS channels was proposed to obtain higher accuracy. The experiments were conducted on a custom dataset collected on the TUM campus. Benefiting from the high speed, low latency, and wide dynamic range of the event channel, our method achieved a higher frame rate and lower latency than methods using only a conventional camera. Additionally, it reached higher average precision by using the fusion approach.
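The abstract does not spell out the fusion rule, so the following is only a minimal sketch of what fusing confidence maps from the two channels could look like, not the authors' method. It assumes each detector returns axis-aligned boxes with a confidence, rasterises them into per-pixel maps, and combines the maps with a weighted average. The names `confidence_map`, `fuse_maps`, `box_score`, and the parameter `frame_weight` are hypothetical, as are the weighted-average rule and the example boxes.

```python
from typing import List, Tuple

# (x0, y0, x1, y1, confidence); x1 and y1 are exclusive. Hypothetical box format.
Box = Tuple[int, int, int, int, float]


def confidence_map(boxes: List[Box], width: int, height: int) -> List[List[float]]:
    """Rasterise boxes into a map holding the highest confidence covering each pixel."""
    cmap = [[0.0] * width for _ in range(height)]
    for x0, y0, x1, y1, conf in boxes:
        for y in range(max(0, y0), min(height, y1)):
            row = cmap[y]
            for x in range(max(0, x0), min(width, x1)):
                if conf > row[x]:
                    row[x] = conf
    return cmap


def fuse_maps(frame_map, event_map, frame_weight: float = 0.5):
    """Weighted average of two equally sized maps (an assumed fusion rule)."""
    w = frame_weight
    return [
        [w * f + (1.0 - w) * e for f, e in zip(frame_row, event_row)]
        for frame_row, event_row in zip(frame_map, event_map)
    ]


def box_score(fused, box: Box) -> float:
    """Mean fused confidence inside a box; 0.0 for an empty box."""
    x0, y0, x1, y1 = box[:4]
    values = [fused[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(values) / len(values) if values else 0.0


# Made-up detections on a 10x10 image, one per channel.
frame_boxes = [(2, 2, 6, 8, 0.8)]  # from the gray-scale frame channel
event_boxes = [(3, 2, 7, 8, 0.6)]  # from the event channel

fused = fuse_maps(
    confidence_map(frame_boxes, 10, 10),
    confidence_map(event_boxes, 10, 10),
)
score = box_score(fused, frame_boxes[0])
```

In this sketch, a box that both channels agree on keeps a high fused score, while a box seen by only one channel is down-weighted. A real system would choose the weighting and the final thresholding empirically.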