Pedestrian detection with high-resolution event camera

Piotr Wzorek, Tomasz Kryjak

DOI: https://doi.org/10.34658/9788366741928.7

2023-05-29

Abstract: Despite the dynamic development of computer vision algorithms, the implementation of perception and control systems for autonomous vehicles such as drones and self-driving cars still poses many challenges. A video stream captured by traditional cameras is often prone to problems such as motion blur or degraded image quality due to challenging lighting conditions. In addition, the frame rate (typically 30 or 60 frames per second) can be a limiting factor in certain scenarios. Event cameras (DVS, Dynamic Vision Sensor) are a potentially interesting technology to address the above-mentioned problems. In this paper, we compare two methods of processing event data by means of deep learning for the task of pedestrian detection. We used a representation in the form of video frames, convolutional neural networks and asynchronous sparse convolutional neural networks. The results obtained illustrate the potential of event cameras and allow the evaluation of the accuracy and efficiency of the methods used for high-resolution (1280 x 720 pixels) footage.

Computer Vision and Pattern Recognition, Image and Video Processing

What problem does this paper attempt to address?

The paper primarily explores the issue of pedestrian detection using high-resolution event cameras. Traditional cameras may encounter problems such as motion blur or image quality degradation under fast motion or poor lighting conditions. As an emerging technology, event cameras can better address these challenges. The paper compares two deep learning-based methods to process event data for pedestrian detection tasks:

1. **Method based on video frame representation**: This method accumulates event data within a defined time window to form a data structure similar to traditional video frames and inputs it into Convolutional Neural Networks (CNNs). The study employs the YOLOv7 architecture and combines different event features (such as polarity, temporal resolution, and frequency) to maximize the amount of information. This method achieves good detection accuracy (67.7% mAP@0.5, 38% mAP@.5:.95) but has high computational complexity (104.7 GFLOPs).
2. **Method based on Asynchronous Sparse Convolutional Neural Networks (ASCNNs)**: This method leverages the sparse nature of event data, updating only the convolution results corresponding to the changing input values to reduce computational complexity and energy consumption. Although this method theoretically reduces computational demands (205 MFLOPs), it did not achieve satisfactory accuracy levels in experiments.

In summary, this study aims to evaluate the accuracy and efficiency of different methods in processing high-resolution event camera data and to provide direction for the future development of more efficient and accurate pedestrian detection systems.
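The accumulation step of the frame-based method can be illustrated with a short sketch. This is not the authors' implementation: the event tuple layout `(x, y, timestamp_us, polarity)`, the function name `accumulate_events`, the 10 ms window in the demo, and the simple two-channel polarity-count representation are all assumptions made for illustration. The paper's representation combines several event features, whereas this sketch only counts events per pixel and polarity.

```python
from typing import Iterable, List, Tuple

# Hypothetical event layout for this sketch: (x, y, timestamp_us, polarity),
# with polarity +1 (brightness increase) or -1 (brightness decrease).
Event = Tuple[int, int, int, int]


def accumulate_events(
    events: Iterable[Event],
    width: int,
    height: int,
    t_start: int,
    window_us: int,
) -> List[List[List[int]]]:
    """Count events per pixel and polarity inside [t_start, t_start + window_us).

    Returns a frame indexed as frame[channel][y][x], where channel 0 holds
    positive-polarity counts and channel 1 holds negative-polarity counts.
    """
    frame = [[[0] * width for _ in range(height)] for _ in range(2)]
    t_end = t_start + window_us
    for x, y, t, polarity in events:
        if not (t_start <= t < t_end):
            continue  # event falls outside the accumulation window
        if not (0 <= x < width and 0 <= y < height):
            continue  # ignore out-of-bounds coordinates
        channel = 0 if polarity > 0 else 1
        frame[channel][y][x] += 1
    return frame


if __name__ == "__main__":
    demo_events = [
        (1, 0, 100, +1),
        (1, 0, 250, +1),
        (2, 1, 300, -1),
        (0, 0, 12_000, +1),  # outside the 10 ms window, so it is skipped
    ]
    frame = accumulate_events(
        demo_events, width=4, height=2, t_start=0, window_us=10_000
    )
    print(frame[0])  # positive-polarity counts
    print(frame[1])  # negative-polarity counts
```

A dense frame like this can be passed to a standard CNN detector, but every pixel is processed regardless of whether an event occurred there. That is the cost the asynchronous sparse approach tries to avoid by updating only the locations where the input changed.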

Traffic Sign Detection With Event Cameras and DCNN

Event-Based Pedestrian Detection Using Dynamic Vision Sensors

Rethinking Human Pose Estimation for Autonomous Driving with 3D Event Representations

Mixed Frame-/Event-Driven Fast Pedestrian Detection

Research, Applications and Prospects of Event-Based Pedestrian Detection: A Survey

Low-latency automotive vision with event cameras

Event-Based Vision Enhanced: A Joint Detection Framework in Autonomous Driving

Data-Driven Technology in Event-Based Vision

E-detector: Asynchronous Spatio-temporal for Event-based Object Detection in Intelligent Transportation System

Eventmd: High-Speed Moving Object Detection Based on Event-Based Video Frames

Low-Latency Line Tracking Using Event-Based Dynamic Vision Sensors

Event-based Moving Object Detection and Tracking

Near-Chip Dynamic Vision Filtering for Low-Bandwidth Pedestrian Detection

FAST-Dynamic-Vision: Detection and Tracking Dynamic Objects with Event and Depth Sensing

High Frame Rate Video Reconstruction and Deblurring Based on Dynamic and Active Pixel Vision Image Sensor

Towards Real-Time Fast Unmanned Aerial Vehicle Detection Using Dynamic Vision Sensors

Real-Time Multi-Task Facial Analytics With Event Cameras

Pedestrian intention prediction in Adverse Weather Conditions with Spiking Neural Networks and Dynamic Vision Sensors

EventHDR: from Event to High-Speed HDR Videos and Beyond

From Dense to Sparse: Low-Latency and Speed-Robust Event-Based Object Detection