Event-based YOLO Object Detection: Proof of Concept for Forward Perception System

Waseem Shariff, Muhammad Ali Farooq, Joe Lemley, Peter Corcoran
DOI: https://doi.org/10.48550/arXiv.2212.07181
2023-01-10
Abstract: Neuromorphic vision or event vision is an advanced vision technology, where in contrast to the visible camera that outputs pixels, the event vision generates neuromorphic events every time there is a brightness change which exceeds a specific threshold in the field of view (FOV). This study focuses on leveraging neuromorphic event data for roadside object detection. This is a proof of concept towards building artificial intelligence (AI) based pipelines which can be used for forward perception systems for advanced vehicular applications. The focus is on building efficient state-of-the-art object detection networks with better inference results for fast-moving forward perception using an event camera. In this article, the event-simulated A2D2 dataset is manually annotated and trained on two different YOLOv5 networks (small and large variants). To further assess its robustness, single model testing and ensemble model testing are carried out.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
The paper addresses roadside object detection with event cameras in a vehicle's forward perception system. Specifically, it asks how the data produced by an event camera can be used to train state-of-the-art object detection networks so that perception improves in fast-moving scenes. Unlike a conventional visible-light camera, an event camera responds to brightness changes in its field of view and produces data with low latency, high dynamic range, and low power consumption. These properties make event cameras well suited to perception systems on fast-moving platforms such as automobiles.

The main objective is event-based detection of four common roadside object classes (pedestrians, cars, utility poles, and other vehicles) using the YOLOv5 framework. The authors start from the public A2D2 dataset, which contains continuous high-resolution frames. They convert these frames into event frames by simulation, then manually annotate and augment the event frames. Finally, they train and test two YOLOv5 variants (small and large). The study thereby aims to explore and verify the feasibility and effectiveness of event-based vision in practical applications, particularly with respect to detection accuracy and inference time.
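To make the frame-to-event conversion concrete, here is a minimal sketch of the thresholding idea described in the abstract: a pixel emits an event when its brightness changes by more than a set threshold between two consecutive frames. This is not the simulator used in the paper, which is not identified here. The function names, the log-intensity formulation, the `threshold` value, and the grayscale rendering are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_event_frame(prev_frame, next_frame, threshold=0.2, eps=1.0):
    """Return a polarity map for two grayscale frames (hypothetical helper).

    +1 where log brightness rose by at least `threshold`,
    -1 where it fell by at least `threshold`, 0 elsewhere.
    `threshold` and `eps` are illustrative values, not taken from the paper.
    """
    # Work in log intensity; eps avoids log(0) on black pixels.
    prev_log = np.log(prev_frame.astype(np.float64) + eps)
    next_log = np.log(next_frame.astype(np.float64) + eps)
    diff = next_log - prev_log

    events = np.zeros(diff.shape, dtype=np.int8)
    events[diff >= threshold] = 1    # brightness increased: positive event
    events[diff <= -threshold] = -1  # brightness decreased: negative event
    return events


def render_event_frame(events):
    """Map polarities to an 8-bit image: 0 = negative, 128 = none, 255 = positive."""
    frame = np.full(events.shape, 128, dtype=np.uint8)
    frame[events > 0] = 255
    frame[events < 0] = 0
    return frame


if __name__ == "__main__":
    prev = np.full((2, 2), 100, dtype=np.uint8)
    nxt = np.array([[100, 200], [50, 100]], dtype=np.uint8)
    events = simulate_event_frame(prev, nxt)
    print(events)
    print(render_event_frame(events))
```

An image rendered this way can be annotated and passed to a standard detector like any other frame, which is the general shape of the pipeline summarized above. A real simulator would also model sensor noise and per-event timestamps, which this sketch omits.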