Deep Event-based Object Detection in Autonomous Driving: A Survey

Bingquan Zhou, Jie Jiang
2024-05-07
Abstract: Object detection plays a critical role in autonomous driving, where accurately and efficiently detecting objects in fast-moving scenes is crucial. Traditional frame-based cameras face challenges in balancing latency and bandwidth, necessitating the need for innovative solutions. Event cameras have emerged as promising sensors for autonomous driving due to their low latency, high dynamic range, and low power consumption. However, effectively utilizing the asynchronous and sparse event data presents challenges, particularly in maintaining low latency and lightweight architectures for object detection. This paper provides an overview of object detection using event data in autonomous driving, showcasing the competitive benefits of event cameras.
Computer Vision and Pattern Recognition
What problem does this paper attempt to address?
This paper surveys deep learning methods for event-based object detection in autonomous driving. Traditional frame-based cameras face a trade-off between latency and bandwidth when detecting objects in fast-moving scenes, while event cameras have become promising sensors thanks to their low latency, high dynamic range, and low power consumption. The difficulty is that event data is asynchronous and sparse, so it is hard to exploit it effectively while keeping the detector low-latency and lightweight. The paper outlines methods for object detection using event data, including conventional deep neural networks (DNNs), biologically inspired spiking neural networks (SNNs), graph neural networks (GNNs), and multimodal learning. It also discusses the datasets that have been proposed for object detection with event cameras.
The paper introduces the working principle and data structure of event cameras and describes event data preprocessing techniques such as event frames, voxel grids, and learnable representations. In the methodology section, it covers methods that adapt to the asynchronous nature of event streams, learn spatiotemporal features with LSTM networks, use attention mechanisms, borrow techniques from point cloud processing, and process event data with GNNs and SNNs. These methods aim to capture the spatial and temporal characteristics of event data in order to improve the efficiency and accuracy of object detection.
SNNs are particularly attractive for embedded applications because of their low power consumption and their suitability for processing sparse, binary event data. According to the paper, hybrid SNN-ANN approaches and improved SNN training techniques can make object detection more efficient.
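To make the event-frame and voxel-grid representations mentioned above more concrete, here is a minimal sketch in plain Python. It assumes each event is a tuple (x, y, t, p) with polarity p in {+1, -1}, and it uses one common voxel-grid formulation in which each event's polarity is split linearly between its two nearest temporal bins. The function names, the tuple layout, and the choice of linear weighting are illustrative assumptions, not necessarily the variants discussed in the paper.

```python
from typing import List, Tuple

# Assumed event layout: (x, y, timestamp, polarity), polarity in {+1, -1}.
Event = Tuple[int, int, float, int]


def event_frame(events: List[Event], width: int, height: int) -> List[List[int]]:
    """Accumulate signed polarities per pixel into one 2D frame."""
    frame = [[0] * width for _ in range(height)]
    for x, y, _t, p in events:
        frame[y][x] += p
    return frame


def voxel_grid(
    events: List[Event], width: int, height: int, num_bins: int
) -> List[List[List[float]]]:
    """Build a [num_bins][height][width] grid that keeps coarse timing.

    Each event's polarity is shared between the two temporal bins
    closest to its normalised timestamp (linear weighting in time).
    """
    grid = [[[0.0] * width for _ in range(height)] for _ in range(num_bins)]
    if not events:
        return grid
    t_start = min(e[2] for e in events)
    t_end = max(e[2] for e in events)
    span = (t_end - t_start) or 1.0  # avoid division by zero
    for x, y, t, p in events:
        # Map the timestamp to the range [0, num_bins - 1].
        t_norm = (t - t_start) / span * (num_bins - 1)
        lower = int(t_norm)
        upper_weight = t_norm - lower
        grid[lower][y][x] += p * (1.0 - upper_weight)
        if lower + 1 < num_bins:
            grid[lower + 1][y][x] += p * upper_weight
    return grid
```

With this weighting, summing the voxel grid over its temporal bins gives back the event frame, which illustrates the difference between the two: the event frame discards timing within the window, while the voxel grid retains it at the resolution of num_bins.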
The paper also covers multimodal fusion, which combines RGB images with event data to compensate for the limitations of event cameras in capturing static scenes and texture information and to improve object detection performance under different conditions. Lastly, the paper lists several event-based object detection datasets. These typically contain continuous temporal sequences, so that methods can exploit the rich spatiotemporal semantic information that event cameras provide, and they support research on object detection in autonomous driving scenarios.