Low-latency automotive vision with event cameras

Daniel Gehrig, Davide Scaramuzza
DOI: https://doi.org/10.1038/s41586-024-07409-w
IF: 64.8
2024-05-30
Nature
Abstract: The computer vision algorithms used currently in advanced driver assistance systems rely on image-based RGB cameras, leading to a critical bandwidth–latency trade-off for delivering safe driving experiences. To address this, event cameras have emerged as alternative vision sensors. Event cameras measure the changes in intensity asynchronously, offering high temporal resolution and sparsity, markedly reducing bandwidth and latency requirements [1]. Despite these advantages, event-camera-based algorithms are either highly efficient but lag behind image-based ones in terms of accuracy or sacrifice the sparsity and efficiency of events to achieve comparable results. To overcome this, here we propose a hybrid event- and frame-based object detector that preserves the advantages of each modality and thus does not suffer from this trade-off. Our method exploits the high temporal resolution and sparsity of events and the rich but low temporal resolution information in standard images to generate efficient, high-rate object detections, reducing perceptual and computational latency. We show that the use of a 20 frames per second (fps) RGB camera plus an event camera can achieve the same latency as a 5,000-fps camera with the bandwidth of a 45-fps camera without compromising accuracy. Our approach paves the way for efficient and robust perception in edge-case scenarios by uncovering the potential of event cameras [2].
multidisciplinary sciences
What problem does this paper attempt to address?
This paper addresses the bandwidth–latency trade-off in the vision sensors used for autonomous driving. Current advanced driver assistance systems rely on frame-based RGB cameras, for which a higher frame rate lowers latency but raises bandwidth requirements. To resolve this, the authors introduce a hybrid event- and frame-based object detector that combines a standard convolutional neural network (CNN) for image processing with an efficient asynchronous graph neural network (GNN) for event processing. The detector exploits the high temporal resolution and sparsity of events together with the rich but low-temporal-resolution information in images, reducing both perceptual and computational latency while remaining efficient. Experimental results show that a 20-fps RGB camera combined with an event camera can achieve the latency of a 5,000-fps camera at the bandwidth of a 45-fps camera, without sacrificing accuracy. The paper also examines how different time intervals affect object detection performance, showing that event data allow predictions to be updated between frames and thereby enhance safety, especially when detecting fast-moving or deforming objects. The proposed system strikes a better balance between bandwidth and performance than existing methods that use only images or only events. Future research directions may include fusion with other sensors such as LiDAR. Overall, the paper aims to improve the safety and efficiency of autonomous driving systems by fusing event and frame data to reduce latency and lower bandwidth requirements.
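The frame-rate side of this trade-off can be illustrated with some back-of-the-envelope arithmetic. The sketch below is not taken from the paper: it assumes a simplified model in which a frame camera is blind for at most one inter-frame interval and its raw bandwidth grows linearly with frame rate. The resolution and bytes-per-pixel values are hypothetical placeholders, and the function names are chosen only for illustration.

```python
# Simplified model (an assumption, not the paper's measurement setup):
# a frame camera sees nothing between two frames, and its raw
# (uncompressed) bandwidth is proportional to its frame rate.

def frame_interval_ms(fps: float) -> float:
    """Worst-case gap between two consecutive frames, in milliseconds."""
    return 1000.0 / fps


def raw_bandwidth_mb_per_s(fps: float, width: int, height: int,
                           bytes_per_pixel: int = 3) -> float:
    """Uncompressed data rate of a frame camera, in megabytes per second."""
    return fps * width * height * bytes_per_pixel / 1e6


# Hypothetical sensor resolution, used only to make the numbers concrete.
WIDTH, HEIGHT = 1920, 1080

for fps in (20, 45, 5000):
    gap = frame_interval_ms(fps)
    rate = raw_bandwidth_mb_per_s(fps, WIDTH, HEIGHT)
    print(f"{fps:>5} fps: up to {gap:.2f} ms between frames, "
          f"about {rate:,.0f} MB/s raw")
```

Under these assumptions, the gap between frames shrinks from 50 ms at 20 fps to 0.2 ms at 5,000 fps, while the raw data rate grows by the same factor of 250. This is the cost the hybrid detector is reported to avoid by reaching 5,000-fps-level latency at roughly the bandwidth of a 45-fps camera.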