Speck: A Smart event-based Vision Sensor with a low latency 327K Neuron Convolutional Neuronal Network Processing Pipeline

Ole Richter, Yannan Xing, Michele De Marchi, Carsten Nielsen, Merkourios Katsimpris, Roberto Cattaneo, Yudi Ren, Yalun Hu, Qian Liu, Sadique Sheik, Tugba Demirci, Ning Qiao
DOI: https://doi.org/10.1038/s41467-024-47811-6
2024-05-27
Abstract: Edge computing solutions that enable the extraction of high-level information from a variety of sensors are in increasingly high demand. This is due to the increasing number of smart devices that require sensory processing for their application on the edge. To tackle this problem, we present a smart vision sensor System on Chip (SoC), featuring an event-based camera and a low-power asynchronous spiking Convolutional Neural Network (sCNN) computing architecture embedded on a single chip. By combining both sensor and processing on a single die, we can lower unit production costs significantly. Moreover, the simple end-to-end nature of the SoC facilitates small stand-alone applications as well as functioning as an edge node in larger systems. The event-driven nature of the vision sensor delivers high-speed signals in a sparse data stream. This is reflected in the processing pipeline, which focuses on optimising highly sparse computation and minimising latency for 9 sCNN layers to 3.36 μs for an incoming event. Overall, this results in an extremely low-latency visual processing pipeline deployed on a small form factor with a low energy budget and sensor cost. We present the asynchronous architecture, the individual blocks, and the sCNN processing principle, and benchmark the design against other sCNN-capable processors.
Neural and Evolutionary Computing, Machine Learning, Image and Video Processing
What problem does this paper attempt to address?
The paper addresses the need for low-latency, low-energy real-time sensory processing at the edge. Specifically, the authors propose a smart vision sensor System on Chip (SoC) that combines an event-based camera with a low-power asynchronous spiking convolutional neural network (sCNN) architecture. Integrating sensing and processing on a single die significantly reduces unit production costs and simplifies the design of small standalone applications. The event-driven vision sensor delivers high-speed signals as a sparse data stream, and the processing pipeline is built to exploit that sparsity and minimize latency. The core contributions of the paper are:
1. **Integrated Design**: The event-based camera and the sCNN processor are integrated on the same chip, giving an end-to-end solution from sensing to processing.
2. **Low Latency and Low Energy Consumption**: The asynchronous design achieves low-latency processing (each event takes only 3.36 microseconds to pass through 9 sCNN layers) while maintaining low energy consumption.
3. **High Throughput**: The system can process multiple events in parallel, achieving high throughput (approximately 30 million events/second).
4. **Flexibility and Scalability**: The SoC can be used in small standalone applications or as an edge node in larger systems.
In this way, the paper addresses the latency limitations of traditional frame-based image sensors in real-time processing and provides an efficient, low-power alternative.
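To put the latency and throughput figures quoted above in relation to each other, the following is a minimal back-of-the-envelope sketch in Python. It takes only the three quoted numbers (3.36 μs, 9 layers, roughly 30 million events/second) as inputs. The derived values are illustrative and are not reported in the paper: the per-layer figure assumes latency is spread evenly across layers, and the in-flight estimate assumes steady-state operation at the quoted throughput (Little's law). All function and constant names are hypothetical.

```python
# Figures quoted in the summary above.
PIPELINE_LATENCY_US = 3.36          # one event through all sCNN layers, in microseconds
NUM_LAYERS = 9                      # number of sCNN layers
THROUGHPUT_EVENTS_PER_S = 30e6      # approximate peak throughput


def per_layer_latency_us(total_us: float, layers: int) -> float:
    """Average latency per layer, ASSUMING latency is split evenly across layers."""
    return total_us / layers


def mean_event_interval_ns(throughput_events_per_s: float) -> float:
    """Average time between successive events at the given throughput, in nanoseconds."""
    return 1e9 / throughput_events_per_s


def events_in_flight(latency_us: float, throughput_events_per_s: float) -> float:
    """Little's law: mean occupancy = arrival rate * residence time.

    ASSUMES the pipeline runs in steady state at the quoted throughput.
    """
    return throughput_events_per_s * latency_us * 1e-6


if __name__ == "__main__":
    print(f"avg latency per layer : {per_layer_latency_us(PIPELINE_LATENCY_US, NUM_LAYERS):.3f} us")
    print(f"mean event interval   : {mean_event_interval_ns(THROUGHPUT_EVENTS_PER_S):.1f} ns")
    print(f"events in flight      : {events_in_flight(PIPELINE_LATENCY_US, THROUGHPUT_EVENTS_PER_S):.1f}")
```

Under these assumptions, the event interval at peak throughput is far shorter than the end-to-end latency, which is consistent with the summary's statement that multiple events are processed in parallel.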