Voltage-Controlled Magnetic Tunnel Junction based ADC-less Global Shutter Processing-in-Pixel for Extreme-Edge Intelligence

Md Abdullah-Al Kaiser, Gourav Datta, Jordan Athas, Christian Duffee, Ajey P. Jacob, Pedram Khalili Amiri, Peter A. Beerel, Akhilesh R. Jaiswal
2024-10-14
Abstract: The vast amount of data generated by camera sensors has prompted the exploration of energy-efficient processing solutions for deploying computer vision tasks on edge devices. Among the various approaches studied, processing-in-pixel integrates massively parallel analog computational capabilities at the extreme edge, i.e., within the pixel array, and exhibits enhanced energy and bandwidth efficiency by generating the output activations of the first neural network layer rather than the raw sensory data. In this article, we propose an energy- and bandwidth-efficient ADC-less processing-in-pixel architecture. This architecture implements an optimized binary activation neural network trained using a Hoyer regularizer for high accuracy on complex vision tasks. In addition, we introduce a global shutter burst memory read scheme that uses fast and disturb-free read operations, leveraging nanoscale voltage-controlled magnetic tunnel junctions (VC-MTJs). Moreover, we develop an algorithmic framework incorporating device and circuit constraints (characteristic device switching behavior and circuit non-linearity) based on state-of-the-art fabricated VC-MTJ characteristics and extensive circuit simulations using commercial GlobalFoundries 22nm FDX technology. Finally, we evaluate the proposed system's performance on two complex datasets, CIFAR10 and ImageNet, showing improvements in front-end and communication energy efficiency of 8.2x and 8.5x, respectively, and a 6x reduction in bandwidth compared to traditional computer vision systems, without any significant drop in test accuracy.
Hardware Architecture, Image and Video Processing
What problem does this paper attempt to address?
The paper targets the energy-efficiency and bandwidth bottlenecks of traditional computer vision systems on edge devices. Traditional CMOS image sensors (CIS) convert visual scenes into multi-bit digital data, which requires analog-to-digital converters (ADCs) that cost significant energy and latency. Each frame, whose size is the number of pixels in the camera sensor multiplied by the ADC bit precision, must then be transmitted off-chip for back-end processing and artificial intelligence applications. This physical separation of sensing and processing creates energy, throughput, and bandwidth bottlenecks. To address them, researchers are exploring ways to move computation closer to the sensor array. The solution proposed in this paper is **an ADC-less, global shutter processing-in-pixel architecture based on voltage-controlled magnetic tunnel junctions (VC-MTJs)**, which aims to further improve energy and bandwidth efficiency without significantly reducing the accuracy of the end application.

### Main Contributions

1. **Energy-efficient global shutter processing-in-pixel scheme**: Uses advanced nanoscale VC-MTJs, which offer high endurance, fast writes, disturb-free reads, and non-volatility, to perform massively parallel analog computation for extreme-edge intelligence.
2. **ADC-less in-pixel architecture**: Computes the first-layer output activations of a binary activation neural network using passive analog subtraction circuits, and exploits the storage characteristics of the VC-MTJs.
3. **Adjustable mapping scheme**: Reuses the analog subtraction circuit to align the algorithm-equivalent hardware threshold with the switching behavior of the VC-MTJ, so that output activations are computed with high confidence.
4. **Device-circuit-algorithm co-design framework**: Accounts for device and hardware constraints, evaluates system performance on the CIFAR10 and ImageNet datasets, and validates the design using state-of-the-art VC-MTJ device characteristics and HSpice simulations.

### Problems Solved

- **Fewer ADC operations**: With kernel-level readout, each 3x3x3 kernel requires only one conversion step, rather than the up to 27 ADC conversions required by other CV approaches.
- **Lower bandwidth requirements**: Single-bit activations replace multi-bit data, significantly improving bandwidth efficiency.
- **Higher energy efficiency**: Passive analog subtraction circuits and thresholding replace multi-bit ADCs.
- **Reduced rolling shutter effects**: Global shutter operation and the burst read scheme reduce motion blur and improve image quality.

In conclusion, this paper proposes an ADC-less, VC-MTJ-based global shutter processing-in-pixel architecture that addresses the energy-efficiency and bandwidth bottlenecks of traditional computer vision systems while maintaining high accuracy and real-time processing capability.
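
The data-volume and conversion-count arguments above can be made concrete with a rough Python sketch. It is only an illustration under stated assumptions: the function names, the 32x32 frame, the 8-bit ADC, the stride, the padding, and the number of output channels are all hypothetical choices, not values reported in the paper, so the resulting ratio should not be read as reproducing the paper's 6x figure. The `binary_activation` helper models the subtract-then-threshold step as a difference of two accumulated sums, which is likewise an assumed simplification of the analog circuit.

```python
"""Back-of-the-envelope comparison of a conventional camera front end and an
ADC-less processing-in-pixel front end. Every numeric setting used here is an
illustrative assumption, not a value taken from the paper."""


def conventional_frame_bits(height, width, channels, adc_bits):
    # Conventional readout: every pixel value is digitised at adc_bits
    # and sent off-chip, so the frame size is pixels x bit precision.
    return height * width * channels * adc_bits


def conv_output_size(size, kernel, stride, padding):
    # Standard convolution output-size formula along one dimension.
    return (size + 2 * padding - kernel) // stride + 1


def in_pixel_frame_bits(height, width, out_channels, kernel=3, stride=1, padding=0):
    # In-pixel readout: each kernel position emits a single 1-bit
    # activation per output channel instead of raw pixel data.
    out_h = conv_output_size(height, kernel, stride, padding)
    out_w = conv_output_size(width, kernel, stride, padding)
    return out_h * out_w * out_channels


def adc_conversions_per_kernel(kernel=3, in_channels=3):
    # Upper bound when every input under the kernel is converted
    # separately: 3 x 3 x 3 = 27 for an RGB 3x3 kernel.
    return kernel * kernel * in_channels


def binary_activation(positive_sum, negative_sum, threshold):
    # Assumed model of the analog step: subtract the two accumulated
    # sums, then compare the difference against a threshold.
    return 1 if (positive_sum - negative_sum) > threshold else 0


if __name__ == "__main__":
    raw = conventional_frame_bits(32, 32, channels=3, adc_bits=8)
    act = in_pixel_frame_bits(32, 32, out_channels=32, stride=2, padding=1)
    print("raw frame bits:", raw)
    print("activation bits:", act)
    print("reduction factor:", raw / act)
    print("conversions per 3x3x3 kernel:", adc_conversions_per_kernel(), "vs 1")
```

Changing `stride` or `out_channels` shows how strongly the bandwidth saving depends on the shape of the first layer, which is one reason the architecture and the network have to be designed together.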