Exploiting Neural-Network Statistics for Low-Power DNN Inference

Lennart Bamberg, Ardalan Najafi, Alberto Garcia-Ortiz
DOI: https://doi.org/10.48550/arXiv.2311.05557
2023-11-10
Abstract: Specialized compute blocks have been developed for efficient DNN execution. However, due to the vast amount of data and parameter movement, the interconnects and on-chip memories form another bottleneck, impairing power and performance. This work addresses this bottleneck by contributing a low-power technique for edge-AI inference engines that combines overhead-free coding with a statistical analysis of the data and parameters of neural networks. Our approach reduces the interconnect and memory power consumption by up to 80% for state-of-the-art benchmarks while providing additional power savings of up to 39% for the compute blocks. These power improvements are achieved with no loss of accuracy and negligible hardware cost.
Machine Learning, Hardware Architecture