ESNreram: an Energy-Efficient Sparse Neural Network Based on Resistive Random-Access Memory.

Zhuoran Song,Yilong Zhao,Yanan Sun,Xiaoyao Liang,Li Jiang
DOI: https://doi.org/10.1145/3386263.3406897
2020-01-01
Abstract:The sparsity in the deep neural networks (DNNs) can be leveraged by methods such as pruning and quantization to assist the energy-efficient deployment of large-scale deep neural networks onto hardware platforms, such as GPU and ASIC, for better performance and power efficiency. However, for the metal-oxide resistive random access memory (ReRAM) architecture, the study of energy-efficient methods still shrink the model size or constrain the precision of DNN by leveraging the DNN sparsity. Due to the circuit features of ReRAM, reading bit-0 naturally consumes less energy than reading bit-1. In this paper, we exploit the fine-grained tuning method on the bit-level to reduce energy consumption of ReRAM. Specifically, we present the gradient-search and the weight-group update algorithm, which can significantly unbalance the proportion of bit-1 and bit-0 inside the weights of DNN with negligible NN accuracy loss. Experiments demonstrate that the percentage of bit-0, in some typical convolutional neural networks (CNNs), increases to 33.8%, with less than 0.5% degradation in NN accuracy. The energy reduction can be up to 65%.
What problem does this paper attempt to address?