SNrram: an Efficient Sparse Neural Network Computation Architecture Based on Resistive Random-Access Memory.

Peiqi Wang, Yu Ji, Chi Hong, Yongqiang Lyu, Dongsheng Wang, Yuan Xie
DOI: https://doi.org/10.1109/dac.2018.8465793
2018-01-01
Abstract: Sparsity in deep neural networks can be leveraged by methods such as pruning and compression to deploy large-scale networks efficiently on hardware platforms such as GPUs and FPGAs, improving performance and power efficiency. For RRAM crossbar-based architectures, however, the study of efficient methods that account for network sparsity is still at an early stage. In this study, we propose SNrram, an efficient sparse neural network computation architecture using RRAM that exploits sparsity in both weights and activations. SNrram stores only nontrivial weights and organizes them to eliminate zero-value multiplications, giving better resource utilization. Experimental results show that, compared to a state-of-the-art RRAM-based neural network accelerator, SNrram saves 69.8% of RRAM resources, reduces power consumption by 35.9%, and achieves a 2.49× speedup on popular deep learning benchmarks.
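The idea of storing only nontrivial weights can be illustrated with a small software sketch. This is not the SNrram mapping scheme, which the abstract does not detail. It is a minimal Python illustration under assumptions: the weight matrix is laid out as inputs by outputs, as on a crossbar where inputs drive rows, and the sparsity is structured so that whole rows are zero. The function names `compact_rows`, `dense_matvec`, and `compact_matvec` are hypothetical.

```python
# Illustrative sketch only: the function names and the row-wise sparsity
# pattern are assumptions, not the mapping used by SNrram.

def compact_rows(weights):
    """Return the indices of rows with at least one nonzero weight,
    together with those rows. All-zero rows are dropped."""
    kept = [i for i, row in enumerate(weights) if any(w != 0 for w in row)]
    return kept, [weights[i] for i in kept]


def dense_matvec(x, weights, n_out):
    """Reference result: every weight is multiplied, zeros included."""
    return [sum(x[i] * weights[i][j] for i in range(len(x)))
            for j in range(n_out)]


def compact_matvec(x, kept, compact, n_out):
    """Same result using only the stored rows; each stored row is paired
    with the input element it originally belonged to."""
    return [sum(x[i] * row[j] for i, row in zip(kept, compact))
            for j in range(n_out)]


# 4 inputs x 3 outputs; rows 0 and 2 are entirely zero.
weights = [
    [0, 0, 0],
    [1, 0, 2],
    [0, 0, 0],
    [0, 3, 0],
]
x = [5, 1, 7, 2]

kept, compact = compact_rows(weights)
dense = dense_matvec(x, weights, 3)
sparse = compact_matvec(x, kept, compact, 3)
rows_saved = 1 - len(kept) / len(weights)

print(dense, sparse, rows_saved)
```

In this toy case the compacted layout needs half as many rows and yields the same output vector, which is the kind of resource saving the abstract refers to, though the reported 69.8% figure comes from the authors' experiments and not from this sketch.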