Design of High Performance RNN Accelerator Based on Network Compression

Wentao Zhu, Yuhao Sun, Zeyu Shen, Haichuan Yang, Yu Gong, Bo Liu
DOI: https://doi.org/10.1109/iccs51219.2020.9336599
2020-01-01
Abstract: As neural networks grow in size, the computation and storage costs of recurrent neural networks increase. To address this problem, this paper proposes a recurrent neural network accelerator that reduces computational redundancy, memory overhead, and energy consumption. A novel network compression method based on pruning and hybrid quantization is also proposed to reduce computation and memory overhead. Building on these designs, an accelerator based on precision-adaptive approximate computation is designed to achieve high energy efficiency. Experimental results show that, in a TSMC 28 nm process with a 4-bit data width and a working voltage of 0.8 V, the peak performance of the proposed accelerator is not reduced, the power consumption is 38.4 mW, and the energy efficiency is 2.7 TOPS/W. The energy efficiency of the proposed accelerator is 2.5 times that of state-of-the-art recurrent neural network accelerators.
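The reported figures can be related with simple arithmetic. The following Python sketch is not taken from the paper: the function names are hypothetical, and it assumes that the 2.7 TOPS/W efficiency and the 38.4 mW power refer to the same operating point, and that "2.5 times" means a ratio of efficiencies. Under those assumptions it derives the implied throughput and the implied efficiency of the comparison baseline.

```python
def implied_throughput_gops(efficiency_tops_per_w: float, power_mw: float) -> float:
    """Throughput in GOPS implied by an efficiency (TOPS/W) and a power draw (mW)."""
    # ops/s = (ops/s per watt) * watts
    ops_per_second = efficiency_tops_per_w * 1e12 * (power_mw * 1e-3)
    return ops_per_second / 1e9


def implied_baseline_tops_per_w(efficiency_tops_per_w: float, ratio: float) -> float:
    """Baseline efficiency implied by a stated improvement ratio (assumed multiplicative)."""
    return efficiency_tops_per_w / ratio


# Figures quoted in the abstract: 2.7 TOPS/W, 38.4 mW, 2.5x improvement.
throughput = implied_throughput_gops(2.7, 38.4)
baseline = implied_baseline_tops_per_w(2.7, 2.5)

print(f"Implied throughput: {throughput:.1f} GOPS")
print(f"Implied baseline efficiency: {baseline:.2f} TOPS/W")
```

Under these assumptions the accelerator would deliver roughly 104 GOPS at the stated operating point, and the baseline it is compared against would sit near 1.08 TOPS/W. The abstract does not state either number directly, so both should be read as derived estimates.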