Memory Bandwidth and Energy Efficiency Optimization of Deep Convolutional Neural Network Accelerators.

Zikai Nie, Zhisheng Li, Lei Wang, Shasha Guo, Qiang Dou
DOI: https://doi.org/10.1007/978-981-13-2423-9_2
2018-01-01
Abstract: Deep convolutional neural networks (DNNs) achieve state-of-the-art accuracy, but at the cost of massive computation and a large number of memory operations. Although highly parallel devices meet the computational requirements effectively, energy efficiency remains a hard problem. In this paper, we present two novel computation sequences, NHWCfine and NHWCcoarse, for DNN accelerators. We then combine these two computation sequences with appropriate data layouts. The proposed modes enable continuous memory access patterns and reduce the number of memory accesses by leveraging and transforming the local data reuse of weights and feature maps in high-dimensional convolutions. Experiments on convolutional layers from various networks show that the proposed modes, each made up of a computation sequence and a data layout, are more energy efficient than the baseline mode. Total energy consumption is reduced by up to 4.10x, and off-chip memory access latency is reduced by up to 5.11x.
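As a rough illustration of why the data layout matters for continuous memory access, the sketch below compares the linear offsets of one pixel's channels under the standard NCHW and NHWC tensor layouts. It does not reproduce the paper's NHWCfine or NHWCcoarse computation sequences, which the abstract does not define; the function names and the small tensor shape are assumptions chosen for illustration only.

```python
def nchw_offset(n, c, h, w, C, H, W):
    """Linear offset of element (n, c, h, w) in a row-major NCHW tensor."""
    return ((n * C + c) * H + h) * W + w


def nhwc_offset(n, c, h, w, C, H, W):
    """Linear offset of the same element in a row-major NHWC tensor."""
    return ((n * H + h) * W + w) * C + c


def channel_offsets(offset_fn, C, H, W, n=0, h=0, w=0):
    """Offsets touched when reading every channel of one pixel."""
    return [offset_fn(n, c, h, w, C, H, W) for c in range(C)]


# Hypothetical small feature map: 4 channels, 3x3 spatial size.
C, H, W = 4, 3, 3
nchw = channel_offsets(nchw_offset, C, H, W, h=1, w=2)
nhwc = channel_offsets(nhwc_offset, C, H, W, h=1, w=2)

print("NCHW offsets:", nchw)  # strided by H * W
print("NHWC offsets:", nhwc)  # consecutive addresses
```

In NHWC the channels of a pixel sit at consecutive addresses, so a computation that accumulates across channels can read them as one contiguous burst, whereas in NCHW the same reads are strided by H * W elements.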