Energy-Efficient Architecture for FPGA-based Deep Convolutional Neural Networks with Binary Weights

Yunzhi Duan, Shuai Li, Ruipeng Zhang, Qi Wang, Jienan Chen, Gerald E. Sobelman
DOI: https://doi.org/10.1109/icdsp.2018.8631596
2018-01-01
Abstract: This paper presents an energy-efficient, deep parallel Convolutional Neural Network (CNN) accelerator. By adopting a recently proposed binary weight method, the CNN computations are converted into multiplication-free processing. To allow data to be accessed and stored in parallel, we use two RAM banks, each composed of N RAM blocks corresponding to N-parallel processing. We also design a reconfigurable CNN computing unit that works in a divide-and-reuse manner to support variable-size convolutional filters. Compared with full-precision computing on the MNIST and CIFAR-10 classification tasks, the inference Top-1 accuracy of the binary weight CNN drops by 1.21% and 1.34%, respectively. The hardware implementation results show that the proposed design can achieve 2100 GOPs with a 4.6 millisecond processing latency. The deep parallel accelerator exhibits 3X energy efficiency compared to a GPU-based design.
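The abstract's claim that binary weights make the computation multiplication-free can be illustrated with a small software sketch. This is not the authors' hardware design: the function names (`binarize`, `binary_dot`, `binary_conv2d`) are hypothetical, weights are binarized by sign with zero mapped to +1 as an assumption, and any per-filter scaling factor that a binary weight method may use is omitted. The point is only that once every weight is +1 or -1, each multiply-accumulate reduces to an add or a subtract.

```python
def binarize(weights):
    """Map real-valued weights to +1/-1 by sign (zero -> +1 is an assumption)."""
    return [1 if w >= 0 else -1 for w in weights]


def binary_dot(activations, signs):
    """Accumulate activations using only additions and subtractions."""
    acc = 0
    for a, s in zip(activations, signs):
        if s > 0:
            acc += a  # weight +1: add the activation
        else:
            acc -= a  # weight -1: subtract the activation
    return acc


def binary_conv2d(image, kernel_signs):
    """Valid 2-D convolution (cross-correlation form) with a +1/-1 kernel."""
    kh, kw = len(kernel_signs), len(kernel_signs[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    flat_kernel = [s for row in kernel_signs for s in row]
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Gather the kh x kw window in the same row-major order as the kernel.
            window = [image[i + di][j + dj] for di in range(kh) for dj in range(kw)]
            row.append(binary_dot(window, flat_kernel))
        output.append(row)
    return output


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    kernel = [[1, 1], [1, -1]]
    print(binary_conv2d(image, kernel))
```

In hardware, the sign test inside `binary_dot` would correspond to selecting between an adder and a subtractor, which is what removes the need for multipliers.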