Energy-Efficient and High-Throughput FPGA-Based Accelerator for Convolutional Neural Networks

Gan Feng, Zuyi Hu, Song Chen, Feng Wu
DOI: https://doi.org/10.1109/icsict.2016.7998996
2016-01-01
Abstract: Convolutional Neural Networks (CNNs) are widely applied in modern machine learning and pattern recognition. Beyond performance, increasing attention is being paid to energy-efficient and scalable devices such as FPGAs as a better alternative to CPUs and GPUs. In this paper, we propose methods to optimize CNNs through fixed-point quantization, activation function approximation, loop and task pipelining and parallelization, and memory reorganization, and we implement an energy-efficient, high-throughput FPGA-based CNN accelerator for LeNet-5 on the Zynq-7000 platform. The accelerator runs at 166 MHz and achieves a low error rate of 0.99%, the same as software implementations, with 37% higher throughput and 93.7% less energy dissipation than a GPU implementation.
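As an illustration of two of the techniques named in the abstract, fixed-point quantization and activation function approximation, the following is a minimal sketch and not the authors' implementation. The 16-bit word with 8 fractional bits, the clamp-based activation, and all function names (`to_fixed`, `to_float`, `fixed_mul`, `hard_tanh`) are assumptions chosen for the example; the abstract does not state the bit widths or the approximation actually used.

```python
# Hypothetical sketch: signed fixed-point arithmetic with saturation.
# Bit widths and function names are illustrative assumptions, not from the paper.

def to_fixed(x, frac_bits=8, total_bits=16):
    """Quantize a float to a signed fixed-point integer, saturating at the range limits."""
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    q = int(round(x * scale))
    return max(lo, min(hi, q))


def to_float(q, frac_bits=8):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def fixed_mul(a, b, frac_bits=8):
    """Multiply two fixed-point values.

    The raw product carries 2 * frac_bits fractional bits, so it is shifted
    right by frac_bits. Python's >> floors, so negative results round down.
    """
    return (a * b) >> frac_bits


def hard_tanh(q, frac_bits=8):
    """Assumed piecewise-linear stand-in for tanh: clamp the input to [-1, 1]."""
    one = 1 << frac_bits
    return max(-one, min(one, q))


if __name__ == "__main__":
    w = to_fixed(0.5)
    x = to_fixed(0.75)
    y = fixed_mul(w, x)
    print(to_float(y))                        # product of 0.5 and 0.75 in fixed point
    print(to_float(hard_tanh(to_fixed(2.0)))) # value above 1 is clamped
```

In hardware, this kind of representation replaces floating-point multipliers with integer multiply-and-shift units, which is one common reason fixed-point designs use less FPGA area and power.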