DLAU: A Scalable Deep Learning Accelerator Unit on FPGA.

Chao Wang, Qi Yu, Lei Gong, Xi Li, Yuan Xie, Xuehai Zhou
DOI: https://doi.org/10.1109/TCAD.2016.2587683
IF: 2.9
2017-01-01
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Abstract: As an emerging field of machine learning, deep learning shows excellent ability in solving complex learning problems. However, networks are growing increasingly large to meet the demands of practical applications, which makes it significantly challenging to build high-performance implementations of deep learning neural networks. To improve performance while keeping power consumption low, this paper presents the deep learning accelerator unit (DLAU), a scalable accelerator architecture for large-scale deep learning networks that uses a field-programmable gate array (FPGA) as the hardware prototype. The DLAU accelerator employs three pipelined processing units to improve throughput and uses tiling techniques to exploit locality in deep learning applications. Experimental results on a state-of-the-art Xilinx FPGA board show that the DLAU accelerator achieves up to a 36.1× speedup compared with Intel Core2 processors, with a power consumption of 234 mW.
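The tiling idea mentioned in the abstract can be illustrated with a minimal software sketch. This is not the DLAU hardware design: the function name `tiled_layer`, the `tile_size` parameter, the sigmoid activation, and the split into three stages (tile multiply, partial-sum accumulation, activation) are assumptions made for illustration, since the abstract does not describe the three processing units. The sketch only shows how a fully connected layer can be computed one input tile at a time, so that only a small slice of inputs and weights is needed at each step.

```python
import math


def tiled_layer(weights, inputs, tile_size=4):
    """Compute sigmoid(W @ x) by walking over the input in fixed-size tiles.

    weights: list of rows, one row of input weights per output neuron.
    inputs:  list of input values.
    """
    n_out = len(weights)
    n_in = len(inputs)
    partial_sums = [0.0] * n_out

    # Process the input vector tile by tile, so only tile_size inputs
    # (and the matching slice of each weight row) are used at once.
    for start in range(0, n_in, tile_size):
        end = min(start + tile_size, n_in)
        x_tile = inputs[start:end]
        for j in range(n_out):
            w_tile = weights[j][start:end]
            # Assumed stage 1: multiply-accumulate within the tile.
            tile_sum = sum(w * x for w, x in zip(w_tile, x_tile))
            # Assumed stage 2: fold the tile result into the running sum.
            partial_sums[j] += tile_sum

    # Assumed stage 3: apply the activation after all tiles are accumulated.
    return [1.0 / (1.0 + math.exp(-s)) for s in partial_sums]
```

Changing `tile_size` leaves the result unchanged up to floating-point rounding; it only changes how much data each step touches, which is the locality property that tiling targets.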