Heterogeneous Computing for CNN

Huizi Zhang, Chang Wu, Xiao Hu
DOI: https://doi.org/10.1109/asicon.2017.8252453
2017-01-01
Abstract: As a typical machine learning algorithm, the convolutional neural network (CNN) has drawn great interest in academic research and industrial applications. However, traditional CPUs can no longer meet the computational requirements of CNNs because of their sequential computing nature. Heterogeneous computing combines CPUs with GPGPUs or FPGAs to form a much more powerful computation platform. In this paper, we present our study of a CNN implementation on a heterogeneous computing system, which achieves more than 3x runtime speedup. Our study shows that by systematically combining the high speed of the CPU with the parallel computing of the FPGA, one can achieve better computation speed than with the CPU or the FPGA alone. We also propose a systematic analysis method to partition an algorithm into a software implementation (on the CPU) and a hardware implementation (on the FPGA) and to derive a near-optimal solution for heterogeneous computing.