Caffeine: Towards Uniformed Representation and Acceleration for Deep Convolutional Neural Networks.

Chen Zhang, Guangyu Sun, Zhenman Fang, Peipei Zhou, Peichen Pan, Jason Cong
DOI: https://doi.org/10.1145/2966986.2967011
IF: 2.9
2018-01-01
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Abstract: With the recent advancement of multilayer convolutional neural networks (CNN), deep learning has achieved amazing success in many areas, especially in visual content understanding and classification. To improve the performance and energy-efficiency of the computation-demanding CNN, the FPGA-based acceleration emerges as one of the most attractive alternatives. In this paper, we design and implement Caffeine, a hardware and software co-designed library to efficiently accelerate the entire CNN on FPGAs. Based on the portable high-level synthesis, Caffeine provides a design automation flow that optimizes and generates FPGA-based AI hardware and runtime software codes. We integrate Caffeine into the industry-standard software deep learning framework. Caffeine achieves a peak performance of 365 GOPS on Xilinx KU060 FPGA and 636 GOPS on Virtex7 690t FPGA, showing up to 7.3x and 43.5x performance and energy gains over Caffe on a 12-core Xeon server, and 1.5x better energy-efficiency over the GPU.