FP-DNN: an Automated Framework for Mapping Deep Neural Networks Onto FPGAs with RTL-HLS Hybrid Templates

Yijin Guan,Hao Liang,Ningyi Xu,Wenqiang Wang,Shaoshuai Shi,Xi Chen,Guangyu Sun,Wei Zhang,Jason Cong
DOI: https://doi.org/10.1109/fccm.2017.25
2017-01-01
Abstract:DNNs (Deep Neural Networks) have demonstrated great success in numerous applications such as image classification, speech recognition, video analysis, etc. However, DNNs are much more computation-intensive and memory-intensive than previous shallow models. Thus, it is challenging to deploy DNNs in both large-scale data centers and real-time embedded systems. Considering performance, flexibility, and energy efficiency, FPGA-based accelerator for DNNs is a promising solution. Unfortunately, conventional accelerator design flows make it difficult for FPGA developers to keep up with the fast pace of innovations in DNNs. To overcome this problem, we propose FP-DNN (Field Programmable DNN), an end-to-end framework that takes TensorFlow-described DNNs as input, and automatically generates the hardware implementations on FPGA boards with RTL-HLS hybrid templates. FP-DNN performs model inference of DNNs with our high-performance computation engine and carefully-designed communication optimization strategies. We implement CNNs, LSTM-RNNs, and Residual Nets with FPDNN, and experimental results show the great performance and flexibility provided by our proposed FP-DNN framework.
What problem does this paper attempt to address?