A Deep Residual Networks Accelerator on FPGA

YaQian Zhao, Xin Zhang, Xing Fang, Long Li, XueLei Li, ZhenHua Guo, XuChen Liu
DOI: https://doi.org/10.1109/ICACI.2019.8778613
2019-01-01
Abstract: Deep residual networks play an important role in deep learning and are widely used for image classification because of their high recognition rate. Moreover, as the amount of data in data centers and embedded systems grows, performance and power consumption become key issues. FPGAs are an excellent solution and are increasingly promising for accelerating deep learning inference because of their low latency and low energy consumption. In this paper, we present an OpenCL-based acceleration framework on FPGA for deep residual networks, which shows excellent performance and a high energy-efficiency ratio. Furthermore, we propose a new strategy for handling fully-connected layers, as well as an optimization strategy for 1×1 filters. To validate our proposal, we evaluate our framework on Intel Arria 10 devices. Evaluation results show that ResNet50 on our framework achieves 54 img/s, or 1.2 img/s/W, which is 47% higher than the state-of-the-art FPGA-based design on the same device. It is also competitive with NVIDIA's M4 GPUs.
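The two headline figures can be related with simple arithmetic. The sketch below is illustrative only: it assumes that 54 img/s and 1.2 img/s/W describe the same measurement, and that the 47% improvement applies to throughput; the abstract does not state either point explicitly. The function names are hypothetical.

```python
def implied_power_watts(throughput_img_s: float, efficiency_img_s_w: float) -> float:
    """Power implied by a throughput and an energy-efficiency figure.

    Assumes both figures come from the same run (an assumption, not
    something the abstract confirms).
    """
    return throughput_img_s / efficiency_img_s_w


def implied_baseline(value: float, improvement_pct: float) -> float:
    """Baseline implied by 'value is improvement_pct percent higher'."""
    return value / (1.0 + improvement_pct / 100.0)


if __name__ == "__main__":
    power = implied_power_watts(54.0, 1.2)
    baseline = implied_baseline(54.0, 47.0)
    print(f"Implied power: about {power:.1f} W")
    print(f"Implied baseline throughput: about {baseline:.1f} img/s")
```

Under these assumptions, the reported numbers correspond to roughly 45 W of power and a prior FPGA design at roughly 37 img/s.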