PIE: A Pipeline Energy-Efficient Accelerator for Inference Process in Deep Neural Networks

Yangyang Zhao, Qi Yu, Xuda Zhou, Xuehai Zhou, Chao Wang, Xi Li
DOI: https://doi.org/10.1109/icpads.2016.0141
2016-01-01
Abstract: Accelerating the inference process of deep neural networks (DNNs) with hardware accelerators based on field-programmable gate arrays (FPGAs) has become an active research topic. Because of the layer-wise structure of DNNs and the data dependency between layers, previous studies commonly exploit the inherent parallelism within a single layer to reduce computation time but neglect the parallelism between layers. In this paper, we propose a pipelined, energy-efficient accelerator named PIE that accelerates DNN inference by pipelining two adjacent layers. By implementing two adjacent layers in different calculation orders, the data dependency between them can be weakened: as soon as a layer produces an output, the next layer reads it as an input and immediately starts its parallel computation using the other calculation order. In this way, the computations of adjacent layers are pipelined. We conduct our experiments on a ZedBoard development kit with a Xilinx Zynq-7000 FPGA and compare against an Intel Core i7 4.0 GHz CPU and an NVIDIA K40C GPU. Experimental results indicate that PIE is 4.82x faster than the CPU and reduces energy consumption relative to the CPU and GPU by 355.35x and 12.02x, respectively. In addition, compared with the non-pipelined method in which layers are processed serially, PIE improves performance by nearly 50%.
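
The idea of running two adjacent layers in different calculation orders can be illustrated with a small software sketch. This is only one plausible reading of the abstract, not the authors' hardware design. The function names, the sigmoid activation, and the omission of biases are all assumptions made for illustration. The first layer finishes one output neuron at a time. The second layer consumes each value as soon as it arrives and adds its contribution to all of its own partial sums, so it never waits for the full hidden vector.

```python
import math


def sigmoid(x):
    # Assumed activation; the abstract does not name one.
    return 1.0 / (1.0 + math.exp(-x))


def layer1_stream(x, w1):
    """First layer, 'output-first' order (hypothetical name).

    Each hidden neuron j is computed completely (a full dot product)
    and emitted immediately, one value at a time.
    w1[i][j] is the weight from input i to hidden neuron j.
    """
    for j in range(len(w1[0])):
        s = sum(x[i] * w1[i][j] for i in range(len(x)))
        yield j, sigmoid(s)


def pipelined_forward(x, w1, w2):
    """Second layer, 'input-first' order (hypothetical name).

    As soon as hidden value h_j arrives, its contribution h_j * w2[j][k]
    is accumulated into every output neuron k. No full hidden vector is
    ever buffered, which is what lets the two layers overlap.
    """
    acc = [0.0] * len(w2[0])
    for j, h in layer1_stream(x, w1):
        for k in range(len(acc)):
            acc[k] += h * w2[j][k]
    return [sigmoid(a) for a in acc]


def serial_forward(x, w1, w2):
    """Reference: finish the whole first layer, then run the second."""
    hidden = [
        sigmoid(sum(x[i] * w1[i][j] for i in range(len(x))))
        for j in range(len(w1[0]))
    ]
    return [
        sigmoid(sum(hidden[j] * w2[j][k] for j in range(len(hidden))))
        for k in range(len(w2[0]))
    ]
```

Both functions compute the same result, and only the order of operations differs. In software the generator merely interleaves the two layers. On an FPGA the inner accumulation loop over `k` could run in parallel while the first layer is already working on the next hidden neuron.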