An FPGA-based HEVC post-processing CNN hardware accelerator

Jun XIA, Lei QIAN, Wei YAN, Zhi-lei CHAI
DOI: https://doi.org/10.3969/j.issn.1007-130X.2018.12.005
2019-01-01
Abstract: To address the shortcomings of running the High Efficiency Video Coding (HEVC) post-processing CNN algorithm on general-purpose platforms, we propose a parallel hardware architecture for the post-processing convolutional neural network based on a field programmable gate array (FPGA). The architecture optimizes concurrent data input and output buffering to improve the overall parallelism of the convolution module and the hardware pipelining of the modules. Experiments on 176×144 video streams on the Xilinx ZCU102 show that the proposed CNN hardware accelerator achieves an equivalent computational performance of 360.5 GFLOPS (giga floating-point operations per second). The processing speed reaches 81.01 FPS, which is 76.67 times faster than the Intel i7-4790K at a clock frequency of 4 GHz and 32.50 times faster than the NVIDIA GeForce GTX 750 Ti. In terms of energy efficiency, the accelerator consumes 12.095 W, and its energy efficiency ratio is 512.9 times that of the Intel i7-4790K and 125.78 times that of the NVIDIA GeForce GTX 750 Ti.
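The figures quoted in the abstract can be cross-checked with some simple arithmetic. The sketch below is not from the paper: the function names are hypothetical, and the baseline power estimate assumes that "energy efficiency ratio" means throughput per watt on the same workload, which the abstract does not state explicitly.

```python
# Figures quoted in the abstract.
FPGA_GFLOPS = 360.5      # equivalent computational performance
FPGA_FPS = 81.01         # frames per second on 176x144 video
FPGA_POWER_W = 12.095    # reported power consumption

# Speedup and energy-efficiency gain of the FPGA over each baseline.
SPEEDUP = {"Intel i7-4790K": 76.67, "NVIDIA GeForce GTX 750 Ti": 32.50}
EFFICIENCY_GAIN = {"Intel i7-4790K": 512.9, "NVIDIA GeForce GTX 750 Ti": 125.78}


def gflop_per_frame(gflops: float, fps: float) -> float:
    """Equivalent floating-point work per frame, in GFLOP."""
    return gflops / fps


def gflops_per_watt(gflops: float, watts: float) -> float:
    """Throughput per watt of the accelerator."""
    return gflops / watts


def implied_baseline_fps(fpga_fps: float, speedup: float) -> float:
    """Frame rate of a baseline platform implied by the quoted speedup."""
    return fpga_fps / speedup


def implied_baseline_power(fpga_watts: float, speedup: float, gain: float) -> float:
    """Baseline power implied by the quoted ratios.

    Assumption (not stated in the abstract): efficiency = FPS / W, so
    gain = speedup * (P_baseline / P_fpga).
    """
    return fpga_watts * gain / speedup


if __name__ == "__main__":
    print(f"GFLOP per frame: {gflop_per_frame(FPGA_GFLOPS, FPGA_FPS):.2f}")
    print(f"GFLOPS per watt: {gflops_per_watt(FPGA_GFLOPS, FPGA_POWER_W):.2f}")
    for name, speedup in SPEEDUP.items():
        fps = implied_baseline_fps(FPGA_FPS, speedup)
        watts = implied_baseline_power(FPGA_POWER_W, speedup, EFFICIENCY_GAIN[name])
        print(f"{name}: about {fps:.2f} FPS, about {watts:.1f} W (implied)")
```

Under that assumption, the implied baseline power values fall in a range that is plausible for a desktop CPU and a mid-range GPU, which suggests the reported ratios are internally consistent. These are derived estimates, not measurements from the paper.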