Design and implementation of FPGA-based deep learning object detection system

Chen Chen, Wei Yan, Jun Xia, Zhilei Chai
DOI: https://doi.org/10.16157/j.issn.0258-7998.190318
2019-01-01
Abstract: To address the high computational complexity and large memory requirements of current object detection algorithms, we designed and implemented an FPGA-based deep learning object detection system. We designed a hardware accelerator for the YOLOv2-Tiny object detection algorithm, modeled the processing delay of each accelerator module, and described the design of the convolution module. Experimental results show performance and energy gains of 5.5x and 94.6x, respectively, over the software Darknet implementation on an 8-core Xeon server, and of 84.8x and 67.5x, respectively, over the software version on the dual-core ARM Cortex-A9 of the Zynq. The design also outperforms previous work in performance.