Design and Implementation of YOLOv3-Tiny Accelerator Based on PYNQ-Z2 Heterogeneous Platform

Zhengjie Zhou, Yumei Liu, Yidong Xu
DOI: https://doi.org/10.1145/3443467.3443911
2020-01-01
Abstract: Convolutional neural networks (CNNs) are widely used in computer vision tasks such as image recognition and object detection. However, many practical applications require low latency and low power consumption in the forward inference stage. To address this, a convolutional neural network system is designed and implemented on an FPGA using optimization methods such as channel interleaving, multi-channel transmission, and multi-level union. In analyzing the performance and resource consumption of the accelerator, the actual transmission delay is also taken into account to reduce the delay error; input and output modules are added to reduce the time spent on image preprocessing and postprocessing. In this work, the YOLOv3-Tiny model is implemented on the Xilinx PYNQ-Z2 (ARM+FPGA) platform. Experimental results show that, compared with the CPU, the accelerator is substantially better in energy efficiency and running time, and that it improves on some previous works.