A Scalable OpenCL-Based FPGA Accelerator for YOLOv2

Ke Xu, Xiaoyun Wang, Dong Wang
DOI: https://doi.org/10.1109/fccm.2019.00058
2019-01-01
Abstract: This paper implements an OpenCL-based FPGA accelerator for YOLOv2 on an Arria-10 GX1150 FPGA board. The hardware architecture adopts a scalable pipeline design to support multi-resolution input images, and improves resource utilization through full 8-bit fixed-point computation and fusion of the CONV, BN, and Leaky-ReLU layers. The proposed design achieves a peak throughput of 566 GOPs at a working frequency of 190 MHz. The accelerator runs YOLOv2 inference with 288×288 input resolution at 35 FPS and tiny YOLOv2 with 416×416 input resolution at 71 FPS.
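
As an illustration of the CONV+BN+Leaky-ReLU fusion named in the abstract, the following is a minimal sketch of the standard batch-norm folding identity for a single output channel. It is not the authors' OpenCL kernel: the function names (`fold_bn`, `conv_bn_leaky`, `fused_conv_leaky`) are hypothetical, the arithmetic is floating-point rather than 8-bit fixed-point, and the Leaky-ReLU slope of 0.1 is an assumption based on the usual YOLOv2 configuration.

```python
import math

def fold_bn(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold one channel's batch-norm parameters into its conv weights and bias."""
    scale = gamma / math.sqrt(var + eps)
    folded_w = [w * scale for w in weights]
    folded_b = beta + (bias - mean) * scale
    return folded_w, folded_b

def leaky_relu(x, slope=0.1):
    # Slope 0.1 is assumed here; it is not stated in the abstract.
    return x if x > 0 else slope * x

def conv_bn_leaky(x, weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Unfused reference: convolution (dot product), then BN, then Leaky-ReLU."""
    y = sum(w * xi for w, xi in zip(weights, x)) + bias
    y = gamma * (y - mean) / math.sqrt(var + eps) + beta
    return leaky_relu(y)

def fused_conv_leaky(x, folded_w, folded_b):
    """Fused path: a single multiply-accumulate pass followed by Leaky-ReLU."""
    y = sum(w * xi for w, xi in zip(folded_w, x)) + folded_b
    return leaky_relu(y)

# Illustrative values for one receptive field and one output channel.
x = [1.0, -2.0, 0.5]
weights = [0.2, 0.4, -0.6]
bias, gamma, beta, mean, var = 0.1, 1.5, 0.3, -0.2, 0.25

reference = conv_bn_leaky(x, weights, bias, gamma, beta, mean, var)
folded_w, folded_b = fold_bn(weights, bias, gamma, beta, mean, var)
fused_out = fused_conv_leaky(x, folded_w, folded_b)
```

Because the folding is done once, offline, the fused path needs no separate normalization stage at inference time, which is the general reason such fusion saves hardware resources.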