Efficient CNN Accelerator on FPGA

S Kala, S Nalesh
DOI: https://doi.org/10.1080/03772063.2020.1821797
IF: 1.8768
2020-09-24
IETE Journal of Research
Abstract: Convolutional neural networks (CNNs) are classical models for computer vision and machine learning applications such as video surveillance, pattern recognition, weather forecasting, traffic, and safety. CNNs involve computationally intensive operations and require huge off-chip memory bandwidth, which makes them challenging to deploy on real-time embedded systems. Compared to central processing units and graphics processing units, field-programmable gate array (FPGA)-based CNN implementations are gaining popularity owing to their flexibility and efficiency. In this work, we present an efficient, high-performance CNN accelerator based on a blocked Winograd-GEMM architecture. We implement the ResNet-18 CNN model on an XC7VX690T FPGA using the proposed architecture. This implementation operates at a clock frequency of 200 MHz and gives an average throughput of 383 GOPS, which is comparable to other state-of-the-art implementations. This manuscript is an extended version of [S. Kala, J. Mathew, B. R. Jose, and S. Nalesh, “UniWiG: Unified Winograd-GEMM Architecture for Accelerating CNN on FPGAs,” in 2019 32nd International Conference on VLSI Design and 2019 18th International Conference on Embedded Systems (VLSID), Delhi, NCR, India, 2019, pp. 209–214. DOI: 10.1109/VLSID.2019.00055.].
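As background for the Winograd part of the architecture named in the abstract, the following is a minimal sketch of the standard one-dimensional Winograd minimal filtering algorithm F(2,3), which produces two outputs of a 3-tap filter with four multiplications instead of six. It is not the paper's blocked Winograd-GEMM design, which the abstract does not detail. The function names `direct_f23` and `winograd_f23` are hypothetical and chosen for illustration only.

```python
def direct_f23(d, g):
    """Reference: two outputs of a 3-tap filter g over a 4-element tile d (6 multiplies)."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


def winograd_f23(d, g):
    """Same two outputs using the Winograd F(2,3) algorithm (4 multiplies)."""
    # Filter transform: depends only on the weights, so it can be precomputed once.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Input transform: additions and subtractions only.
    v0 = d[0] - d[2]
    v1 = d[1] + d[2]
    v2 = d[2] - d[1]
    v3 = d[1] - d[3]

    # Element-wise products: the only four multiplications on the data path.
    m0, m1, m2, m3 = u0 * v0, u1 * v1, u2 * v2, u3 * v3

    # Output transform: additions and subtractions only.
    return [m0 + m1 + m2, m1 - m2 - m3]


if __name__ == "__main__":
    tile = [1, 2, 3, 4]      # illustrative input values, not from the paper
    weights = [1, 0, -1]     # illustrative filter taps, not from the paper
    print(direct_f23(tile, weights))
    print(winograd_f23(tile, weights))
```

Both functions return the same pair of values for any tile and filter, up to floating-point rounding. The saving in multiplications is what makes Winograd-style convolution attractive on FPGAs, where multipliers are a limited resource.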
telecommunications, engineering, electrical & electronic