A Reconfigurable Accelerator Based on Fast Winograd Algorithm for Convolutional Neural Network in Internet of Things

Chen Yang, YiZhou Wang, XiaoLi Wang, Li Geng
DOI: https://doi.org/10.1109/icsict.2018.8565722
2018-01-01
Abstract: Convolutional neural networks (CNNs) are an effective method widely applied in various computer vision tasks. However, as CNNs grow more complicated, it becomes increasingly difficult to achieve efficiency and configurability at the same time, especially on Internet of Things (IoT) devices. To address this problem, this paper proposes a Winograd-based reconfigurable architecture (WRA) for CNN acceleration. By fully exploiting the parallelism of the Winograd algorithm, computational complexity can be greatly reduced. Furthermore, a highly reusable data buffer architecture is proposed to maximize data reuse and minimize external memory access, which helps eliminate the memory bandwidth bottleneck. In addition, the WRA accelerator is dynamically reconfigurable, so most CNN models can be implemented quickly. WRA was implemented on the Xilinx XC7Z102 platform and runs at a clock frequency of 150 MHz. When tested on ShuffleNet, the WRA accelerator achieved an average convolution performance of 2137.2 GOP/s and an energy efficiency of 82.39 GOPS/W, which satisfies both the performance requirements and the power limitations of most IoT scenarios.
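The complexity reduction attributed to the Winograd algorithm can be illustrated with its smallest common case, F(2,3), which produces two outputs of a 3-tap filter using 4 multiplications instead of the 6 needed by direct computation. The abstract does not say which tile size WRA uses, so the choice of F(2,3) here is an assumption, and the function names `winograd_f23` and `direct_f23` are hypothetical. This is a minimal software sketch of the arithmetic, not the accelerator's design.

```python
def direct_f23(d, g):
    """Direct 3-tap sliding dot product over 4 inputs: 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Winograd F(2,3): the same two outputs with 4 multiplications.

    Illustrative only; the tile size used by WRA is not given in the abstract.
    """
    d0, d1, d2, d3 = d
    g0, g1, g2 = g

    # Filter transform: depends only on the weights, so it can be precomputed.
    u0 = g0
    u1 = (g0 + g1 + g2) / 2
    u2 = (g0 - g1 + g2) / 2
    u3 = g2

    # Input transform (additions only) followed by 4 element-wise products.
    m1 = (d0 - d2) * u0
    m2 = (d1 + d2) * u1
    m3 = (d2 - d1) * u2
    m4 = (d1 - d3) * u3

    # Output transform (additions only).
    return [m1 + m2 + m3, m2 - m3 - m4]


if __name__ == "__main__":
    d = [1, 2, 3, 4]
    g = [1, 0, -1]
    print(direct_f23(d, g))
    print(winograd_f23(d, g))
```

The four products `m1` to `m4` are independent of one another, which is the kind of parallelism a hardware implementation can exploit. The 2D variant nests the same transforms over rows and columns.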