FWUA: A Flexible Winograd-Based Uniform Accelerator for 1D/2D/3D CNNs
Jian Wang, Huipeng Deng, Huafeng Ye, Shanlin Xiao, Zhiyi Yu
DOI: https://doi.org/10.1109/icta53157.2021.9661647
2021-01-01
Abstract: Convolutional neural networks (CNNs) have proven promising in applications such as audio recognition, image classification, and video understanding. CNNs of different dimensionality (e.g., 1D, 2D, and 3D CNNs) have been proposed to suit these applications, so accelerating them all calls for a uniform accelerator. Implementing one is challenging for two reasons. First, computational complexity, network mapping methods, and data reuse strategies vary greatly among CNNs of different dimensionality. Second, efficient algorithms such as Winograd have been proposed to accelerate CNNs, but their implementations lack flexible support for different network types: Winograd-based accelerators are typically designed for stride 1, and non-unit-stride methods have not been implemented for 3D CNNs. To address these challenges, we propose a flexible Winograd-based uniform accelerator (FWUA) for 1D/2D/3D CNNs. With adaptive support for different dimensions, strides, and filter sizes, FWUA is runtime-reconfigurable across CNN applications of different dimensionality, i.e., audio, image, and video. FWUA is verified on a Xilinx ZCU102 FPGA evaluation board. Our design achieves a DSP efficiency of 1.51/1.13/0.66 GOPS/DSP and an energy efficiency of 242/181/105 GOPS/W on C3D, VGG-16, and HAR-CNNs, up to 2x higher than state-of-the-art FPGA works.
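For readers unfamiliar with the Winograd algorithm the abstract refers to, the following is a minimal sketch of the standard 1D, stride-1 case F(2,3), which produces two outputs of a 3-tap filter with 4 multiplications instead of 6. It is illustrative only and is not the FWUA datapath; the function names `direct_f23` and `winograd_f23` are hypothetical, and the abstract does not state which tile sizes FWUA uses.

```python
def direct_f23(d, g):
    """Reference: two outputs of a 3-tap, stride-1 sliding dot product (6 multiplies)."""
    y0 = d[0] * g[0] + d[1] * g[1] + d[2] * g[2]
    y1 = d[1] * g[0] + d[2] * g[1] + d[3] * g[2]
    return [y0, y1]


def winograd_f23(d, g):
    """Winograd F(2,3): the same two outputs using 4 element-wise multiplies."""
    # Filter transform; in an accelerator this is usually precomputed once per filter.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Input transform: additions and subtractions only.
    v0 = d[0] - d[2]
    v1 = d[1] + d[2]
    v2 = d[2] - d[1]
    v3 = d[1] - d[3]

    # The four multiplications.
    m0, m1, m2, m3 = v0 * u0, v1 * u1, v2 * u2, v3 * u3

    # Output transform: additions and subtractions only.
    return [m0 + m1 + m2, m1 - m2 - m3]


if __name__ == "__main__":
    tile = [1.0, 2.0, 3.0, 4.0]   # 4 input samples
    taps = [1.0, 0.0, -1.0]       # 3 filter taps
    print(direct_f23(tile, taps), winograd_f23(tile, taps))
```

The transforms above assume a stride of 1, which is the case the abstract describes as typical for Winograd-based accelerators; handling other strides requires a different decomposition that this sketch does not attempt.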