3D-NWA: A Nested-Winograd Accelerator for 3D CNNs

Huafeng Ye, Huipeng Deng, Jian Wang, Mingyu Wang, Zhiyi Yu
DOI: https://doi.org/10.1109/icta56932.2022.9963033
2022-01-01
Abstract: 3D convolutional neural networks (3D CNNs) perform better in some scenarios, such as video understanding and 3D medical image diagnosis. As the dimension and size of the convolution kernel grow, the computational complexity and implementation difficulty of CNNs increase sharply. The Winograd transformation can significantly reduce the number of multiplications in convolution operations. However, applying it to large convolution filters introduces numerical instability. In this article, we present a novel method, the 3D nested Winograd algorithm, to address this problem. Compared with the state-of-the-art OLA-Winograd algorithm, the proposed algorithm reduces the number of multiplications by 1.72× to 5.83× for computing 5 × 5 × 5 to 9 × 9 × 9 convolutions. Finally, we demonstrate the efficiency of 3D-NWA on an FPGA platform (Xilinx VCU118) and achieve a DSP efficiency up to 4.67× that of state-of-the-art accelerators.
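To illustrate the claim that a Winograd transformation reduces the number of multiplications, here is a minimal sketch of the standard 1D minimal-filtering case F(2, 3), which produces two outputs of a 3-tap filter with 4 multiplications instead of 6. This is the textbook building block only, not the paper's 3D nested algorithm; the function names `winograd_f23` and `direct_f23` are assumptions made for this example.

```python
def direct_f23(d, g):
    """Direct 3-tap correlation over 4 inputs: 2 outputs, 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Winograd F(2, 3): the same 2 outputs with 4 data-by-filter multiplications.

    The filter-side terms (u0..u3) depend only on g, so in a CNN they can be
    precomputed once per filter and are not counted as runtime multiplications.
    """
    # Filter transform (precomputable).
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Input transform uses additions and subtractions only.
    v0 = d[0] - d[2]
    v1 = d[1] + d[2]
    v2 = d[2] - d[1]
    v3 = d[1] - d[3]

    # The four element-wise multiplications.
    m1, m2, m3, m4 = v0 * u0, v1 * u1, v2 * u2, v3 * u3

    # Output transform, again additions and subtractions only.
    return [m1 + m2 + m3, m2 - m3 - m4]


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    taps = [1, 1, 1]
    print(direct_f23(data, taps))
    print(winograd_f23(data, taps))
```

The division by 2 in the filter transform hints at why larger tiles and filters become numerically delicate: the transform constants grow in range as the tile size increases, which is the instability the abstract refers to.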