Efficient Reconfigurable Hardware Core for Convolutional Neural Networks.

Haonan Wang, Jun Lin, Yi Xie, Bo Yuan, Zhongfeng Wang
DOI: https://doi.org/10.1109/acssc.2018.8645259
2018-01-01
Abstract: The Convolutional Neural Network (CNN) is one of the most promising methods in modern machine learning, but its intensive demand for computing resources limits its application on embedded systems. Since the energy consumption of a CNN is dominated by convolutions, methods such as Winograd and fast FIR algorithms (FFA) have been introduced to reduce the computational complexity of convolutions. However, hardware implementations of these algorithms lose efficiency when processing different CNN models, because their fixed architectures cannot efficiently support all convolution kernel sizes. In this paper, for the first time, we propose an FFA-based all-size Reconfigurable Convolution Core (RCC) to tackle this problem. The proposed RCC can efficiently perform convolutions with 5 mainstream kernel sizes, while achieving a significant reduction in computational complexity compared with the conventional convolution architecture. Considering the strict resource budget of embedded systems, we explore a large design space to obtain an optimal tradeoff between hardware utilization and reconfigurability. Moreover, we propose an overlapping dataflow scheme for the RCC to reduce the communication bandwidth requirement. The synthesis result shows that the proposed design can run at over 600 MHz.
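To make the complexity reduction mentioned in the abstract concrete, the following is a minimal sketch of the textbook 2-parallel fast FIR algorithm, not the paper's RCC architecture. It splits the filter and the input into even and odd phases and produces the output with three half-length sub-filters instead of the four a direct polyphase form needs. The function names `conv` and `ffa2` are hypothetical, chosen only for this illustration, and the sketch assumes both sequences have even length.

```python
def conv(a, b):
    """Direct full linear convolution, used here as the reference result."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def ffa2(h, x):
    """2-parallel fast FIR sketch: three half-length sub-filters instead of four.

    Assumes len(h) and len(x) are both even (an assumption of this sketch).
    """
    if len(h) % 2 or len(x) % 2:
        raise ValueError("this sketch assumes even-length h and x")

    # Polyphase split into even-indexed and odd-indexed samples.
    h0, h1 = h[0::2], h[1::2]
    x0, x1 = x[0::2], x[1::2]

    # The three sub-filter products.
    a = conv(h0, x0)                          # H0 * X0
    b = conv(h1, x1)                          # H1 * X1
    c = conv([p + q for p, q in zip(h0, h1)],
             [p + q for p, q in zip(x0, x1)])  # (H0 + H1) * (X0 + X1)

    m = len(a)                 # length of each sub-filter output
    y = [0] * (2 * m + 1)      # equals len(h) + len(x) - 1

    # Even outputs: y[2k] = a[k] + b[k-1] (b delayed by one sample).
    for k in range(m + 1):
        y[2 * k] = (a[k] if k < m else 0) + (b[k - 1] if k >= 1 else 0)

    # Odd outputs: the cross terms H0*X1 + H1*X0 recovered as c - a - b.
    for k in range(m):
        y[2 * k + 1] = c[k] - a[k] - b[k]

    return y


h = [1, 2, 3, 4]
x = [1, 0, -1, 2, 3, 5]
print(ffa2(h, x) == conv(h, x))
```

In this form the multiplication count drops from len(h) * len(x) to three quarters of that, at the cost of extra additions before and after the sub-filters. This is the general trade-off that fast convolution hardware exploits.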