Work-in-Progress: BloCirNN: an Efficient Software/hardware Codesign Approach for Neural Network Accelerators with Block-Circulant Matrix

Yunji Qin, Lei Gong, Zhenrong Zheng, Chao Wang
DOI: https://doi.org/10.1109/codes-isss55005.2022.00010
2022-01-01
Abstract: Deep neural networks keep growing in scale, and these large models are both compute- and memory-intensive. To address both problems, we use block-circulant weight matrices and the Fast Fourier Transform (FFT) to compress the model and optimize computation. Unlike weight pruning, this method does not produce irregular network structures. The main contributions of this paper are the implementation of a convolution module and a fully-connected module with High-Level Synthesis (HLS), and their deployment and performance testing on an FPGA platform. We use AlexNet as a case study, which demonstrates that our design is more efficient than the FPGA2016 design.
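The block-circulant idea in the abstract can be illustrated with a minimal software sketch. This is not the paper's HLS implementation. It assumes each b x b block is a circulant matrix defined by its first column, so a block stores b values instead of b^2, and its product with a vector is a circular convolution that can be computed with FFTs. All function names here (`fft`, `ifft`, `block_circulant_matvec`, `dense_from_blocks`) are hypothetical, the block size is assumed to be a power of two, and the FFT is a plain recursive radix-2 version chosen for readability rather than speed.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1  # the inverse transform flips the exponent sign
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(a):
    """Inverse FFT, including the 1/n scaling."""
    n = len(a)
    return [v / n for v in fft(a, invert=True)]


def block_circulant_matvec(blocks, x, b):
    """Compute y = W x for a block-circulant W.

    blocks[i][j] is the length-b first column of circulant block (i, j);
    x has length q * b, where q is the number of block columns.
    """
    q = len(x) // b
    # Transform each input segment once and reuse it for every block row.
    x_freq = [fft(x[j * b:(j + 1) * b]) for j in range(q)]
    y = []
    for row in blocks:
        acc = [0j] * b
        for j, w in enumerate(row):
            w_freq = fft(w)
            # Element-wise product in the frequency domain equals
            # circular convolution in the time domain.
            acc = [a + p * s for a, p, s in zip(acc, w_freq, x_freq[j])]
        # One inverse FFT per block row, after accumulating all blocks.
        y.extend(v.real for v in ifft(acc))
    return y


def dense_from_blocks(blocks, b):
    """Expand the compressed blocks into a dense matrix (for checking only)."""
    p, q = len(blocks), len(blocks[0])
    dense = [[0.0] * (q * b) for _ in range(p * b)]
    for i in range(p):
        for j in range(q):
            w = blocks[i][j]
            for r in range(b):
                for c in range(b):
                    dense[i * b + r][j * b + c] = w[(r - c) % b]
    return dense
```

For a weight matrix split into p x q blocks, this representation stores p * q * b values rather than p * q * b^2, and each block product costs O(b log b) instead of O(b^2). Accumulating in the frequency domain before the inverse transform is one common design choice; whether the paper's hardware modules do the same is not stated in the abstract.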