Hardware Acceleration of Convolutional Neural Network Based on 3D-Cube Structure

SUI Yuanfeng, CHANG Liang, ZHAO Simeng, CHANG Yuchun
DOI: https://doi.org/10.19304/j.cnki.issn1000-7180.2021.08.006
2021-01-01
Abstract: Traditional convolutional neural networks require a large number of computing units and excessive data access, resulting in slow computation and low efficiency. A new data block structure is designed to make full use of data reuse, greatly reducing the number of data reads and fully exploiting the parallel computing resources of the FPGA. In this way, multiple multiply-add operations are carried out simultaneously, realizing an efficient parallel convolution circuit. The weight and bias parameters are separately fused, optimized, and quantized to reduce memory usage. With VGG16 as the test network on the ImageNet data set, the accuracy loss was only 0.02%. At a clock frequency of 200 MHz, the throughput reached 129.6 GOPS and the power consumption was only 5.26 W.
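As a rough sanity check on the reported figures, the throughput and clock frequency together imply an average number of operations completed per clock cycle. The short sketch below works through that arithmetic. The function name `ops_per_cycle` is hypothetical, and counting one multiply-accumulate as two operations is a common convention assumed here, not something the abstract states.

```python
def ops_per_cycle(throughput_gops: float, clock_mhz: float) -> float:
    """Average operations completed per clock cycle.

    throughput_gops: throughput in giga-operations per second (GOPS)
    clock_mhz: clock frequency in MHz
    """
    return (throughput_gops * 1e9) / (clock_mhz * 1e6)


# Figures taken from the abstract: 129.6 GOPS at 200 MHz.
ops = ops_per_cycle(129.6, 200.0)

# Assumption (not from the abstract): 1 multiply-accumulate = 2 operations.
macs = ops / 2

print(f"operations per cycle: {ops:.1f}")
print(f"multiply-accumulates per cycle (assumed 2 ops each): {macs:.1f}")
```

Under that convention, the result is about 648 operations, or roughly 324 multiply-accumulates, per cycle on average. This is an average over the whole run, so it gives only a lower bound on how many parallel multiply-add units the design would need, not the actual size of the array.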