A FPGA-based Accelerator of Convolutional Neural Network for Face Feature Extraction

Ru Ding, Guangda Su, Guoqiang Bai, Wei Xu, Nan Su, Xingjun Wu
DOI: https://doi.org/10.1109/edssc.2019.8754067
2019-01-01
Abstract: The Convolutional Neural Network (CNN), a typical deep learning model, has been widely used to solve many complex problems. However, its computation-intensive convolutional layers and memory-intensive fully connected layers limit the implementation of CNNs on embedded platforms. In this paper we propose an FPGA-based accelerator for face feature extraction that supports acceleration of the entire CNN. In our design, all CNN layers are optimized and deployed separately and independently using hand-coded Verilog templates rather than a high-level synthesis (HLS) tool. The RTL-designed layers use an optimized parallelism strategy for the convolutional layers and a pipelined structure for the convolutional and pooling layers to achieve high resource utilization. For the fully connected layers, a batch-based method is applied to reduce the number of data accesses. Moreover, a dynamic fixed-point quantization strategy is adopted to reduce resource consumption. As a result, an “FPGA+ARM” system completes the hardware acceleration of the CNN, with a precision error of less than 1% compared with software.
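The dynamic fixed-point quantization mentioned in the abstract can be illustrated with a minimal software sketch. The abstract does not give the word length or the rule used to pick the fractional length, so the 8-bit signed word and the largest-magnitude rule below are assumptions, and the function names `choose_frac_bits`, `quantize`, and `dequantize` are hypothetical. This is Python for illustration only, not the authors' Verilog.

```python
import math

def choose_frac_bits(values, word_bits=8):
    """Pick a fractional length so the largest magnitude fits in a signed word.

    Assumption: one fractional length per group of values (e.g. per layer),
    chosen from the largest absolute value in that group.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return word_bits - 1
    # Bits needed for the integer part; negative when all values are small.
    int_bits = math.floor(math.log2(max_abs)) + 1
    # One bit is reserved for the sign.
    return word_bits - 1 - int_bits

def quantize(values, frac_bits, word_bits=8):
    """Scale by 2**frac_bits, round, and saturate to the signed word range."""
    lo = -(1 << (word_bits - 1))
    hi = (1 << (word_bits - 1)) - 1
    scale = 2.0 ** frac_bits
    return [min(hi, max(lo, round(v * scale))) for v in values]

def dequantize(codes, frac_bits):
    """Map integer codes back to real values."""
    scale = 2.0 ** frac_bits
    return [c / scale for c in codes]

# Illustrative values only; not taken from the paper.
weights = [0.9, -0.5, 0.25]
activations = [3.7, -1.2]

w_frac = choose_frac_bits(weights)      # small values get more fractional bits
a_frac = choose_frac_bits(activations)  # larger values get fewer
w_codes = quantize(weights, w_frac)
a_codes = quantize(activations, a_frac)
```

Because each group gets its own fractional length, small-magnitude weights keep more precision than they would under a single global format, which is the general motivation for the "dynamic" variant of fixed-point quantization.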