A Reconfigurable DNN Training Accelerator on FPGA

Jinming Lu, Jun Lin, Zhongfeng Wang
DOI: https://doi.org/10.1109/sips50750.2020.9195234
2020-01-01
Abstract: In recent years, deep neural networks (DNNs) have been widely applied to a variety of tasks, demonstrating outstanding performance. For DNNs to spread further into practical applications, efficient hardware implementation is becoming a critical issue. With the rise of online learning, training DNNs on resource-constrained platforms has recently attracted growing attention. In this paper, we propose an FPGA-based accelerator for efficient DNN training. First, a reconfigurable processing element is designed, which flexibly supports the various computation patterns that arise during training within a unified architecture. Second, a well-optimized architecture is presented to perform the computation of batch normalization layers in the different stages of training. Finally, a prevailing model (ResNet-20) for the CIFAR-10 dataset is implemented on the Xilinx VC706 platform with our framework. Experimental results show that our design achieves 421 GOPS and 43.18 GOPS/W in terms of throughput and energy efficiency, respectively. The comparison results illustrate that our accelerator significantly outperforms prior works.
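To make concrete what "batch normalization layers in different stages" involves, the following is a minimal software sketch of the standard batch normalization arithmetic for the forward and backward passes of training. It is not the paper's hardware architecture, which the abstract does not describe; the function names `bn_forward` and `bn_backward`, the single-feature layout, and the `eps` value are assumptions made for illustration only.

```python
import math

# Illustrative reference only: textbook batch normalization for ONE feature
# over a mini-batch. Names and eps are assumptions, not taken from the paper.

def bn_forward(x, gamma, beta, eps=1e-5):
    """Forward stage: normalize the batch, then scale and shift."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n      # biased batch variance
    inv_std = 1.0 / math.sqrt(var + eps)
    x_hat = [(v - mean) * inv_std for v in x]      # normalized activations
    y = [gamma * h + beta for h in x_hat]
    return y, (x_hat, inv_std)                     # cache reused in backward

def bn_backward(dy, gamma, cache):
    """Backward stage: gradients w.r.t. the input, gamma, and beta."""
    x_hat, inv_std = cache
    n = len(dy)
    dbeta = sum(dy)                                # reduction over the batch
    dgamma = sum(d * h for d, h in zip(dy, x_hat)) # second batch reduction
    dx = [gamma * inv_std / n * (n * d - dbeta - h * dgamma)
          for d, h in zip(dy, x_hat)]
    return dx, dgamma, dbeta

x = [1.0, 2.0, 4.0, 7.0]
y, cache = bn_forward(x, gamma=1.5, beta=0.5)
dx, dgamma, dbeta = bn_backward([0.1, -0.2, 0.3, 0.4], 1.5, cache)
```

The sketch shows why the two stages differ in computation pattern: the forward pass needs two batch-wide reductions (mean and variance) before any output can be produced, and the backward pass needs two more (`dbeta` and `dgamma`) before any input gradient can be produced.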