An FPGA-based Mix-grained Sparse Training Accelerator

Yingchang Mao, Qiang Liu
DOI: https://doi.org/10.1109/ICFPT59805.2023.00043
2023-01-01
Abstract: Recently, training deep neural networks (DNNs) on edge devices has attracted much attention because it adapts well to local data and avoids transmitting private data. However, DNN training poses challenges for edge devices with limited computational, storage, and energy resources. In this paper, we design an FPGA-based mix-grained sparse training accelerator, which employs row sparsity skipping units for coarse-grained sparsity exploitation and sparse convolution processing elements for fine-grained sparsity exploitation. The sparsity of input/weight/output in the three phases of DNN training is fully leveraged. The experimental results show that the proposed accelerator achieves a performance of 214.50 GOPS and an energy efficiency of 28.19 GOPS/W. Compared to a dense training accelerator, it achieves a maximum speedup of 16.6x.
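The idea of combining coarse-grained and fine-grained skipping can be illustrated in software. The following is only a minimal sketch under stated assumptions, not the accelerator's design: the abstract gives no details of the hardware units, so the function names (`dense_conv2d`, `mix_grained_conv2d`), the scatter-style loop order, and the restriction to input sparsity are all hypothetical choices made for illustration. It skips all-zero input rows first (coarse-grained), then skips individual zero elements in the remaining rows (fine-grained), and counts the multiply-accumulate operations actually performed.

```python
def dense_conv2d(inp, kernel):
    """Reference 'valid' 2D correlation; returns (output, MAC count)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(inp) - kh + 1, len(inp[0]) - kw + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    macs = 0
    for i in range(out_h):
        for j in range(out_w):
            for a in range(kh):
                for b in range(kw):
                    out[i][j] += inp[i + a][j + b] * kernel[a][b]
                    macs += 1
    return out, macs


def mix_grained_conv2d(inp, kernel):
    """Same result as dense_conv2d, but skips zero rows and zero elements.

    Hypothetical illustration: only input sparsity is exploited here.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(inp) - kh + 1, len(inp[0]) - kw + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    macs = 0
    for r, row in enumerate(inp):
        # Coarse-grained: an all-zero row contributes nothing, skip it whole.
        if not any(row):
            continue
        for c, x in enumerate(row):
            # Fine-grained: skip individual zero elements.
            if x == 0:
                continue
            # Scatter this nonzero input to every output position it touches.
            for a in range(kh):
                i = r - a
                if i < 0 or i >= out_h:
                    continue
                for b in range(kw):
                    j = c - b
                    if j < 0 or j >= out_w:
                        continue
                    out[i][j] += x * kernel[a][b]
                    macs += 1
    return out, macs


if __name__ == "__main__":
    # Toy sparse input with one all-zero row (values are made up).
    inp = [[1, 0, 2, 0],
           [0, 0, 0, 0],
           [0, 3, 0, 0],
           [0, 0, 0, 4]]
    kernel = [[1, 2],
              [3, 4]]
    dense_out, dense_macs = dense_conv2d(inp, kernel)
    sparse_out, sparse_macs = mix_grained_conv2d(inp, kernel)
    print("outputs match:", dense_out == sparse_out)
    print("dense MACs:", dense_macs, "sparse MACs:", sparse_macs)
```

The ratio of dense to sparse MAC counts in such a toy run only indicates how much work skipping can remove; it says nothing about the speedup figures reported for the FPGA implementation.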