An Efficient Accelerator for Sparse Convolutional Neural Networks

Weijie You, Chang Wu
DOI: https://doi.org/10.1109/asicon47005.2019.8983560
2019-01-01
Abstract: In this paper, we propose a sparse convolutional neural network accelerator design on FPGAs. Similar to the DNNWEAVER architecture, our accelerator uses a two-level hierarchy: multiple Processing Units (PUs), each comprising a set of Processing Elements (PEs). To address the irregularity of sparse neural networks, we introduce a novel sparse dataflow for sparse CNN computing, as well as a weight merging method that balances the computation load across PUs for better overall efficiency. We implement our design with 32 PUs and 14 PEs in each PU. Compared with DNNWEAVER on the VGG16 network, our accelerator, running at 150 MHz on a Xilinx ZC706 board, achieves on average a 3.49x speedup and a 3.05x energy saving, and reaches 400 GOPS.
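The load-balancing concern raised in the abstract can be illustrated with a small sketch. It is not the paper's weight merging method, which the abstract does not describe; it uses a generic greedy heuristic (heaviest filter first, onto the least-loaded PU) as a stand-in. The function names, the 4-PU configuration, and the per-filter nonzero counts are all hypothetical, and the model assumes that a PU's work is proportional to the number of nonzero weights it holds and that all PUs wait for the slowest one.

```python
import heapq


def round_robin(nonzeros, num_pus):
    """Assign filter i to PU (i % num_pus); return per-PU nonzero-weight totals."""
    loads = [0] * num_pus
    for i, nnz in enumerate(nonzeros):
        loads[i % num_pus] += nnz
    return loads


def greedy_balance(nonzeros, num_pus):
    """Generic greedy heuristic (NOT the paper's weight merging):
    place the heaviest filters first, each on the currently least-loaded PU."""
    heap = [(0, pu) for pu in range(num_pus)]  # (current load, PU index)
    heapq.heapify(heap)
    loads = [0] * num_pus
    for nnz in sorted(nonzeros, reverse=True):
        load, pu = heapq.heappop(heap)
        loads[pu] = load + nnz
        heapq.heappush(heap, (loads[pu], pu))
    return loads


def utilization(loads):
    """Fraction of PU time spent on useful work when every PU waits for the slowest."""
    return sum(loads) / (len(loads) * max(loads))


# Hypothetical nonzero-weight counts for 8 pruned filters (illustrative data only).
nonzeros = [9, 1, 1, 1, 8, 2, 1, 1]
NUM_PUS = 4  # kept small for readability; the design in the abstract uses 32 PUs

naive = round_robin(nonzeros, NUM_PUS)
balanced = greedy_balance(nonzeros, NUM_PUS)
print("round-robin loads:", naive, "utilization:", round(utilization(naive), 3))
print("greedy loads:     ", balanced, "utilization:", round(utilization(balanced), 3))
```

With this made-up input, the heaviest PU holds 17 nonzero weights under round-robin assignment and 9 after greedy balancing, which shows why an uneven distribution of nonzeros leaves most PUs idle unless the load is redistributed.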