ALSCA: A Large-Scale Sparse CNN Accelerator Using Position-First Dataflow and Input Channel Merging Approach

Yishuo Meng, Siwei Xiang, Jianfei Wang, Jia Hou, Chen Yang
DOI: https://doi.org/10.1109/tcsii.2024.3359263
2024-01-01
Abstract: Customizing accelerators for sparse convolutional neural networks (SCNNs) has been shown to be a promising way to improve the computational efficiency of CNNs. However, current sparse-based works typically employ reduced-scale convolutional engines (CEs), which limits their performance relative to earlier dense-based accelerators. Deploying large-scale CEs in sparse-based works raises three challenges: inadequate utilization of the CEs, unstable CE workloads, and complex interconnections between the CEs and the accumulators. To address these challenges, this paper first proposes a novel position-first dataflow that streamlines the interconnections between the CEs and the accumulators. In addition, an input channel merging method is devised to improve CE utilization and stabilize CE workloads. With these two schemes, a sparse-based acceleration architecture with an expanded-scale convolution array is designed and implemented on the Xilinx VCU118 platform, running at 300 MHz. With VGG16 as the workload, the proposed architecture achieves a 1.10×-3.95× speedup in actual performance and a 1.10×-2.52× improvement in DSP efficiency over current sparse-based works.
engineering, electrical & electronic