GAAS: An Efficient Group Associated Architecture and Scheduler Module for Sparse CNN Accelerators

Jingyu Wang, Zhe Yuan, Ruoyang Liu, Xiaoyu Feng, Li Du, Huazhong Yang, Yongpan Liu
DOI: https://doi.org/10.1109/TCAD.2020.2966451
IF: 2.9
2020-01-01
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Abstract: Convolutional neural networks (CNNs) have become powerful algorithms for a variety of tasks. Application-specific integrated circuits (ASICs) have been widely used to accelerate CNNs on mobile platforms because of their high energy efficiency and performance. Meanwhile, CNNs have become much sparser with the development of network pruning algorithms. Recent works have employed different methods to improve the energy efficiency and performance of ASIC accelerators by exploiting the sparsity of CNNs. However, some of these methods suffer from large output memory overhead and performance degradation induced by hash collisions. To overcome these problems, we propose GAAS: an efficient group associated architecture and scheduler module for sparse CNN accelerators. It achieves smaller output memory overhead and higher performance than the state-of-the-art accelerator. GAAS mainly consists of two parts: 1) an n-way group associated architecture to reduce the output memory overhead and 2) a scheduler module to improve the performance. In addition, a load-balancing algorithm is proposed and implemented in the scheduler module to improve performance by reducing the hash collision rate. To demonstrate the efficiency of GAAS, we implement a 4-way image-principal associated architecture with a 16x16 PE array and the scheduler module. The experimental results on AlexNet, VGG16, ResNet18, and MobileNet show that GAAS reduces the output memory overhead by 50% and improves their performance by 1.53x, 1.62x, 1.46x, and 1.55x, respectively.