A Heuristic and Greedy Weight Remapping Scheme with Hardware Optimization for Irregular Sparse Neural Networks Implemented on CIM Accelerator in Edge AI Applications

Lizhou Wu, Chenyang Zhao, Jingbo Wang, Xueru Yu, Shoumian Chen, Chen Li, Jun Han, Xiaoyong Xue, Xiaoyang Zeng
DOI: https://doi.org/10.1109/asp-dac58780.2024.10473919
2024-01-01
Abstract: Computing-in-memory (CIM) is a promising technique for hardware acceleration of neural networks (NNs) with high performance and efficiency. However, the conventional dense mapping scheme does not support the compression and optimization of irregular sparse NNs well. In this paper, we propose a heuristic and greedy weight remapping scheme for irregular sparse neural networks implemented on a CIM accelerator in edge AI applications. A genetic algorithm (GA) is used, for the first time, in the column shuffle for sparse weight remapping. Combined with exploration of the CIM granularity, this markedly increases the proportion of compressible all-zero rows. A greedy algorithm is then employed to planarize the unevenly compressed units, thereby improving the storage utilization of the crossbar. For hardware optimization, the pipeline is customized with a zero-skipping circuit to leverage bit-level activation sparsity at runtime. Our results show that the proposed remapping scheme achieves a 70%-94% utilization rate of the sparsity, and an average 1.3x increase over naive compression. The co-optimized CIM achieves a 3-7.6x speedup and 2.1-4.8x higher energy efficiency compared with the baseline for dense NNs.
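The effect of column shuffling on compressible all-zero rows can be illustrated with a toy sketch. This is not the paper's genetic algorithm or its greedy planarization step; it only compares two fixed column orders. The function name, the 4x4 weight matrix, and the group width of 2 (standing in for the CIM granularity) are hypothetical choices made for illustration.

```python
def count_zero_row_segments(weights, col_order, group_width):
    """Count (row, column-group) segments whose weights are all zero.

    Columns are taken in `col_order` and split into consecutive groups of
    `group_width` columns; each group models one compression unit (assumed).
    """
    count = 0
    for start in range(0, len(col_order), group_width):
        group = col_order[start:start + group_width]
        for row in weights:
            if all(row[c] == 0 for c in group):
                count += 1
    return count


# Hypothetical irregular sparse weight matrix (rows x columns).
weights = [
    [1, 0, 2, 0],
    [0, 3, 0, 4],
    [5, 0, 6, 0],
    [0, 7, 0, 8],
]

identity = [0, 1, 2, 3]   # original column order
shuffled = [0, 2, 1, 3]   # columns with matching zero patterns placed together

before = count_zero_row_segments(weights, identity, 2)
after = count_zero_row_segments(weights, shuffled, 2)
print(before, after)
```

In this toy case the original order leaves no all-zero row segment in either group, while the shuffled order groups columns that share a zero pattern, so half of the row segments become compressible. A search procedure such as a GA would explore column orders to maximize a count of this kind.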