A Heterogeneous Microprocessor Based on All-Digital Compute-in-Memory for End-to-End AIoT Inference

Songming Yu, Yifan He, Hongyang Jia, Wenyu Sun, Mufeng Zhou, Luchang Lei, Wentao Zhao, Guofu Ma, Huazhong Yang, Yongpan Liu
DOI: https://doi.org/10.1109/TCSII.2023.3249245
2023-01-01
Abstract: Deploying neural network (NN) models on Internet-of-Things (IoT) devices is important for enabling artificial intelligence (AI) on the edge and realizing the AI-of-Things (AIoT). However, the high energy consumption and bandwidth requirements of NN models restrict AI applications on battery-limited equipment. Compute-in-Memory (CIM), which offers high energy efficiency, provides new opportunities for deploying NNs on IoT devices. However, the design of full CIM-based systems is still at an early stage, lacking system-level demonstration and vertical optimization for running end-to-end AI applications. In this brief, we demonstrate a low-power heterogeneous microprocessor System-on-Chip (SoC) with an all-digital SRAM CIM accelerator and rich data-acquisition interfaces for end-to-end AIoT NN inference. A dedicated reconfigurable dataflow controller for CIM computation greatly lowers the bandwidth requirement on the system bus and improves execution efficiency. The all-digital SRAM CIM array embeds NAND-based bit-serial multiplication within the readout sense amplifier, balancing storage density and system-level throughput. Our chip achieves a throughput of 12.8 GOPS with an energy efficiency of 10 TOPS/W. On the four MLPerf Tiny benchmark tasks, experimental results show a 1.8x to 2.9x inference speedup over a baseline CIM processor.
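The abstract mentions NAND-based bit-serial multiplication but does not describe the circuit. As a rough software illustration of the general idea only, the sketch below streams each input one bit per cycle, gates the stored weight bits with that input bit through a NAND followed by an inversion (which recovers AND), and accumulates the shifted partial sums. The function names, the unsigned 8-bit operands, and the LSB-first ordering are assumptions made for this example; they are not taken from the paper and do not model the chip's sense amplifier or adder tree.

```python
def nand(a: int, b: int) -> int:
    """1-bit NAND of two bits (each 0 or 1)."""
    return 1 - (a & b)


def bit_serial_dot(weights, inputs, input_bits=8, weight_bits=8):
    """Unsigned dot product with inputs streamed one bit per cycle.

    Hypothetical illustration: operand widths and LSB-first order are
    assumptions, not details from the paper.
    """
    acc = 0
    for k in range(input_bits):  # one "cycle" per input bit, LSB first
        partial = 0
        for w, x in zip(weights, inputs):
            x_bit = (x >> k) & 1
            gated = 0
            for j in range(weight_bits):
                w_bit = (w >> j) & 1
                # NAND then invert gives AND: weight bit passes only if x_bit is 1
                gated |= (1 - nand(w_bit, x_bit)) << j
            partial += gated  # sum of gated weights for this cycle
        acc += partial << k  # weight the partial sum by the input bit position
    return acc


if __name__ == "__main__":
    w = [3, 5, 7]
    x = [2, 4, 6]
    print(bit_serial_dot(w, x), sum(a * b for a, b in zip(w, x)))
```

In this sketch each cycle needs only 1-bit gating per stored weight bit plus an accumulation, so the multiplier cost is traded for latency proportional to the input bit width.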