ReDCIM: Reconfigurable Digital Computing-in-Memory Processor with Unified FP/INT Pipeline for Cloud AI Acceleration
Fengbin Tu, Yiqi Wang, Zihan Wu, Ling Liang, Yufei Ding, Bongjin Kim, Leibo Liu, Shaojun Wei, Yuan Xie, Shouyi Yin
DOI: https://doi.org/10.1109/jssc.2022.3222059
IF: 5.4
2023-01-01
IEEE Journal of Solid-State Circuits
Abstract: Cloud AI acceleration has drawn great attention in recent years, as big models are becoming a popular trend in deep learning. Cloud AI runs high-efficiency inference as well as high-accuracy inference and training, which demands flexible floating-point (FP)/integer (INT) multiply–accumulation (MAC) support. Many computing-in-memory (CIM) processors have been proposed for efficient AI acceleration, but they usually rely on analog CIM techniques that are only suitable for high-efficiency neural network (NN) inference with low-precision INT MAC support. Since cloud AI demands high efficiency, high accuracy, and high flexibility simultaneously, we propose a reconfigurable digital CIM (ReDCIM) architecture that meets all three requirements. ReDCIM, the first CIM-based cloud AI processor, constructs a unified FP/INT pipeline based on exponent pre-alignment and reconfigurable in-memory accumulation. Bitwise in-memory Booth multiplication is proposed to reduce computation on CIM. The fabricated ReDCIM chip achieves a state-of-the-art energy efficiency of 29.2 TFLOPS/W at BF16 and 36.5 TOPS/W at INT8.
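The abstract names exponent pre-alignment as the basis of the unified FP/INT pipeline but does not spell it out. The following is a minimal software sketch of the general idea only, not the paper's circuit: each group of FP values is shifted to share the group's largest exponent, so the multiply-accumulate itself runs on plain integers and a single exponent is reapplied at the end. The function names `pre_align` and `aligned_dot` and the `mant_bits` parameter are hypothetical, chosen for illustration.

```python
import math

def pre_align(values, mant_bits=8):
    """Map floats to signed integer mantissas that share one exponent.

    The shared exponent is the largest binary exponent in the group
    (as reported by math.frexp). Smaller values lose low-order bits,
    which is the precision cost of alignment.
    """
    exps = [math.frexp(v)[1] for v in values if v != 0.0]
    shared_exp = max(exps) if exps else 0
    # Scale every value by 2^(mant_bits - shared_exp) and round to an integer.
    mants = [round(math.ldexp(v, mant_bits - shared_exp)) for v in values]
    return mants, shared_exp

def aligned_dot(xs, ws, mant_bits=8):
    """Dot product computed with an integer-only MAC after pre-alignment."""
    mx, ex = pre_align(xs, mant_bits)
    mw, ew = pre_align(ws, mant_bits)
    acc = sum(a * b for a, b in zip(mx, mw))  # integer multiply-accumulate
    # Undo both scalings: each operand group was scaled by 2^(mant_bits - exp).
    return math.ldexp(acc, ex + ew - 2 * mant_bits)

if __name__ == "__main__":
    print(aligned_dot([1.5, -0.25, 3.0], [2.0, 4.0, 0.5]))
```

In this sketch the same integer accumulation path serves both cases: INT inputs skip `pre_align`, while FP inputs pass through it first. Whether and how ReDCIM groups operands for alignment is not stated in the abstract, so the per-vector grouping here is an assumption.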