24.5 A Twin-8T SRAM Computation-In-Memory Macro for Multiple-Bit CNN-Based Machine Learning

Xin Si, Jia-Jing Chen, Yung-Ning Tu, Wei-Hsing Huang, Jing-Hong Wang, Yen-Cheng Chiu, Wei-Chen Wei, Ssu-Yen Wu, Xiaoyu Sun, Rui Liu, Shimeng Yu, Ren-Shuo Liu, Chih-Cheng Hsieh, Kea-Tiong Tang, Qiang Li, Meng-Fan Chang
DOI: https://doi.org/10.1109/isscc.2019.8662392
2019-02-01
Abstract: Computation-in-memory (CIM) is a promising avenue to improve the energy efficiency of multiply-and-accumulate (MAC) operations in AI chips. Multi-bit CNNs are required for high inference accuracy in many applications [1–5]. SRAM-based CIM faces several challenges and tradeoffs: (1) tradeoffs between signal margin, cell stability, and area overhead; (2) process variation in the high-weighted bits dominates the end-result error rate; (3) tradeoffs between input bandwidth, speed, and area. Previous SRAM CIM macros were limited to binary MAC operations for fully connected networks [1], or they used CIM for multiplication [2] or weight-combination operations [3] with additional large-area near-memory computing (NMC) logic for summation or MAC operations.
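To make challenge (2) concrete, the following is a minimal arithmetic sketch, not the macro's circuit. It assumes unsigned integer weights split into bit planes, which is only one common way to view a multi-bit MAC. The function names (`mac`, `bitplane_mac`, `error_from_partial_sum_offset`) are hypothetical and do not come from the paper.

```python
def mac(inputs, weights):
    """Reference multiply-and-accumulate: sum of input * weight."""
    return sum(x * w for x, w in zip(inputs, weights))


def bitplane_mac(inputs, weights, weight_bits):
    """Same MAC, computed one weight bit plane at a time.

    Assumption: weights are unsigned integers below 2**weight_bits.
    """
    total = 0
    for bit in range(weight_bits):
        # 1-bit slice of every weight at this bit position.
        plane = [(w >> bit) & 1 for w in weights]
        # Binary-weight partial sum for this plane.
        partial = sum(x * p for x, p in zip(inputs, plane))
        # The plane's contribution is scaled by its bit weight 2**bit.
        total += partial << bit
    return total


def error_from_partial_sum_offset(bit, offset=1):
    """End-result error caused by an offset in one plane's partial sum."""
    return offset << bit


if __name__ == "__main__":
    inputs = [1, 2, 3, 0]
    weights = [5, 3, 7, 1]  # fit in 3 bits
    print(mac(inputs, weights), bitplane_mac(inputs, weights, 3))
    for bit in range(3):
        print(bit, error_from_partial_sum_offset(bit))
```

In this model, a one-unit error in the partial sum for bit position `b` shifts the final result by `2**b`, so the same disturbance costs four times as much at bit 2 as at bit 0. This is the sense in which variation on the high-weighted bits dominates the end-result error.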