A 28nm 314.6TFLOPS/W Reconfigurable Floating-Point Analog Compute-In-Memory Macro with Exponent Approximation and Two-Stage Sharing TD-ADC

Pengyu He, Yuanzhe Zhao, Heng Xie, Yang Wang, Shouyi Yin, Li, Yan Zhu, R. P. Martins, Chi-Hang Chan, Minglei Zhang
DOI: https://doi.org/10.1109/cicc60959.2024.10529073
2024-01-01
Abstract: SRAM-based compute-in-memory (CIM) [1]–[7] exhibits outstanding energy efficiency in fast-growing AI applications. Analog CIM [4]–[5] improves energy efficiency further by reducing the MAC signal swing. Floating-point (FP) inference achieves higher accuracy than integer (INT) inference at the same bit width because its exponent extends the dynamic range, and it also demonstrates better energy efficiency than INT in CIM configurations, making it a promising technique for complex AI tasks. However, the extra power spent on alignment operations for the exponent calculation, together with the considerable MAC power in FP-CIMs [7], remains a performance bottleneck. Given the energy-efficiency advantages of analog CIMs, implementing a fully analog FP-CIM is promising; however, it faces three challenges: 1) exponent preprocessing is usually implemented with digital circuits, and analog circuits that support the exponent calculation are lacking; 2) exponents do not lend themselves to bit-wise split computation in FP-CIMs, making it challenging to support multiple FP formats with finite hardware; 3) ADCs often dominate the power consumption in analog CIMs, while their poor parallelization with the MACs reduces throughput.
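To make the alignment overhead mentioned above concrete, the following is a minimal Python sketch of conventional max-exponent pre-alignment in an FP dot product. It is not the exponent approximation proposed in this paper; the function name `aligned_fp_dot`, the `mant_bits` parameter, and the use of integer weights are assumptions chosen purely for illustration.

```python
import math

def aligned_fp_dot(xs, ws, mant_bits=8):
    """Illustrative dot product with max-exponent pre-alignment.

    xs: floating-point inputs; ws: integer weights (an assumption of this sketch).
    """
    # Split each input into a mantissa in [0.5, 1) and an integer exponent.
    parts = [math.frexp(x) for x in xs]
    # Alignment target: the largest exponent among the nonzero inputs.
    e_max = max((e for m, e in parts if m != 0.0), default=0)

    acc = 0
    for (m, e), w in zip(parts, ws):
        if m == 0.0:
            continue  # zero inputs contribute nothing
        # Quantize the mantissa magnitude to an integer.
        q = int(round(abs(m) * (1 << mant_bits)))
        # Right-shift by the exponent gap; low-order bits are truncated.
        q >>= (e_max - e)
        acc += (q if m > 0 else -q) * w  # integer MAC on aligned mantissas
    # Restore the shared scale once, after accumulation.
    return acc * 2.0 ** (e_max - mant_bits)

print(aligned_fp_dot([1.0, 0.5, 0.25], [1, 2, 4]))
print(aligned_fp_dot([1024.0, 2.0 ** -10], [1, 1]))
```

In this sketch every input requires an exponent comparison and a shift before the MAC can start, which is the kind of per-operand preprocessing cost the abstract attributes to alignment. The second call also shows the accuracy side effect: an input far below the maximum exponent is shifted out entirely.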