A 6.54-to-26.03 TOPS/W Computing-In-Memory RNN Processor Using Input Similarity Optimization and Attention-based Context-breaking with Output Speculation

Ruiqi Guo, Hao Li, Ruhui Liu, Zhixiao Zhang, Limei Tang, Hao Sun, Leibo Liu, Meng-Fan Chang, Shaojun Wei, Shouyi Yin
DOI: https://doi.org/10.23919/vlsicircuits52068.2021.9492492
2021-01-01
Abstract: This work presents a 65nm RNN processor with computing-in-memory (CIM) macros. The main contributions include: 1) a similarity analyzer (SimAyz) that fully leverages the temporal stability of input sequences, yielding a 1.52× performance speedup; 2) an attention-based context-breaking (AttenBrk) method with output speculation that reduces off-chip data accesses by up to 30.3%; 3) a double-buffering scheme for CIM macros to hide write latency, and a pipelined processing element (PE) array to increase system throughput. Measured results show that the chip achieves an energy efficiency of 6.54 to 26.03 TOPS/W across various LSTM benchmarks.
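The abstract does not describe how SimAyz works internally. As a rough illustration of the general idea of exploiting temporal stability in an input sequence, the Python sketch below reuses the previous matrix-vector product and updates it only for input elements that changed between time steps. This is a generic delta-update technique, not the paper's circuit; the names `matvec`, `delta_matvec`, and `threshold` are hypothetical and chosen here for illustration only.

```python
def matvec(W, x):
    """Plain matrix-vector product, used as the reference result."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def delta_matvec(W, x_prev, y_prev, x_curr, threshold=0.0):
    """Update y = W @ x incrementally from the previous time step.

    Only input elements whose change exceeds `threshold` contribute
    work. With threshold == 0 the result is exact; a larger threshold
    (an assumption of this sketch) trades accuracy for fewer updates.
    """
    y = list(y_prev)
    updated = 0  # number of input columns that actually needed work
    for j, (old, new) in enumerate(zip(x_prev, x_curr)):
        d = new - old
        if abs(d) <= threshold:
            continue  # similar input: reuse the previous contribution
        updated += 1
        for i, row in enumerate(W):
            y[i] += row[j] * d
    return y, updated


# Two consecutive inputs that differ in a single element.
W = [[1, 2, 3], [4, 5, 6]]
x0 = [1, 0, 2]
x1 = [1, 0, 3]

y0 = matvec(W, x0)
y1, updated = delta_matvec(W, x0, y0, x1)
```

Here only one of the three input columns is recomputed, and `y1` matches the full product `matvec(W, x1)`. The more similar consecutive inputs are, the less work each step needs, which is the kind of saving the abstract attributes to input similarity.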