DASM: Data-Streaming-Based Computing in Nonvolatile Memory Architecture for Embedded System
Liang Chang, Xin Ma, Zhaohao Wang, Youguang Zhang, Yufei Ding, Weisheng Zhao, Yuan Xie
DOI: https://doi.org/10.1109/tvlsi.2019.2912941
2019-01-01
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Abstract: Emerging nonvolatile memories (NVMs), including resistive RAM (RRAM), phase-change memory (PCM), and magnetic RAM (MRAM), have opened up new pathways for computing-in-memory (CIM). These NVM technologies can perform energy-efficient computational operations with only minor modifications to the peripheral circuits. Despite the many advantages of computational NVMs, parallelism has not been sufficiently explored in such CIM designs. To overcome this limit on performance gain, we propose DASM, a data-streaming design for NVM-based CIM that leverages the underlying parallelism of the hardware. DASM benefits from the massive parallelism of data-streaming computing, the reduced data movement of CIM, and the nonvolatility of the memory arrays. Specifically, data-streaming operations can be implemented with CIM bitwise operations in both the read-out and write-in procedures. In addition, we apply multilevel power gating to the memory array and connections to further boost performance. Finally, we present a case study of the inference process of a quantized deep neural network on the DASM design. The DASM architecture achieves 47.8x, 5.1x, and 2.1x speedups over the NVIDIA Jetson TK1 embedded GPU board, the Intel Xeon E5-2640 CPU, and a state-of-the-art field-programmable gate array (FPGA) design, respectively, with much lower power consumption.
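To illustrate how quantized inference can reduce to the kind of bitwise operations the abstract mentions, here is a minimal software sketch of a bit-serial dot product built only from AND and population count. It is not taken from the paper: the function names `to_bit_planes` and `bitwise_dot`, the unsigned fixed-width encoding, and the mapping of one bit plane to one memory row are all assumptions made for illustration, and the abstract does not state which bitwise decomposition DASM actually uses.

```python
def to_bit_planes(values, bits):
    """Pack bit b of every element into one integer per bit plane.

    Each returned integer stands in for one memory row (an assumption of
    this sketch); bit i of the row holds bit b of values[i].
    """
    planes = []
    for b in range(bits):
        row = 0
        for i, v in enumerate(values):
            row |= ((v >> b) & 1) << i
        planes.append(row)
    return planes


def bitwise_dot(weights, activations, bits):
    """Dot product of two unsigned `bits`-wide vectors via AND + popcount."""
    w_planes = to_bit_planes(weights, bits)
    a_planes = to_bit_planes(activations, bits)
    total = 0
    for i, w_row in enumerate(w_planes):
        for j, a_row in enumerate(a_planes):
            # A row-wide AND processes every element in parallel; the
            # popcount sums the 1-bit products, and the shift restores
            # the weight 2**(i + j) of this pair of bit planes.
            total += bin(w_row & a_row).count("1") << (i + j)
    return total


if __name__ == "__main__":
    w = [3, 1, 2, 0]   # hypothetical 2-bit weights
    a = [1, 2, 3, 1]   # hypothetical 2-bit activations
    print(bitwise_dot(w, a, bits=2))
    print(sum(x * y for x, y in zip(w, a)))  # plain reference result
```

Each AND in the inner loop acts on a whole row at once, which is the row-level parallelism that an in-memory design would exploit; the number of such operations grows with the product of the two bit widths rather than with the vector length.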