3D-aCortex: An Ultra-Compact Energy-Efficient Neurocomputing Platform Based on Commercial 3D-NAND Flash Memories

Mohammad Bavandpour, Shubham Sahay, Mohammad Reza Mahmoodi, Dmitri B. Strukov
DOI: https://doi.org/10.48550/arXiv.1908.02472
2019-08-07
Abstract: The first contribution of this paper is the development of extremely dense, energy-efficient mixed-signal vector-by-matrix-multiplication (VMM) circuits based on the existing 3D-NAND flash memory blocks, without any need for their modification. Such compatibility is achieved using time-domain-encoded VMM design. Our detailed simulations have shown that, for example, the 5-bit VMM of 200-element vectors, using the commercially available 64-layer gate-all-around macaroni-type 3D-NAND memory blocks designed in the 55-nm technology node, may provide an unprecedented area efficiency of 0.14 μm²/byte and energy efficiency of ~10 fJ/Op, including the input/output and other peripheral circuitry overheads. Our second major contribution is the development of 3D-aCortex, a multi-purpose neuromorphic inference processor that utilizes the proposed 3D-VMM blocks as its core processing units. We have performed rigorous performance simulations of such a processor on both circuit and system levels, taking into account non-idealities such as drain-induced barrier lowering, capacitive coupling, charge injection, parasitics, process variations, and noise. Our modeling of the 3D-aCortex performing several state-of-the-art neuromorphic-network benchmarks has shown that it may provide the record-breaking storage efficiency of 4.34 MB/mm², the peak energy efficiency of 70.43 TOps/J, and the computational throughput up to 10.66 TOps/s. The storage efficiency can be further improved seven-fold by aggressively sharing VMM peripheral circuits at the cost of a slight decrease in energy efficiency and throughput.
Emerging Technologies, Hardware Architecture, Neural and Evolutionary Computing
What problem does this paper attempt to address?
This paper addresses the problem of performing efficient, high-density vector-by-matrix multiplication (VMM) for neuromorphic computing. Specifically, the authors develop mixed-signal VMM circuits based on commercial 3D-NAND flash memory blocks, without any modification to those blocks. The design aims to improve the area efficiency and energy efficiency of computation while accounting for the overhead of input/output and other peripheral circuitry. The main contributions of the paper are:

1. **A high-density, low-power mixed-signal VMM circuit based on 3D-NAND flash memory**: Time-domain encoding makes the design compatible with existing 3D-NAND flash memory blocks. For example, for a 5-bit VMM of 200-element vectors using 64-layer 3D-NAND memory blocks designed in the 55-nm technology node, simulations indicate an area efficiency of 0.14 μm²/byte and an energy efficiency of about 10 fJ/Op.
2. **The 3D-aCortex multi-purpose neuromorphic inference processor**: This processor uses the 3D-VMM blocks described above as its core processing units. The authors simulated its performance in detail at both the circuit and system levels, taking into account non-idealities such as drain-induced barrier lowering, capacitive coupling, charge injection, parasitics, process variations, and noise. The simulation results indicate that 3D-aCortex can provide a storage efficiency of 4.34 MB/mm², a peak energy efficiency of 70.43 TOps/J, and a computational throughput of up to 10.66 TOps/s.

With these contributions, the paper targets the limitations of current digital VMM implementations, such as low storage density, low energy efficiency, and large data-transfer overhead, and proposes an efficient alternative for neuromorphic computing.
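The figures quoted in the abstract use several different units, so a few unit conversions help when comparing them. The following is a minimal sketch of that arithmetic, not anything taken from the paper. The function and constant names are hypothetical, and it assumes 1 MB = 10^6 bytes, which the source does not state. It converts the block-level 0.14 μm²/byte into MB/mm², converts between fJ/Op and TOps/J, and applies the quoted seven-fold storage improvement to the 4.34 MB/mm² system-level figure.

```python
# Figures quoted in the abstract; these are the only inputs taken from the source.
AREA_PER_BYTE_UM2 = 0.14     # um^2 per byte, 3D-VMM block
VMM_ENERGY_FJ_PER_OP = 10.0  # approximately 10 fJ per operation, 3D-VMM block
STORAGE_MB_PER_MM2 = 4.34    # 3D-aCortex storage efficiency
PEAK_TOPS_PER_J = 70.43      # 3D-aCortex peak energy efficiency
SHARING_GAIN = 7             # "seven-fold" gain from sharing VMM peripherals

UM2_PER_MM2 = 1e6            # 1 mm^2 = 10^6 um^2
BYTES_PER_MB = 1e6           # ASSUMPTION: decimal megabytes (not stated in the source)


def density_mb_per_mm2(area_per_byte_um2: float) -> float:
    """Convert an area cost (um^2/byte) into a storage density (MB/mm^2)."""
    bytes_per_mm2 = UM2_PER_MM2 / area_per_byte_um2
    return bytes_per_mm2 / BYTES_PER_MB


def fj_per_op_to_tops_per_j(fj_per_op: float) -> float:
    """1 fJ/Op equals 10^15 Ops/J, i.e. 1000 TOps/J, so the two are reciprocal."""
    return 1e3 / fj_per_op


def tops_per_j_to_fj_per_op(tops_per_j: float) -> float:
    """Inverse of fj_per_op_to_tops_per_j."""
    return 1e3 / tops_per_j


def shared_storage_mb_per_mm2(base_mb_per_mm2: float, gain: float) -> float:
    """Storage efficiency after the quoted multiplicative improvement."""
    return base_mb_per_mm2 * gain


if __name__ == "__main__":
    print(f"VMM block density:   {density_mb_per_mm2(AREA_PER_BYTE_UM2):.2f} MB/mm^2")
    print(f"VMM block energy:    {fj_per_op_to_tops_per_j(VMM_ENERGY_FJ_PER_OP):.1f} TOps/J")
    print(f"System peak energy:  {tops_per_j_to_fj_per_op(PEAK_TOPS_PER_J):.2f} fJ/Op")
    print(f"Storage, 7x sharing: {shared_storage_mb_per_mm2(STORAGE_MB_PER_MM2, SHARING_GAIN):.2f} MB/mm^2")
```

Read this way, the block-level and system-level figures sit on the same scale: the system-level storage and energy numbers are lower than those of the raw VMM block. The sketch only restates the quoted numbers in other units; it does not model where the difference comes from.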