Long Short-Term Memory Implementation Exploiting Passive RRAM Crossbar Array

Honey Nikam, Siddharth Satyam, Shubham Sahay

DOI: https://doi.org/10.1109/TED.2021.3133197

2021-11-08

Abstract: The ever-increasing demand to extract temporal correlations across sequential data and perform context-based learning in this era of big data has led to the development of long short-term memory (LSTM) networks. Furthermore, there is an urgent need to perform these time-series data-dependent applications including speech/video processing and recognition, language modelling and translation, etc. on compact internet-of-things (IoT) edge devices with limited energy. To this end, in this work, for the first time, we propose an extremely area- and energy-efficient LSTM network implementation exploiting the passive resistive random access memory (RRAM) crossbar array. We developed a hardware-aware LSTM network simulation framework and performed an extensive analysis of the proposed LSTM implementation considering the non-ideal hardware artifacts such as spatial (device-to-device) and temporal variations, non-linearity, noise, etc. utilizing an experimentally calibrated comprehensive phenomenological model for passive RRAM crossbar array. Our results indicate that the proposed passive RRAM crossbar-based LSTM network implementation not only outperforms the prior digital and active 1T-1R crossbar-based LSTM implementations by more than three orders of magnitude in terms of area and two orders of magnitude in terms of training energy for identical network accuracy, but also exhibits robustness against spatial and temporal variations and noise, and a faster convergence rate. Our work may provide the incentive for experimental realization of LSTM networks on passive RRAM crossbar arrays.

Emerging Technologies

What problem does this paper attempt to address?

This paper addresses the area and energy challenges of implementing Long Short-Term Memory (LSTM) networks on Internet of Things (IoT) edge devices. Specifically, it proposes implementing LSTM networks on passive Resistive Random-Access Memory (RRAM) crossbar arrays to overcome the high energy consumption and large inference latency of LSTM implementations on traditional digital platforms such as general-purpose CPUs, GPUs, or FPGAs. With this approach, the authors aim to significantly reduce the area and training energy of LSTM networks while maintaining network accuracy, and to improve robustness to spatial and temporal variations and noise.

The key issues mentioned in the paper include:

- **High energy consumption and large inference latency**: When LSTM networks are implemented on traditional digital platforms, frequent data exchanges cause high energy consumption and large inference latency.
- **Area efficiency**: Traditional active 1T-1R crossbar arrays incur a large area overhead when implementing LSTM networks.
- **Robustness**: LSTM networks need to maintain good performance in the presence of hardware non-idealities such as spatial and temporal variations, non-linearity, and noise.

To solve these problems, the paper proposes an LSTM implementation based on passive RRAM crossbar arrays. For identical network accuracy, this implementation reduces area by more than three orders of magnitude and training energy by two orders of magnitude compared with prior digital and active 1T-1R crossbar-based implementations, and it also shows robustness to hardware non-idealities and a faster convergence rate. In addition, the paper explores further increasing the memory density of passive RRAM crossbar arrays through 3D integration to accommodate the large number of parameters that LSTM networks require in practical applications.
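To make the idea of an LSTM running on a crossbar more concrete, the following is a minimal sketch of how one LSTM time step could be evaluated when the gate weights are stored as device conductances. It is not the authors' simulation framework or their calibrated phenomenological model. The conductance range (`G_MIN`, `G_MAX`), the differential-pair weight mapping, the multiplicative Gaussian device-to-device variation controlled by `sigma`, and all function names are assumptions chosen for illustration. Only the standard LSTM gate equations are taken as given.

```python
import math
import random

# Assumed conductance window in siemens; illustrative values, not from the paper.
G_MIN, G_MAX = 1e-6, 1e-4

def weights_to_conductances(weights, w_max, sigma=0.0, rng=None):
    """Map each signed weight to a (G_plus, G_minus) differential pair.

    sigma adds multiplicative Gaussian device-to-device variation
    (an assumed, simplified non-ideality model).
    Each row of `weights` holds the weights feeding one output.
    """
    rng = rng or random.Random(0)
    scale = (G_MAX - G_MIN) / w_max  # siemens per unit weight
    pairs = []
    for row in weights:
        pair_row = []
        for w in row:
            g_plus = G_MIN + scale * max(w, 0.0)
            g_minus = G_MIN + scale * max(-w, 0.0)
            if sigma > 0.0:
                g_plus *= 1.0 + rng.gauss(0.0, sigma)
                g_minus *= 1.0 + rng.gauss(0.0, sigma)
            pair_row.append((g_plus, g_minus))
        pairs.append(pair_row)
    return pairs, scale

def crossbar_mvm(pairs, scale, voltages):
    """Sum differential currents per output and rescale to weight units."""
    out = []
    for row in pairs:
        current = sum(v * (gp - gm) for v, (gp, gm) in zip(voltages, row))
        out.append(current / scale)
    return out

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(gates, x, h_prev, c_prev):
    """One LSTM step; gates maps 'i', 'f', 'o', 'g' to (pairs, scale)."""
    v = list(x) + list(h_prev) + [1.0]  # trailing 1.0 drives the bias weights
    pre = {k: crossbar_mvm(p, s, v) for k, (p, s) in gates.items()}
    i = [sigmoid(a) for a in pre["i"]]
    f = [sigmoid(a) for a in pre["f"]]
    o = [sigmoid(a) for a in pre["o"]]
    g = [math.tanh(a) for a in pre["g"]]
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c_prev, i, g)]
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h, c

# Usage: compare an ideal array with one that has 5% device-to-device variation.
W = [[0.5, -0.25], [1.0, 0.0]]
ideal = crossbar_mvm(*weights_to_conductances(W, w_max=1.0), [1.0, 2.0])
noisy = crossbar_mvm(*weights_to_conductances(W, w_max=1.0, sigma=0.05), [1.0, 2.0])
```

With `sigma=0.0` the differential pair reproduces the ordinary matrix-vector product, so any gap between `ideal` and `noisy` comes only from the injected variation. A study of the kind the paper describes would replace this toy noise model with one calibrated against measured devices.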

ERA-LSTM: An Efficient ReRAM-Based Architecture for Long Short-Term Memory

Long short-term memory networks in memristor crossbar arrays

Long short-term memory networks in memristor crossbars

Efficient Reinforcement Learning On Passive RRAM Crossbar Array

A Compact and Configurable Long Short-Term Memory Neural Network Hardware Architecture

A 3.89-GOPS/mW Scalable Recurrent Neural Network Processor with Improved Efficiency on Memory and Computation

Accelerating Recurrent Neural Networks: A Memory-Efficient Approach

TIME: A Training-in-Memory Architecture for RRAM-Based Deep Neural Networks

SPARE: Spiking Networks Acceleration Using CMOS ROM-Embedded RAM as an In-Memory-Computation Primitive

E-LSTM: an Efficient Hardware Architecture for Long Short-Term Memory

RRAM-based Analog-Weight Spiking Neural Network Accelerator with In-Situ Learning for IoT Applications

SNrram: an Efficient Sparse Neural Network Computation Architecture Based on Resistive Random-Access Memory

Low Bit-Width Convolutional Neural Network on RRAM

Technological Exploration of RRAM Crossbar Array for Matrix-Vector Multiplication

RNC: Efficient RRAM-aware NAS and Compilation for DNNs on Resource-Constrained Edge Devices

Design of CMOS-memristor Circuits for LSTM architecture

A 3D Multi-Layer CMOS-RRAM Accelerator for Neural Network

Twofold Sparsity: Joint Bit- and Network-Level Sparsity for Energy-Efficient Deep Neural Network Using RRAM Based Compute-In-Memory

Edge AI without Compromise: Efficient, Versatile and Accurate Neurocomputing in Resistive Random-Access Memory

ITT-RNA: Imperfection Tolerable Training for RRAM-Crossbar-Based Deep Neural-Network Accelerator