Efficient Reinforcement Learning On Passive RRAM Crossbar Array

Arjun Tyagi, Shubham Sahay

2024-07-11

Abstract: The unprecedented growth in the field of machine learning has led to the development of deep neuromorphic networks trained on labelled datasets with the capability to mimic or even exceed human capabilities. However, for applications involving continuous decision making in unknown environments, such as rovers for space exploration, robots, unmanned aerial vehicles, etc., explicit supervision and generation of labelled datasets is extremely difficult and expensive. Reinforcement learning (RL) allows agents to take decisions without any (human/external) supervision or training on labelled datasets. However, the conventional implementations of RL on advanced digital CPUs/GPUs incur significant power dissipation owing to their inherent von Neumann architecture. Although crossbar arrays of emerging non-volatile memories such as resistive (R)RAMs, with their innate capability to perform energy-efficient in situ multiply-accumulate operations, appear promising for Q-learning-based RL implementations, their limited endurance restricts their application in practical RL systems with an overwhelming number of weight updates. To address this issue and realize the true potential of RRAM-based RL implementations, in this work, for the first time, we perform an algorithm-hardware co-design and propose a novel implementation of the Monte Carlo (MC) RL algorithm on a passive RRAM crossbar array. We analyse the performance of the proposed MC RL implementation on the classical cart-pole problem and demonstrate that it not only outperforms the prior digital and active 1-Transistor-1-RRAM (1T1R)-based implementations by more than five orders of magnitude in terms of area but is also robust against the spatial and temporal variations and endurance failure of RRAMs.
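The in situ multiply-accumulate operation mentioned in the abstract can be illustrated with an idealised model: each RRAM cell contributes a current equal to its conductance times the row voltage (Ohm's law), and the currents add up along each column (Kirchhoff's current law). The sketch below is a minimal illustration under that assumption only. It ignores sneak-path currents, wire resistance and device variations, and the function name `crossbar_mac` and all numeric values are hypothetical, not taken from the paper.

```python
def crossbar_mac(conductances, voltages):
    """Ideal crossbar read: column current I_j = sum_i G[i][j] * V[i].

    conductances: list of rows, each a list of cell conductances (siemens).
    voltages: list of row input voltages (volts), one per row.
    Assumption: ideal array, no sneak paths or wire resistance.
    """
    if len(conductances) != len(voltages):
        raise ValueError("one input voltage is needed per crossbar row")
    n_cols = len(conductances[0])
    currents = [0.0] * n_cols
    for row, v in zip(conductances, voltages):
        for j, g in enumerate(row):
            currents[j] += g * v  # Ohm's law per cell, summed along the column
    return currents


# Hypothetical 2x2 array: conductances in siemens, inputs in volts.
G = [[1e-6, 2e-6],
     [3e-6, 4e-6]]
V = [0.5, 0.25]
I = crossbar_mac(G, V)  # column currents in amperes
```

In this model the stored conductances play the role of the weights, so the whole weighted sum is produced in a single read step rather than by moving weights to a separate processor.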

Emerging Technologies

What problem does this paper attempt to address?

The paper aims to address the following issues:

1. **Reducing energy consumption in Reinforcement Learning (RL) hardware implementations**: Conventional RL implementations on advanced digital CPUs/GPUs dissipate a large amount of power because of their von Neumann architecture. The paper proposes implementing the Monte Carlo (MC) RL algorithm on a passive RRAM crossbar array to reduce energy consumption.
2. **Improving the hardware friendliness of RL algorithms**: Most previous hardware implementations have focused on neural network-based RL algorithms (such as deep Q-learning), which require frequent weight updates and therefore strain the limited endurance of the memory devices. In contrast, MC learning updates weights only at the end of each "episode," significantly reducing the number of weight updates and thus easing the endurance burden on the memory devices.
3. **Exploring the potential of passive RRAM crossbar arrays**: Compared to active 1T1R crossbar arrays, passive RRAM crossbar arrays have a smaller area overhead but face issues such as sneak path currents. The paper demonstrates the potential of passive RRAM crossbar arrays for implementing RL algorithms by optimizing the stack design and evaluating performance on the classic cart-pole problem.
4. **Achieving efficient and robust RL systems**: Through algorithm-hardware co-design, the paper proposes a new hardware implementation of the MC RL algorithm that not only reduces the area by more than five orders of magnitude compared to previous implementations but is also robust to spatial and temporal variations and endurance failures of RRAM devices.
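The point that MC learning writes its estimates only once an episode has finished can be sketched with a small tabular example. This is a generic first-visit Monte Carlo update, not the paper's hardware implementation: the dictionary `q` merely stands in for the stored weights, and the names `discounted_returns` and `mc_episode_update`, the step size `alpha`, and the toy episode are all assumptions made for illustration.

```python
def discounted_returns(rewards, gamma):
    """Return G_t = r_t + gamma * G_{t+1} for every step, computed backwards."""
    returns = [0.0] * len(rewards)
    g = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        g = rewards[t] + gamma * g
        returns[t] = g
    return returns


def mc_episode_update(q, episode, gamma=1.0, alpha=0.5):
    """First-visit Monte Carlo update, applied only after the episode ends.

    q: dict mapping (state, action) -> value estimate (stand-in for weights).
    episode: list of (state, action, reward) tuples.
    Returns the number of entries written.
    """
    rewards = [r for _, _, r in episode]
    returns = discounted_returns(rewards, gamma)
    first_visit = {}
    for (s, a, _), g in zip(episode, returns):
        first_visit.setdefault((s, a), g)  # keep the return of the first visit
    for key, g in first_visit.items():
        old = q.get(key, 0.0)
        q[key] = old + alpha * (g - old)  # one write per distinct pair
    return len(first_visit)


# Hypothetical 4-step episode with a reward of 1 per step (cart-pole style).
episode = [("s0", 0, 1.0), ("s1", 1, 1.0), ("s0", 0, 1.0), ("s2", 1, 1.0)]
q_table = {}
writes = mc_episode_update(q_table, episode)
```

In this toy episode the table is written three times, all after the episode has ended, whereas a method that updates after every step would write four times during it. The gap grows with episode length whenever state-action pairs repeat.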
