Abstract: Realizing today's cloud-level artificial intelligence functionalities directly on devices distributed at the edge of the internet calls for edge hardware capable of processing multiple modalities of sensory data (e.g. video, audio) with unprecedented energy efficiency. Today's AI hardware architectures cannot meet this demand because of a fundamental "memory wall": moving data between separate compute and memory units consumes substantial energy and incurs long latency. Compute-in-memory (CIM) architectures based on resistive random-access memory (RRAM) promise orders-of-magnitude improvements in energy efficiency by performing computation directly within memory. However, conventional approaches to CIM hardware design limit the functional flexibility needed to process diverse AI workloads, and they must overcome hardware imperfections that degrade inference accuracy. Such trade-offs between efficiency, versatility and accuracy cannot be addressed by isolated improvements at any single level of the design. By co-optimizing across all hierarchies of the design, from algorithms and architecture to circuits and devices, we present NeuRRAM, the first multimodal edge AI chip using RRAM CIM to simultaneously deliver a high degree of versatility for diverse model architectures, record energy efficiency $5\times$ to $8\times$ better than prior art across various computational bit-precisions, and inference accuracy comparable to software models with 4-bit weights on all measured standard AI benchmarks, including 99.0% accuracy on MNIST and 85.7% on CIFAR-10 image classification, 84.7% accuracy on Google speech command recognition, and a 70% reduction in image reconstruction error on a Bayesian image recovery task. This work paves a way towards building highly efficient and reconfigurable edge AI hardware platforms for the more demanding and heterogeneous AI applications of the future.

Brain-Inspired Recognition System Based on Multimodal In-Memory Computing Framework for Edge AI

A Brain-Inspired In-Memory Computing System for Neuronal Communication Via Memristive Circuits

Neural Network Acceleration and Voice Recognition with a Flash-based In-Memory Computing SoC

A Brain-Inspired Hierarchical Interactive In-Memory Computing System and Its Application in Video Sentiment Analysis

Pure-Attention-Based Multifunction Memristive Neuromorphic Circuit and System

Edge AI without Compromise: Efficient, Versatile and Accurate Neurocomputing in Resistive Random-Access Memory

A 3D MCAM Architecture Based on Flash Memory Enabling Binary Neural Network Computing for Edge AI

Full-system-integrated Neuro-Inspired Memristor Chips for Edge Intelligence

CMN: a co-designed neural architecture search for efficient computing-in-memory-based mixture-of-experts

Dynamic neural network with memristive CIM and CAM for 2D and 3D vision

Semantic memory-based dynamic neural network using memristive ternary CIM and CAM for 2D and 3D vision

Recent Progress on Memristive Convolutional Neural Networks for Edge Intelligence

Emotion recognition based on brain-like multimodal hierarchical perception

Memory-Oriented Design-Space Exploration of Edge-AI Hardware for XR Applications

Memristor-Based Edge Computing of Blaze Block for Image Recognition

MLFlash-CIM: Embedded Multi-Level NOR-Flash Cell based Computing in Memory Architecture for Edge AI Devices

Edge learning using a fully integrated neuro-inspired memristor chip

BRIEDGE: EEG-Adaptive Edge AI for Multi-Brain to Multi-Robot Interaction

Computing in-memory with cascaded spintronic devices for AI edge

Memory-centric neuromorphic computing for unstructured data processing

Memristor-Based Artificial Chips