Abstract: Emerging resistive random-access memory (RRAM) has been widely applied to accelerate deep neural network computation. However, high-precision computation on RRAM is difficult to achieve because of the limited number of resistance levels and the limited precision of the interfaces. Low bit-width convolutional neural networks (CNNs) offer a promising way to use low bit-width RRAM devices and low bit-width interfaces in RRAM-based computing systems (RCS). Open questions nevertheless remain: 1) how to split a weight matrix when a single crossbar is not large enough to hold all of its parameters; 2) how to design a pipeline that accelerates inference based on a line buffer structure; and 3) how to reduce the accuracy drop caused by parameter splitting and data quantization. In this paper, we propose an RRAM crossbar-based low bit-width CNN (LB-CNN) accelerator. We discuss the system design in detail, including the matrix splitting strategies that enhance scalability and the pipelined implementation based on line buffers that accelerates inference. In addition, we propose a splitting and quantizing while training method that incorporates the actual hardware constraints into training. In our experiments, low bit-width LeNet-5 on RRAM shows much better robustness to device variation than multibit models. The pipeline strategy achieves approximately 6.0x speedup in processing each image on ResNet-18. For low-bit VGG-8 on CIFAR-10, the proposed accelerator saves 54.9% of the energy consumption and 48.3% of the area compared with the multibit VGG-8 structure.
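To make the matrix splitting question concrete, the following is a minimal sketch, not the paper's actual scheme. It assumes an ideal crossbar that holds at most `max_rows` input rows, splits the weight matrix row-wise into tiles, and adds the partial sums exactly in the digital domain. The names `split_rows`, `crossbar_mvm`, and `split_mvm` are hypothetical, and the sketch ignores quantization of partial sums at the interfaces, which is one source of the accuracy drop the abstract mentions.

```python
def split_rows(weights, max_rows):
    """Split a weight matrix (list of rows) into tiles of at most max_rows rows."""
    return [weights[i:i + max_rows] for i in range(0, len(weights), max_rows)]


def crossbar_mvm(tile, inputs):
    """Ideal crossbar model: column j accumulates sum_i inputs[i] * tile[i][j]."""
    cols = len(tile[0])
    return [sum(inputs[i] * tile[i][j] for i in range(len(tile)))
            for j in range(cols)]


def split_mvm(weights, inputs, max_rows):
    """Matrix-vector product over several crossbars, merging partial sums."""
    out = [0] * len(weights[0])
    offset = 0
    for tile in split_rows(weights, max_rows):
        # Each tile only sees the slice of the input vector that drives its rows.
        partial = crossbar_mvm(tile, inputs[offset:offset + len(tile)])
        out = [a + b for a, b in zip(out, partial)]
        offset += len(tile)
    return out


# Binary (+1/-1) weights, 5 inputs x 2 outputs; assumed crossbar height of 2 rows.
weights = [[1, -1], [1, 1], [-1, 1], [1, 1], [-1, -1]]
inputs = [1, 0, 1, 1, 0]
result = split_mvm(weights, inputs, max_rows=2)
print(result)  # [1, 1]
```

With exact accumulation, the split result equals the result of one large crossbar. In real hardware each partial sum passes through a low bit-width interface before merging, so the two can differ, which is why accounting for the split during training is of interest.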

Low Bit-Width Convolutional Neural Network on RRAM

Training Low Bitwidth Convolutional Neural Network on RRAM

Binary Convolutional Neural Network on RRAM

An 8-Bit in Resistive Memory Computing Core with Regulated Passive Neuron and Bitline Weight Mapping

A Configurable Multi-Precision CNN Computing Framework Based on Single Bit RRAM

Convolutional Neural Networks Based on RRAM Devices for Image Recognition and Online Learning Tasks

Switched by Input: Power Efficient Structure for RRAM-Based Convolutional Neural Network

SNrram: an Efficient Sparse Neural Network Computation Architecture Based on Resistive Random-Access Memory

An On-chip Layer-wise Training Method for RRAM Based Computing-in-memory Chips

RRAM Based Convolutional Neural Networks for High Accuracy Pattern Recognition and Online Learning Tasks

Hardware Implementation of RRAM Based Binarized Neural Networks

An Energy-Efficient Mixed-Bit CNN Accelerator With Column Parallel Readout for ReRAM-Based In-Memory Computing

RRAM-DNN: an RRAM and Model-Compression Empowered All-Weights-On-Chip DNN Accelerator

A RRAM Based Max-Pooling Scheme for Convolutional Neural Network

AEPE: an Area and Power Efficient RRAM Crossbar-Based Accelerator for Deep CNNs

RRAM Based Buffer Design for Energy Efficient CNN Accelerator

High Area/Energy Efficiency RRAM CNN Accelerator with Pattern-Pruning-Based Weight Mapping Scheme

Neural Network-Hardware Co-design for Scalable RRAM-based BNN Accelerators

An ADC-less RRAM-based Computing-in-Memory Macro with Binary CNN for Efficient Edge AI

An Improved RRAM-Based Binarized Neural Network with High Variation-Tolerated Forward/Backward Propagation Module

A 1T2R1C ReRAM CIM Accelerator with Energy-Efficient Voltage Division and Capacitive Coupling for CNN Acceleration in AI Edge Applications