Abstract: The increasing scale of neural networks needed to support more complex applications has led to a growing requirement for area- and energy-efficient hardware. One route to meeting the budget for these applications is to circumvent the von Neumann bottleneck by performing computation in or near memory. An unavoidable consequence of transferring neural networks onto hardware is that non-idealities, such as device-to-device variations or poor device yield, degrade performance. Methods such as hardware-aware training, in which substrate non-idealities are incorporated during network training, can recover performance at the cost of solution generality. In this work, we demonstrate inference on hardware neural networks consisting of 20,000 magnetic tunnel junction arrays integrated on complementary metal-oxide-semiconductor chips that closely resemble market-ready spin-transfer torque magnetoresistive random access memory technology. Using 36 dies, each containing a crossbar array with its own non-idealities, we show that even a small number of defects in physically mapped networks significantly degrades the performance of networks trained without defects. We also show that, at the cost of generality, hardware-aware training that accounts for the specific defects on each die can recover performance comparable to that of ideal networks. We then demonstrate a robust training method that extends hardware-aware training to statistics-aware training, producing network weights that perform well on most defective dies regardless of their specific defect locations. When evaluated on the 36 physical dies, statistics-aware trained solutions achieve a mean misclassification error on the MNIST dataset that differs from the software baseline by only 2 %. This statistics-aware training method could be generalized to networks with many layers that are mapped to hardware suited for industry-ready applications.
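
The idea of statistics-aware training can be illustrated with a minimal sketch. It is not the procedure used in this work, whose details the abstract does not give. It assumes that a defect is a device stuck at a fixed weight value, that defect locations are drawn independently at a fixed rate on every training step, and that the network is a single-layer softmax classifier. All function names and parameters (`sample_defect_mask`, `apply_defects`, `defect_rate`, `stuck_value`) are hypothetical.

```python
import numpy as np


def sample_defect_mask(shape, defect_rate, rng):
    """Return a boolean mask; True marks a (hypothetical) defective device."""
    return rng.random(shape) < defect_rate


def apply_defects(weights, mask, stuck_value=0.0):
    """Replace the weights at defective positions with an assumed stuck value."""
    return np.where(mask, stuck_value, weights)


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def train_statistics_aware(x, y, n_classes, defect_rate=0.05,
                           epochs=200, lr=0.5, seed=0):
    """Train one weight matrix against freshly sampled defect patterns."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=(x.shape[1], n_classes))
    onehot = np.eye(n_classes)[y]
    for _ in range(epochs):
        # A new defect pattern each step, so the weights cannot adapt
        # to any single set of defect locations.
        mask = sample_defect_mask(w.shape, defect_rate, rng)
        probs = softmax(x @ apply_defects(w, mask))
        grad = x.T @ (probs - onehot) / len(x)
        # Stuck devices do not influence the output, so they get no gradient.
        grad = np.where(mask, 0.0, grad)
        w -= lr * grad
    return w


def error_rate(x, y, w):
    """Fraction of misclassified samples for weight matrix w."""
    return float(np.mean(np.argmax(x @ w, axis=1) != y))
```

Under these assumptions, hardware-aware training for one specific die corresponds to holding `mask` fixed at that die's measured defect map instead of resampling it. A trained `w` can be checked against many sampled dies by calling `error_rate(x, y, apply_defects(w, mask))` for a set of independently drawn masks.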

Measurement-driven neural-network training for integrated magnetic tunnel junction arrays
