Mitigating Asymmetric Nonlinear Weight Update Effects in Hardware Neural Network based on Analog Resistive Synapse

Chih-Cheng Chang, Pin-Chun Chen, Teyuh Chou, I-Ting Wang, Boris Hudec, Che-Chia Chang, Chia-Ming Tsai, Tian-Sheuan Chang, Tuo-Hung Hou
DOI: https://doi.org/10.1109/JETCAS.2017.2771529
2017-12-16
Abstract: Asymmetric nonlinear weight update is considered as one of the major obstacles for realizing hardware neural networks based on analog resistive synapses because it significantly compromises the online training capability. This paper provides new solutions to this critical issue through co-optimization with the hardware-applicable deep-learning algorithms. New insights on engineering activation functions and a threshold weight update scheme effectively suppress the undesirable training noise induced by inaccurate weight update. We successfully trained a two-layer perceptron network online and improved the classification accuracy of MNIST handwritten digit dataset to 87.8/94.8% by using 6-bit/8-bit analog synapses, respectively, with extremely high asymmetric nonlinearity.
Machine Learning, Emerging Technologies, Neural and Evolutionary Computing
What problem does this paper attempt to address?
The core problem this paper addresses is **the impact of asymmetric nonlinear weight update on the online training capability of hardware neural networks (HNNs) based on analog resistive synapses**. Inaccurate weight updates weaken online training and reduce the learning accuracy the network can reach.

### Main problems and solutions

1. **Problem description**:
   - **Asymmetric nonlinear weight update**: RRAM (resistive random-access memory) synapses exhibit asymmetric nonlinear behavior during weight update: their potentiation and depression characteristics differ. This asymmetry causes large errors in the applied weight change and degrades learning.
   - **Training noise**: Because weight updates are inaccurate, they inject unwanted noise into training, which further harms convergence and accuracy.

2. **Solutions**:
   - **Hardware-algorithm co-optimization**: The paper co-optimizes the hardware with hardware-applicable deep-learning algorithms, proposing engineered activation functions and a threshold weight update scheme that suppress the training noise caused by inaccurate weight updates.
   - **Engineered activation functions**: Activation functions such as ReLU and a shifted sigmoid increase the sparsity of the hidden layer and reduce noise interference in weight updates.
   - **Threshold weight update scheme**: A threshold function introduces extremely high sparsity in the back-propagation path, suppressing noise in weight updates without affecting the accuracy of forward propagation.
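The interaction between asymmetric nonlinear updates and thresholding can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the saturating-exponential device model, the nonlinearity factor `nu`, the reading of a 6-bit synapse as 2**6 = 64 levels, and the helper names `pulse`, `threshold_update`, and `run_alternating`. The threshold rule shown here (skip any update request whose magnitude is below a threshold) is one plausible reading of the scheme, not necessarily the authors' exact formulation.

```python
import math

def pulse(w, potentiate, nu=5.0, n_states=64):
    """Apply one programming pulse to a normalized weight w in [0, 1].

    Hypothetical device model (not from the paper): a saturating-exponential
    conductance curve with nonlinearity factor `nu`, mirrored for depression.
    `n_states=64` assumes a 6-bit synapse means 2**6 levels.
    """
    a = 1.0 / (1.0 - math.exp(-nu))           # curve asymptote, slightly above 1
    k = 1.0 - math.exp(-nu / (n_states - 1))  # fraction of remaining gap per pulse
    if potentiate:
        w += (a - w) * k           # large steps near w = 0, small near w = 1
    else:
        w -= (a - (1.0 - w)) * k   # large steps near w = 1, small near w = 0
    return min(1.0, max(0.0, w))

def threshold_update(w, requested_dw, threshold):
    """Fire one pulse in the direction of requested_dw, but only if its
    magnitude reaches `threshold`; smaller requests are skipped."""
    if abs(requested_dw) < threshold:
        return w
    return pulse(w, potentiate=requested_dw > 0)

def run_alternating(w0, threshold, pairs=100, dw=0.001):
    """Feed zero-mean requests (+dw, -dw, ...) and return the final weight."""
    w = w0
    for _ in range(pairs):
        w = threshold_update(w, +dw, threshold)
        w = threshold_update(w, -dw, threshold)
    return w

drifted = run_alternating(0.8, threshold=0.0)  # every request fires a pulse
held = run_alternating(0.8, threshold=0.01)    # small requests are ignored
print(drifted, held)
```

In this toy model, a potentiation pulse followed by a depression pulse does not return the weight to its starting value, so a stream of small zero-mean requests pulls the weight away from where it started. Skipping requests below the threshold leaves the weight unchanged. The sketch shows the mechanism only and does not reproduce the paper's networks or reported accuracies.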
### Experimental results

- Using these methods, the researchers trained a two-layer perceptron network online and improved the classification accuracy on the MNIST handwritten digit dataset to 87.8% with 6-bit analog synapses and 94.8% with 8-bit analog synapses, even with extremely high asymmetric nonlinearity.

### Summary

Through co-optimization of hardware and algorithms, this paper mitigates the asymmetric nonlinear weight update problem encountered during online training of RRAM synapses in hardware neural networks, improving the training performance and classification accuracy of the model.