Improving DNN Fault Tolerance Using Weight Pruning and Differential Crossbar Mapping for ReRAM-based Edge AI

Geng Yuan, Zhiheng Liao, Xiaolong Ma, Yuxuan Cai, Zhenglun Kong, Xuan Shen, Jingyan Fu, Zhengang Li, Chengming Zhang, Hongwu Peng, Ning Liu, Ao Ren, Jinhui Wang, Yanzhi Wang
DOI: https://doi.org/10.1109/isqed51717.2021.9424332
2021-01-01
Abstract: Recent research has demonstrated the promise of resistive random access memory (ReRAM) as an emerging technology for performing inherently parallel, analog-domain, in-situ matrix-vector multiplication, the key and most intensive computation in deep neural networks (DNNs). However, hardware failures, such as stuck-at-fault defects, are one of the main concerns preventing ReRAM devices from being a feasible solution for real implementations. Existing solutions to this issue usually require an optimization to be conducted for each individual device, which is impractical for mass-produced products (e.g., IoT devices). In this paper, we rethink the value of weight pruning in ReRAM-based DNN design from the perspective of model fault tolerance, and we propose a differential mapping scheme to improve fault tolerance under a high stuck-on fault rate. Our method can tolerate almost an order of magnitude higher failure rate than the traditional two-column method in representative DNN tasks. More importantly, our method incurs no extra hardware cost compared to the traditional two-column mapping scheme. The improvement is universal and does not require a per-device optimization process.
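For readers unfamiliar with the baseline the abstract compares against, the following is a minimal NumPy sketch of a two-column crossbar mapping with random stuck-at faults. It is an illustration under stated assumptions, not the authors' implementation, and it does not reproduce the proposed differential mapping scheme, whose details are not given in the abstract. The function names, the normalized conductance range [0, g_max], and the fault model (a stuck-on cell reads g_max, a stuck-off cell reads 0) are all assumptions made for this example.

```python
import numpy as np

def two_column_map(weights, g_max=1.0):
    """Assumed baseline: each signed weight uses two cells, one in a
    'positive' column and one in a 'negative' column, so w ~ g_pos - g_neg."""
    w_abs_max = np.max(np.abs(weights))
    scale = g_max / w_abs_max if w_abs_max > 0 else 1.0
    g_pos = np.where(weights > 0, weights, 0.0) * scale
    g_neg = np.where(weights < 0, -weights, 0.0) * scale
    return g_pos, g_neg, scale

def inject_stuck_at_faults(g, rate_on, rate_off, g_max=1.0, rng=None):
    """Hypothetical fault model: each cell independently becomes stuck-on
    (reads g_max) or stuck-off (reads 0) with the given probabilities."""
    rng = np.random.default_rng() if rng is None else rng
    r = rng.random(g.shape)
    faulty = g.copy()
    faulty[r < rate_on] = g_max
    faulty[(r >= rate_on) & (r < rate_on + rate_off)] = 0.0
    return faulty

def crossbar_matvec(g_pos, g_neg, x, scale):
    """Idealized analog matrix-vector product: subtract the two column
    currents and undo the conductance scaling."""
    return (g_pos - g_neg) @ x / scale

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((4, 8))
    x = rng.standard_normal(8)
    g_pos, g_neg, scale = two_column_map(w)
    ideal = crossbar_matvec(g_pos, g_neg, x, scale)
    # Inject faults into both columns at illustrative (made-up) rates.
    g_pos_f = inject_stuck_at_faults(g_pos, 0.05, 0.05, rng=rng)
    g_neg_f = inject_stuck_at_faults(g_neg, 0.05, 0.05, rng=rng)
    faulty = crossbar_matvec(g_pos_f, g_neg_f, x, scale)
    print("max abs output error:", np.max(np.abs(ideal - faulty)))
```

In this toy model, a cell that already stores zero conductance is unchanged by a stuck-off fault, whereas a stuck-on fault on the same cell injects a maximal spurious weight. That asymmetry is one way to see why pruning and the mapping scheme interact with the stuck-on fault rate, though the paper's own analysis may differ.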