Enabling High-Quality Uncertainty Quantification in a PIM Designed for Bayesian Neural Network

Xingchen Li, Bingzhe Wu, Guangyu Sun, Zhe Zhang, Zhihang Yuan, Runsheng Wang, Ru Huang, Dimin Niu, Hongzhong Zheng, Zhichao Lu, Liang Zhao, Meng-Fan Marvin Chang, Tianchan Guan, Xin Si
DOI: https://doi.org/10.1109/HPCA53966.2022.00080
2022-01-01
Abstract: Uncertainty quantification measures the prediction uncertainty of a neural network when it faces samples from outside the training distribution. Bayesian Neural Networks (BNNs) can provide high-quality uncertainty quantification by introducing specific noise to the weights during inference. To accelerate BNN inference, a ReRAM processing-in-memory (PIM) architecture is a competitive solution because it provides both highly efficient computing and in-situ noise generation. However, there is typically a large gap between the noise generated by PIM hardware and the noise required by a BNN model. We demonstrate that this gap substantially degrades the quality of uncertainty quantification. To solve this problem, we propose a holistic framework called W2W-PIM. We first introduce an efficient method to generate noise in a ReRAM PIM design according to the demands of a BNN model. In addition, the PIM architecture is carefully modified to enable the noise generation and to evaluate uncertainty quality. Moreover, a calibration unit is introduced to reduce the noise gap caused by imperfections in the noise model. Comprehensive evaluation results demonstrate that the W2W-PIM framework achieves high-quality uncertainty quantification and high energy efficiency at the same time.
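
To make the idea of "introducing noise to the weights during inference" concrete, the following is a minimal software sketch, not the W2W-PIM method itself. It assumes a single linear layer with independent Gaussian weight noise and uses the entropy of the averaged prediction as the uncertainty score; the function names (`softmax`, `mc_predict`) and the parameters `w_mean`, `w_std`, and `n_samples` are hypothetical and do not come from the paper.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def mc_predict(x, w_mean, w_std, n_samples=100, rng=None):
    """Monte Carlo prediction for one noisy linear layer.

    Assumption: each weight is perturbed by independent Gaussian noise,
    w = w_mean + w_std * eps with eps ~ N(0, 1), on every forward pass.
    Returns the averaged class probabilities and their entropy per input.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    probs = []
    for _ in range(n_samples):
        # Draw a fresh noisy copy of the weights for this forward pass.
        w = w_mean + w_std * rng.standard_normal(w_mean.shape)
        probs.append(softmax(x @ w))
    mean_probs = np.mean(probs, axis=0)
    # Predictive entropy: higher values indicate a less certain prediction.
    entropy = -np.sum(mean_probs * np.log(mean_probs + 1e-12), axis=-1)
    return mean_probs, entropy
```

In this sketch the noise scale `w_std` plays the role of the noise that the model demands. If the hardware produced noise with a different scale or distribution, the averaged probabilities and the entropy would change, which is one way to picture the gap the abstract describes.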