Abstract: Bayesian Neural Networks (BNNs) provide principled estimates of model and data uncertainty by encoding parameters as distributions. This makes them key enablers for reliable AI that can be deployed on safety-critical edge systems. These systems can be made resource-efficient by restricting synapses to two synaptic states $\{-1,+1\}$ and using a memristive in-memory computing (IMC) paradigm. However, BNNs pose an additional challenge: they require multiple instantiations for ensembling, which consumes extra energy and area. In this work, we propose a novel sparsity-aware optimization for Bayesian Binary Neural Network (BBNN) accelerators that exploits the inherent BBNN sampling sparsity: most of the network is made up of synapses that have a high probability of being fixed at $\pm1$ and require no sampling. The proposed optimization scheme exploits sampling sparsity at two levels: among layers, i.e., only a few layers of the network contain the majority of the probabilistic synapses, and among parameters, i.e., only a tiny fraction of the parameters in these layers require sampling, which reduces the total sampled parameter count further by up to $86\%$. We demonstrate no loss in accuracy or uncertainty quantification performance for a VGGBinaryConnect network on the CIFAR-100 dataset mapped on a custom sparsity-aware phase change memory (PCM) based IMC simulator. We also develop a simple drift compensation technique to demonstrate robustness to drift-induced degradation. Finally, we project latency, energy, and area for a sparsity-aware BNN implementation in both pipelined and non-pipelined modes. With the sparsity-aware implementation, we estimate up to a $5.3\times$ reduction in area and an $8.8\times$ reduction in energy compared to a non-sparsity-aware implementation. Our approach also yields $2.9\times$ higher power efficiency than the state-of-the-art BNN accelerator.
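The idea of sampling sparsity can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the threshold `eps`, the function names, and the example probabilities are all hypothetical, chosen only to show how synapses whose probability of being $+1$ is close to 0 or 1 can be fixed once, while only the remaining synapses are redrawn for each ensemble member.

```python
import numpy as np

def needs_sampling_mask(p_plus, eps=0.01):
    """Boolean mask of synapses that are treated as probabilistic.

    p_plus[i] is the probability that synapse i takes the value +1.
    eps is a hypothetical threshold, not a value from the paper.
    """
    p_plus = np.asarray(p_plus, dtype=float)
    return (p_plus > eps) & (p_plus < 1.0 - eps)

def sample_binary_weights(p_plus, rng, eps=0.01):
    """Draw one set of {-1, +1} weights, sampling only where needed."""
    p_plus = np.asarray(p_plus, dtype=float)
    mask = needs_sampling_mask(p_plus, eps)
    # Near-deterministic synapses are fixed at their most likely state.
    weights = np.where(p_plus >= 0.5, 1, -1).astype(np.int8)
    # Remaining synapses get a Bernoulli draw per ensemble member.
    draws = rng.random(int(mask.sum())) < p_plus[mask]
    weights[mask] = np.where(draws, 1, -1)
    return weights, mask

rng = np.random.default_rng(0)
# Hypothetical per-synapse probabilities for a tiny layer.
p = np.array([0.999, 0.001, 0.5, 0.997, 0.3, 1.0, 0.0, 0.995])
w, mask = sample_binary_weights(p, rng)
print("fraction of synapses sampled:", mask.mean())
```

In this toy layer only two of the eight synapses fall inside the `(eps, 1 - eps)` band, so only those two would need a random draw per network instantiation; the rest could be stored as fixed binary weights.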

Layer adaptive node selection in Bayesian neural networks: Statistical guarantees and implementation details

Spike-and-slab shrinkage priors for structurally sparse Bayesian neural networks

Neuronized Priors for Bayesian Sparse Linear Regression

Sparse Bayesian Neural Networks: Bridging Model and Parameter Uncertainty through Scalable Variational Inference

High Dimensional Bayesian Network Classification with Network Global-Local Shrinkage Priors

Shaving Weights with Occam's Razor: Bayesian Sparsification for Neural Networks Using the Marginal Likelihood

A modeling framework for detecting and leveraging node-level information in Bayesian network inference

Reliable and Efficient Inference of Bayesian Networks from Sparse Data by Statistical Learning Theory

Impact of Parameter Sparsity on Stochastic Gradient MCMC Methods for Bayesian Deep Learning

Restricted Bayesian Neural Network

Efficient Model Compression for Bayesian Neural Networks

Misclassification bounds for PAC-Bayesian sparse deep learning

Deep Network Regularization via Bayesian Inference of Synaptic Connectivity

Sparsity-Aware Optimization of In-Memory Bayesian Binary Neural Network Accelerators

Bayesian sparsification for deep neural networks with Bayesian model reduction

Rethinking Bayesian Learning for Data Analysis: The Art of Prior and Inference in Sparsity-Aware Modeling

Variational Bayes Neural Network: Posterior Consistency, Classification Accuracy and Computational Challenges

From Bayesian Sparsity to Gated Recurrent Nets

Layer-wise synapse optimization for implementing neural networks on general neuromorphic architectures

Bayesian graph selection consistency under model misspecification

Sparse neural network regression with variable selection