Abstract: Deep learning has revolutionized the last decade, being at the forefront of extraordinary advances in a wide range of tasks including computer vision, natural language processing, and reinforcement learning, to name but a few. However, it is well known that deep models trained via maximum likelihood estimation tend to be overconfident and give poorly calibrated predictions. Bayesian deep learning attempts to address this by placing priors on the model parameters, which are then combined with a likelihood to perform posterior inference. Unfortunately, for deep models, the true posterior is intractable, forcing the user to resort to approximations. In this thesis, we explore the use of variational inference (VI) as an approximation, as it is unique in simultaneously approximating the posterior and providing a lower bound to the marginal likelihood. If tight enough, this lower bound can be used to optimize hyperparameters and to facilitate model selection. However, this capacity has rarely been used to its full extent for Bayesian neural networks, likely because the approximate posteriors typically used in practice can lack the flexibility to effectively bound the marginal likelihood. We therefore explore three aspects of Bayesian learning for deep models: 1) we ask whether it is necessary to perform inference over as many parameters as possible, or whether it is reasonable to treat many of them as optimizable hyperparameters; 2) we propose a variational posterior that provides a unified view of inference in Bayesian neural networks and deep Gaussian processes; 3) we demonstrate how VI can be improved in certain deep Gaussian process models by analytically removing symmetries from the posterior, and performing inference on Gram matrices instead of features. We hope that our contributions will provide a stepping stone to fully realize the promises of VI in the future.

Variational Bayesian Bow tie Neural Networks with Shrinkage

Sparse Bayesian Neural Networks: Bridging Model and Parameter Uncertainty through Scalable Variational Inference

Variational Inference for Bayesian Neural Networks under Model and Parameter Uncertainty

Towards Improved Variational Inference for Deep Bayesian Models

Variational Inference Failures Under Model Symmetries: Permutation Invariant Posteriors for Bayesian Neural Networks

Variational Bayes Neural Network: Posterior Consistency, Classification Accuracy and Computational Challenges

Deterministic Variational Inference for Robust Bayesian Neural Networks

Spike-and-slab shrinkage priors for structurally sparse Bayesian neural networks

Variational Inference on the Final-Layer Output of Neural Networks

Variational Learning of Bayesian Neural Networks Via Bayesian Dark Knowledge

Posterior and variational inference for deep neural networks with heavy-tailed weights

Adaptive variational Bayes: Optimality, computation and applications

Variational EP with Probabilistic Backpropagation for Bayesian Neural Networks

Variational Bayesian Sparsification for Distillation Compression

Variational Bayesian Phylogenetic Inference with Semi-implicit Branch Length Distributions

Sparsifying Bayesian neural networks with latent binary variables and normalizing flows

Reconsidering Analytical Variational Bounds for Output Layers of Deep Networks

Variational Inference: Posterior Threshold Improves Network Clustering Accuracy in Sparse Regimes

On permutation symmetries in Bayesian neural network posteriors: a variational perspective

Variational Bayesian Last Layers

A Variational Approach to Bayesian Phylogenetic Inference