Abstract: We propose a Stein variational gradient descent method to concurrently sparsify, train, and provide uncertainty quantification of a complexly parameterized model such as a neural network. It employs a graph reconciliation and condensation process to reduce complexity and increase similarity in the Stein ensemble of parameterizations. Therefore, the proposed condensed Stein variational gradient (cSVGD) method provides uncertainty quantification on parameters, not just outputs. Furthermore, the parameter reduction speeds up the convergence of the Stein gradient descent as it reduces the combinatorial complexity by aligning and differentiating the sensitivity to parameters. These properties are demonstrated with an illustrative example and an application to a representation problem in solid mechanics.

What problem does this paper attempt to address?

The main problem this paper addresses is uncertainty quantification (UQ) of the parameters of neural networks, especially deep neural networks. It also proposes a new method to simplify the model structure and improve computational efficiency. Specifically, the authors focus on performing uncertainty quantification effectively in high-dimensional parameter spaces, overcoming the challenges posed by the curse of dimensionality, while achieving model sparsification and parameter alignment in the same process.

### Core Problems of the Paper

1. **Uncertainty Quantification in High-Dimensional Parameter Spaces**
   - Neural networks usually have a large number of parameters, which makes traditional uncertainty quantification methods (such as MCMC) difficult to apply efficiently.
   - The authors propose a method based on Stein Variational Gradient Descent (SVGD), called **Condensed Stein Variational Gradient Descent (cSVGD)**, that performs parameter sparsification, training, and uncertainty quantification simultaneously.
2. **Reduction of Model Complexity**
   - Neural networks are usually over-parameterized, that is, they contain redundant parameters. By introducing sparsifying priors, cSVGD can remove unnecessary parameters while maintaining model performance.
   - This not only improves computational efficiency but also makes the model more interpretable.
3. **Parameter Alignment and Similarity Enhancement**
   - Since the parameters in neural networks are fungible, different parameter arrangements may lead to the same output. This poses a challenge to uncertainty quantification.
   - cSVGD makes the parameters of different particles more consistent through the graph reconciliation and condensation processes, thus avoiding the false repulsion, or lack of repulsion, caused by differing parameter arrangements.

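
The fungibility described in item 3 can be illustrated with a small NumPy sketch. It is not taken from the paper: the one-hidden-layer network, the layer sizes, and the names `mlp` and `flatten` are assumptions chosen for illustration. It relabels the hidden units of a network and shows that the outputs are unchanged even though the flattened parameter vectors are far apart, which is why a distance-based interaction between particles can be misleading without alignment.

```python
import numpy as np

def mlp(x, W1, b1, W2, b2):
    """One-hidden-layer network with tanh activation (illustrative only)."""
    return np.tanh(x @ W1 + b1) @ W2 + b2

def flatten(*arrays):
    """Stack all parameter arrays into a single vector."""
    return np.concatenate([a.ravel() for a in arrays])

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 3, 5, 2  # hypothetical sizes
W1 = rng.normal(size=(n_in, n_hidden))
b1 = rng.normal(size=n_hidden)
W2 = rng.normal(size=(n_hidden, n_out))
b2 = rng.normal(size=n_out)

# Relabel hidden units: permute the columns of W1, the entries of b1,
# and the rows of W2 with the same (non-identity) permutation.
perm = np.roll(np.arange(n_hidden), 1)
W1p, b1p, W2p = W1[:, perm], b1[perm], W2[perm, :]

x = rng.normal(size=(4, n_in))
y = mlp(x, W1, b1, W2, b2)
yp = mlp(x, W1p, b1p, W2p, b2)

# Same function, different point in parameter space.
output_gap = np.max(np.abs(y - yp))
param_distance = np.linalg.norm(
    flatten(W1, b1, W2, b2) - flatten(W1p, b1p, W2p, b2)
)
print(f"max output difference: {output_gap:.2e}")
print(f"parameter distance:    {param_distance:.2f}")
```

The output difference is at the level of floating-point round-off, while the parameter distance is not small. Two such particles represent the same model, yet a kernel evaluated on raw parameter vectors would treat them as distinct.
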
### Overview of Solutions

- **Stein Variational Gradient Descent (SVGD)**: A particle-based variational inference method built on Stein's identity, which approximates the posterior distribution with a set of interacting particles.
- **Sparsifying Priors**: Introduce L0 regularization or other sparsifying priors to eliminate unimportant parameters.
- **Graph Reconciliation and Condensation**: Represent the neural network as a directed graph and align the parameters of different particles by maximizing parameter similarity.
- **Concurrent Sparsification and Uncertainty Quantification**: By combining the above, cSVGD performs sparsification and uncertainty quantification simultaneously during training.

### Experimental Verification

The paper verifies the effectiveness of cSVGD through experiments, including an illustrative example and a hyperelastic material modeling problem in solid mechanics. The results show that cSVGD not only reduces the number of model parameters effectively but also provides reliable uncertainty estimates, and that it is more computationally efficient than traditional methods.

In conclusion, the paper proposes a method for high-dimensional parameter uncertainty quantification in neural networks, and improves the interpretability and computational efficiency of the model through sparsification and parameter alignment.

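
For readers unfamiliar with SVGD, the following is a minimal NumPy sketch of the plain SVGD particle update with an RBF kernel. It is not the paper's cSVGD: it omits the sparsifying prior and the graph reconciliation and condensation steps. The fixed bandwidth `h`, the step size, the standard-normal target, and all function names are assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(X, h):
    """RBF kernel matrix and the summed kernel gradients used by SVGD.

    K[i, j]   = exp(-||x_i - x_j||^2 / h)
    grad_K[i] = sum_j grad_{x_j} K[j, i] = (2 / h) * sum_j K[i, j] * (x_i - x_j)
    """
    diff = X[:, None, :] - X[None, :, :]          # diff[i, j] = x_i - x_j
    K = np.exp(-np.sum(diff**2, axis=-1) / h)
    grad_K = (2.0 / h) * np.sum(K[:, :, None] * diff, axis=1)
    return K, grad_K

def svgd_step(X, grad_log_p, step=0.1, h=1.0):
    """One SVGD update: kernel-weighted attraction to high density plus repulsion."""
    K, grad_K = rbf_kernel(X, h)
    phi = (K @ grad_log_p(X) + grad_K) / X.shape[0]
    return X + step * phi

def grad_log_standard_normal(X):
    """Score of a standard normal target (assumed here for illustration)."""
    return -X

rng = np.random.default_rng(0)
particles = 3.0 + 0.5 * rng.normal(size=(50, 2))  # start away from the target
for _ in range(500):
    particles = svgd_step(particles, grad_log_standard_normal)

final_mean = particles.mean(axis=0)
final_std = particles.std(axis=0)
print("mean:", final_mean)
print("std: ", final_std)
```

The first term in `phi` pulls particles toward regions of high posterior density, and the second term pushes them apart so that the ensemble does not collapse to a single mode. That repulsion depends on distances between parameter vectors, which is where parameter alignment across particles matters.
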

Condensed Stein Variational Gradient Descent for Uncertainty Quantification of Neural Networks

Uncertainty Quantification of Graph Convolution Neural Network Models of Evolving Processes

Improving the performance of Stein variational inference through extreme sparsification of physically-constrained neural network models

Optimizing Quantized Neural Networks in a Weak Curvature Manifold

Heat Equation Stein Variational Ensemble: Rethinking and Advancing Uncertainty-Aware Soft Sensor Modeling

Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm

Augmented Message Passing Stein Variational Gradient Descent

On the mean-field limit for Stein variational gradient descent: stability and multilevel approximation

Stochastic Sub-Sampled Newton Method with Variance Reduction

Annealed Stein Variational Gradient Descent for Improved Uncertainty Estimation in Full-Waveform Inversion

Stein Variational Evolution Strategies

Stein Variational Inference for Discrete Distributions

Accelerating Convergence of Stein Variational Gradient Descent via Deep Unfolding

Variational Stochastic Gradient Descent for Deep Neural Networks

Bayesian Deep Convolutional Encoder-Decoder Networks for Surrogate Modeling and Uncertainty Quantification

Improved Finite-Particle Convergence Rates for Stein Variational Gradient Descent

DiffHybrid-UQ: Uncertainty Quantification for Differentiable Hybrid Neural Modeling

Variance extrapolation method for neural-network variational Monte Carlo

Resampling Stochastic Gradient Descent Cheaply for Efficient Uncertainty Quantification

Riemannian Stein Variational Gradient Descent for Bayesian Inference

SLANG: Fast Structured Covariance Approximations for Bayesian Deep Learning with Natural Gradient