Normal approximation of Random Gaussian Neural Networks

Nicola Apollonio, Daniela De Canditiis, Giovanni Franzina, Paola Stolfi, Giovanni Luca Torrisi

2023-09-20

Abstract: In this paper we provide explicit upper bounds on some distances between the (law of the) output of a random Gaussian neural network (NN) and (the law of) a random Gaussian vector. Our results concern both shallow random Gaussian neural networks with univariate output and fully connected and deep random Gaussian neural networks, with a rather general activation function. The upper bounds show how the widths of the layers, the activation functions and other architecture parameters affect the Gaussian approximation of the output. Our techniques, relying on Stein's method and integration by parts formulas for the Gaussian law, yield estimates on distances which are indeed integral probability metrics, and include the total variation and the convex distances. These latter metrics are defined by testing against indicator functions of suitable measurable sets, and so allow for accurate estimates of the probability that the output is localized in some region of the space. Such estimates are of significant interest from both a practitioner's and a theorist's perspective.
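
For orientation, the metrics named in the abstract can be written in their usual textbook form. This is a sketch of the standard definitions, not a quotation from the paper, whose notation and normalization may differ. The symbols $X$, $Y$, $\mathcal{H}$, $d_{TV}$ and $d_c$ are notation chosen here for illustration.

$$
d_{\mathcal{H}}(X, Y) = \sup_{h \in \mathcal{H}} \bigl| \mathbb{E}[h(X)] - \mathbb{E}[h(Y)] \bigr|
$$

$$
d_{TV}(X, Y) = \sup_{B \ \text{Borel}} \bigl| \mathbb{P}(X \in B) - \mathbb{P}(Y \in B) \bigr|,
\qquad
d_{c}(X, Y) = \sup_{C \ \text{convex, measurable}} \bigl| \mathbb{P}(X \in C) - \mathbb{P}(Y \in C) \bigr|
$$

Both are integral probability metrics with $\mathcal{H}$ a class of indicator functions. Since the convex sets form a subclass of the Borel sets, $d_c \le d_{TV}$. A bound on either distance directly bounds the error made when the probability that the output falls in such a set is replaced by the corresponding Gaussian probability.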

Probability, Analysis of PDEs

What problem does this paper attempt to address?

The paper addresses the Gaussian approximation of the output of random Gaussian neural networks. Specifically, it gives explicit upper bounds on the distance between the law of the output and the law of a random Gaussian vector, and shows how these bounds depend on the layer widths, the activation functions, and other architecture parameters. The main contributions of the paper are:

1. Explicit upper bounds for the Gaussian approximation in the convex distance and the 1-Wasserstein distance for deep random Gaussian neural networks.
2. Bounds that hold even when the activation functions are not Lipschitz continuous.
3. Explicit upper bounds in the Kolmogorov, total variation, and 1-Wasserstein distances for shallow random Gaussian neural networks with univariate output.
4. Proofs based on Stein's method and integration by parts formulas for the Gaussian law. Because the total variation and convex distances test against indicator functions of sets, the bounds also control the probability that the output lies in a given region of space.

These results are of interest to both theorists and practitioners, since they quantify how close the output of a random neural network is to Gaussian.
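
The dependence on width in the shallow, univariate case can be illustrated with a small simulation. The sketch below is not the paper's method and does not reproduce its bounds. It assumes a hypothetical one-hidden-layer network F(x) = width^(-1/2) * sum_j v_j * relu(w_j * x + b_j) with scalar input and independent standard Gaussian w_j, b_j, v_j, and it estimates the Kolmogorov distance between F(x) and a centered Gaussian empirically. Under these assumptions the output variance is (x^2 + 1) / 2 for every width, so that value is used for the Gaussian target. All function names are illustrative.

```python
import math
import random


def relu(t):
    return max(t, 0.0)


def sample_output(width, x, rng):
    """One draw of F(x) = width**-0.5 * sum_j v_j * relu(w_j * x + b_j).

    Assumption: w_j, b_j, v_j are independent N(0, 1) variables.
    """
    total = 0.0
    for _ in range(width):
        w = rng.gauss(0.0, 1.0)
        b = rng.gauss(0.0, 1.0)
        v = rng.gauss(0.0, 1.0)
        total += v * relu(w * x + b)
    return total / math.sqrt(width)


def normal_cdf(t, sigma):
    """CDF of a centered Gaussian with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(t / (sigma * math.sqrt(2.0))))


def empirical_kolmogorov(samples, sigma):
    """Kolmogorov-Smirnov statistic of the samples against N(0, sigma**2)."""
    xs = sorted(samples)
    n = len(xs)
    worst = 0.0
    for i, t in enumerate(xs):
        cdf = normal_cdf(t, sigma)
        # Compare the Gaussian CDF with the empirical CDF just before
        # and just after the jump at the i-th order statistic.
        worst = max(worst, abs((i + 1) / n - cdf), abs(i / n - cdf))
    return worst


def kolmogorov_for_width(width, x=1.0, n_samples=1000, seed=0):
    """Monte Carlo estimate of the Kolmogorov distance at a given width."""
    rng = random.Random(seed)
    # Var F(x) = E[relu(Z)**2] with Z ~ N(0, x**2 + 1), i.e. (x**2 + 1) / 2.
    sigma = math.sqrt((x * x + 1.0) / 2.0)
    samples = [sample_output(width, x, rng) for _ in range(n_samples)]
    return empirical_kolmogorov(samples, sigma)
```

Calling `kolmogorov_for_width` for increasing widths (for example 1, 4, 16, 64) should show the estimate shrinking until it reaches the sampling noise floor, which is of order `n_samples ** -0.5`. At width 1 the output has an atom at zero, so its distance to the Gaussian stays large regardless of the sample size.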

Wide Deep Neural Networks with Gaussian Weights are Very Close to Gaussian Processes

Non-asymptotic approximations of Gaussian neural networks via second-order Poincaré inequalities

Gaussian random field approximation via Stein's method with applications to wide random neural networks

Finite Neural Networks as Mixtures of Gaussian Processes: From Provable Error Bounds to Prior Selection

Quantitative CLTs in Deep Neural Networks

Random ReLU Neural Networks as Non-Gaussian Processes

Approximation Bounds for Norm Constrained Neural Networks with Applications to Regression and GANs

Multi-layer random features and the approximation power of neural networks

Deep quantum neural networks form Gaussian processes

Large and moderate deviations for Gaussian neural networks

Bounds on the Bethe Free Energy for Gaussian Networks

A multivariate normal approximation for the Dirichlet density and some applications

Deep Neural Networks as Gaussian Processes

Gaussian universality for approximately polynomial functions of high-dimensional data

Approximation Properties of Gaussian-binary Restricted Boltzmann Machines and Gaussian-binary Deep Belief Networks

Noncompact uniform universal approximation

Quantitative convergence of trained quantum neural networks to a Gaussian process

A general approximation lower bound in $L^p$ norm, with applications to feed-forward neural networks

Neural networks in non-metric spaces

Numerical Approximation Capacity of Neural Networks with Bounded Parameters: Do Limits Exist, and How Can They Be Measured?