Abstract: The problem of solving partial differential equations (PDEs) can be formulated into a least-squares minimization problem, where neural networks are used to parametrize PDE solutions. A global minimizer corresponds to a neural network that solves the given PDE. In this paper, we show that the gradient descent method can identify a global minimizer of the least-squares optimization for solving second-order linear PDEs with two-layer neural networks under the assumption of over-parametrization. We also analyze the generalization error of the least-squares optimization for second-order linear PDEs and two-layer neural networks, when the right-hand-side function of the PDE is in a Barron-type space and the least-squares optimization is regularized with a Barron-type norm, without the over-parametrization assumption.

What problem does this paper attempt to address?

This paper addresses the optimization and generalization theory of solving partial differential equations (PDEs) with two-layer neural networks. Specifically, the authors focus on the following two core issues:

1. **Optimization convergence**: Under what conditions does the gradient descent method converge to a global minimizer of the least-squares loss for second-order linear PDEs?
2. **Generalization error analysis**: When the right-hand-side function of the PDE is in a Barron-type space and the least-squares optimization is regularized with the path norm, without the over-parametrization assumption, how large is the gap between the global minimum of the empirical loss and the global minimum of the overall (population) loss?

### Detailed Explanation

#### Optimization Convergence

The paper shows that, under the over-parametrization assumption, gradient descent can identify a global minimizer of the least-squares loss when a two-layer neural network is used to solve a second-order linear PDE. Specifically, when the number of parameters in the network is large enough, gradient descent converges to a global minimum of the empirical loss at a linear rate.

#### Generalization Error Analysis

The authors also analyze the gap between the empirical risk and the overall (population) risk when a two-layer neural network is used to solve a second-order linear PDE. In particular, they prove that the a posteriori generalization error can be bounded in terms of the path norm of the network parameters, and the a priori generalization error in terms of the Barron norm of the right-hand-side function.

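The setting described above can be illustrated with a small numerical sketch. It is not the authors' construction: it assumes a one-dimensional operator `L u = -u'' + u` on (0, 1), a `tanh` two-layer network with bias terms, a squared loss on the PDE residual, and a right-hand side manufactured from `u(x) = sin(pi x)`. All function names are hypothetical, and boundary conditions are left out. The script runs plain gradient descent on the empirical risk and then compares it with the same risk estimated on fresh uniform samples, which is the kind of gap the generalization analysis is about.

```python
import numpy as np

def neuron_terms(x, w, b):
    """Apply L u = -u'' + u (assumed operator) to each neuron tanh(w*x + b)."""
    t = np.tanh(np.outer(x, w) + b)                   # shape (n, m)
    s = 1.0 - t**2                                    # tanh'(z)
    g = 2.0 * w**2 * t * s + t                        # -w^2 * tanh''(z) + tanh(z)
    dg_dz = 2.0 * w**2 * s * (1.0 - 3.0 * t**2) + s   # derivative of g in z = w*x + b
    dg_dw = 4.0 * w * t * s                           # explicit w-dependence through w^2
    return g, dg_dz, dg_dw

def risk(x, f, a, w, b):
    """Mean squared PDE residual over the sample points x (squared loss is an assumption)."""
    g, _, _ = neuron_terms(x, w, b)
    return float(np.mean((g @ a - f) ** 2))

def train(x, f, m=16, lr=2e-4, steps=1000, seed=0):
    """Plain gradient descent on all parameters of the two-layer network."""
    rng = np.random.default_rng(seed)
    a = rng.normal(size=m) / np.sqrt(m)               # outer weights
    w = rng.normal(size=m)                            # inner weights
    b = rng.normal(size=m)                            # biases (an assumption of this sketch)
    n = x.size
    history = [risk(x, f, a, w, b)]
    for _ in range(steps):
        g, dg_dz, dg_dw = neuron_terms(x, w, b)
        r = g @ a - f                                 # PDE residual at each sample
        grad_a = (2.0 / n) * (g.T @ r)
        grad_b = (2.0 / n) * a * (dg_dz.T @ r)
        grad_w = (2.0 / n) * a * ((dg_dw + dg_dz * x[:, None]).T @ r)
        a, w, b = a - lr * grad_a, w - lr * grad_w, b - lr * grad_b
        history.append(risk(x, f, a, w, b))
    return (a, w, b), history

def rhs(x):
    """Right-hand side for the manufactured solution u(x) = sin(pi x)."""
    return (np.pi**2 + 1.0) * np.sin(np.pi * x)

rng = np.random.default_rng(1)
x_train = rng.uniform(0.0, 1.0, size=64)              # training samples from U(0, 1)
theta, history = train(x_train, rhs(x_train))

x_fresh = rng.uniform(0.0, 1.0, size=10_000)          # Monte Carlo proxy for the population risk
gap = abs(risk(x_fresh, rhs(x_fresh), *theta) - history[-1])
print("initial risk:", history[0], "final risk:", history[-1], "estimated gap:", gap)
```

The step size and width here are illustrative only. The linear-rate guarantee in the paper relies on over-parametrization, which this small network does not attempt to reproduce.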
### Summary of Mathematical Formulas

- **Empirical Risk**:
  \[ R_S(\theta):=\frac{1}{n}\sum_{i = 1}^n\ell(L\phi(x_i;\theta), f(x_i)) \]
- **Population (Overall) Risk**:
  \[ R_D(\theta):=\mathbb{E}_{x\sim U(\Omega)}[\ell(L\phi(x;\theta), f(x))] \]
- **Gradient Descent Update Rule** (written in continuous time, i.e. as a gradient flow):
  \[ \dot{\theta}=-\nabla_\theta R_S(\theta) \]
- **Linear Convergence Rate Theorem**:
  \[ R_S(\theta(t))\leq\exp\left(-\frac{m\lambda_S t}{n}\right)R_S(\theta_0) \]
- **A Posteriori Generalization Error Bound**:
  \[ |R_D(\theta)-R_S(\theta)|\leq\frac{(\|\theta\|_P + 1)^2}{\sqrt{n}}\cdot2M^2\left(14d^2\sqrt{2\log(2d)}+\log[\pi(\|\theta\|_P + 1)]+\sqrt{2\log\left(\frac{1}{3\delta}\right)}\right) \]
- **A Priori Generalization Error Bound**:
  \[ R_D(\theta_{S,\lambda})\leq\frac{6M^2\|f\|^2_B}{m}+\frac{\|f\|^2_B + 1}{\sqrt{n}}\left(4\lambda+16M^2\right)\left\{\log[\pi(2\|f\|_B + 1)]+14d^2\sqrt{\log(2d)}+\sqrt{\log\left(\frac{2}{3\delta}\right)}\right\} \]

Through these theoretical results, the authors provide a theoretical foundation for using deep-learning methods to solve partial differential equations and lay the groundwork for further research on higher-order partial differential equations and applications in other fields.
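To get a feel for how the a posteriori (posterior) generalization bound in the formula summary above scales, the following sketch evaluates its right-hand side for given values. The formula is transcribed as printed on this page. The function name is hypothetical, and reading `M` as a constant of the PDE problem and `delta` as a failure probability is an assumption. As printed, the last square root is only real for `delta < 1/3`, so the sketch enforces that.

```python
import math

def a_posteriori_gap_bound(path_norm, n, d, M, delta):
    """Right-hand side of the a posteriori bound on |R_D(theta) - R_S(theta)|, as printed."""
    if not 0.0 < delta < 1.0 / 3.0:
        raise ValueError("delta must lie in (0, 1/3) so that log(1/(3*delta)) is positive")
    scale = (path_norm + 1.0) ** 2 / math.sqrt(n) * 2.0 * M ** 2
    dimension_term = 14.0 * d ** 2 * math.sqrt(2.0 * math.log(2.0 * d))
    norm_term = math.log(math.pi * (path_norm + 1.0))
    confidence_term = math.sqrt(2.0 * math.log(1.0 / (3.0 * delta)))
    return scale * (dimension_term + norm_term + confidence_term)

# Illustrative values only: quadrupling the sample size should halve the bound.
b_small = a_posteriori_gap_bound(path_norm=5.0, n=1_000, d=2, M=1.0, delta=0.01)
b_large = a_posteriori_gap_bound(path_norm=5.0, n=4_000, d=2, M=1.0, delta=0.01)
print(b_small, b_large, b_small / b_large)
```

With all other quantities fixed, the bound depends on the sample size only through the factor `1/sqrt(n)`, and it grows roughly quadratically in the path norm and in the dimension `d`.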

Two-Layer Neural Networks for Partial Differential Equations: Optimization and Generalization Theory

Two-scale Neural Networks for Partial Differential Equations with Small Parameters

A Priori Estimation of the Approximation, Optimization and Generalization Errors of Random Neural Networks for Solving Partial Differential Equations

Some elliptic second order problems and neural network solutions: Existence and error estimates

Transferable Neural Networks for Partial Differential Equations

Optimally weighted loss functions for solving PDEs with Neural Networks

A Comparative Analysis of Optimization and Generalization Properties of Two-Layer Neural Network and Random Feature Models under Gradient Descent Dynamics

PDE Models for Deep Neural Networks: Learning Theory, Calculus of Variations and Optimal Control

A Natural Primal-Dual Hybrid Gradient Method for Adversarial Neural Network Training on Solving Partial Differential Equations

A Discussion on Solving Partial Differential Equations using Neural Networks

PDE-constrained Models with Neural Network Terms: Optimization and Global Convergence

A Least-Squares-Based Neural Network (LS-Net) for Solving Linear Parametric PDEs

A Comparative Analysis of the Optimization and Generalization Property of Two-layer Neural Network and Random Feature Models Under Gradient Descent Dynamics

Theoretical properties of the global optimizer of two layer neural network

Learning Partial Differential Equations from Data Using Neural Networks

On the approximation of the solution of partial differential equations by artificial neural networks trained by a multilevel Levenberg-Marquardt method

Three ways to solve partial differential equations with neural networks — A review

Neural Generalized Ordinary Differential Equations with Layer-varying Parameters

An Overview on Machine Learning Methods for Partial Differential Equations: from Physics Informed Neural Networks to Deep Operator Learning

Characterizing and Mitigating the Difficulty in Training Physics-informed Artificial Neural Networks under Pointwise Constraints

Convolution-Based Model-Solving Method for Three-Dimensional, Unsteady, Partial Differential Equations