Near-optimal learning of Banach-valued, high-dimensional functions via deep neural networks

Ben Adcock, Simone Brugiapaglia, Nick Dexter, Sebastian Moraga
2024-07-17
Abstract: The past decade has seen increasing interest in applying Deep Learning (DL) to Computational Science and Engineering (CSE). Driven by impressive results in applications such as computer vision, Uncertainty Quantification (UQ), genetics, simulations and image processing, DL is increasingly supplanting classical algorithms, and seems poised to revolutionize scientific computing. However, DL is not yet well-understood from the standpoint of numerical analysis. Little is known about the efficiency and reliability of DL from the perspectives of stability, robustness, accuracy, and sample complexity. In particular, approximating solutions to parametric PDEs is an objective of UQ for CSE. Training data for such problems is often scarce and corrupted by errors. Moreover, the target function is a possibly infinite-dimensional smooth function taking values in the PDE solution space, generally an infinite-dimensional Banach space. This paper provides arguments for Deep Neural Network (DNN) approximation of such functions, with both known and unknown parametric dependence, that overcome the curse of dimensionality. We establish practical existence theorems that describe classes of DNNs with dimension-independent architecture size and training procedures based on minimizing the (regularized) $\ell^2$-loss which achieve near-optimal algebraic rates of convergence. These results involve key extensions of compressed sensing for Banach-valued recovery and polynomial emulation with DNNs. When approximating solutions of parametric PDEs, our results account for all sources of error, i.e., sampling, optimization, approximation and physical discretization, and allow for training high-fidelity DNN approximations from coarse-grained sample data. Our theoretical results fall into the category of non-intrusive methods, providing a theoretical alternative to classical methods for high-dimensional approximation.
Numerical Analysis
What problem does this paper attempt to address?
The paper addresses how to use deep neural networks (DNNs) to approximate high-dimensional or infinite-dimensional functions that take values in Banach spaces while overcoming the curse of dimensionality. Specifically, it focuses on learning such functions efficiently from finite sample data, both when the parametric dependence is known and when it is unknown. The core problem is explained in detail below.

### Research Background and Motivation

1. **Approximation of High-Dimensional or Infinite-Dimensional Functions**:
   - In computational science and engineering (CSE), many problems require approximating high-dimensional or infinite-dimensional functions. One example is approximating the solutions of parametric partial differential equations (PDEs) in uncertainty quantification (UQ).
   - These functions usually take values in infinite-dimensional Banach spaces, and traditional numerical methods suffer from the curse of dimensionality on such problems.
2. **Data Scarcity and Error**:
   - In practical applications, each sample can be very expensive to obtain, so data is often scarce.
   - Data may also be affected by various errors, such as measurement errors, physical discretization errors, and optimization errors.
3. **Application of Deep Learning**:
   - Deep learning (DL) has achieved remarkable results in fields such as computer vision, genetics, simulation, and image processing, showing its potential for scientific computing.
   - However, the efficiency and reliability of DL are not fully understood from the standpoint of numerical analysis, especially in terms of sample complexity.

### Main Contributions of the Paper

1. **Practical Existence Theorems**:
   - The paper establishes several practical existence theorems that describe DNN architectures and training procedures that overcome the curse of dimensionality.
   - These theorems consider not only sample complexity but also stability, robustness, and accuracy.
2. **Error Analysis**:
   - The paper derives a comprehensive error bound that accounts for approximation error, physical discretization error, sampling error, and optimization error.
   - Through the analysis of this bound, the paper shows the effectiveness of DNNs for approximating high-dimensional or infinite-dimensional Banach-valued functions.
3. **Scope of Application**:
   - The methods apply both when the anisotropy of the target function is known and when it is unknown.
   - In addition to Hilbert spaces, the paper considers function approximation in Banach spaces, which has been studied less in previous research.

### Specific Problem Description

1. **High-Dimensional or Infinite-Dimensional Functions**:
   - Consider a function \( f: U \to V \), where \( U = [-1,1]^{\mathbb{N}} \) is the centered infinite-dimensional hypercube and \( V \) is a Banach space.
   - The sample points \( y_1,\ldots,y_m \sim_{\text{i.i.d.}} \rho \) are drawn from \( \rho \), which is either the uniform or the Chebyshev measure, and the measured values are \( d_i = f(y_i)+n_i \), where \( n_i \) represents the measurement error.
2. **Anisotropy Assumption**:
   - High-dimensional approximation usually requires some anisotropy assumption, that is, the function depends more strongly on certain variables than on others.
   - The paper considers both the known and the unknown anisotropy case and designs corresponding DNN architectures and training procedures for each.
3. **Error Bound**:
   - The paper gives an error bound of the form
     \[ \| f - \hat{f} \|_{L^2_\rho(U; V)} \lesssim E_{\text{app}}+m^\theta (E_{\text{disc}}+E_{\text{samp}}+E_{\text{opt}}) \]
     where \( \hat{f} \) is the learned DNN approximation, \( m \) is the number of samples, and:
     - \( E_{\text{app}} \) is the approximation error.
     - \( E_{\text{disc}} \) is the physical discretization error.
     - \( E_{\text{samp}} \) is the sampling error.
     - \( E_{\text{opt}} \) is the optimization error.
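
The sampling model from the Specific Problem Description, \( d_i = f(y_i)+n_i \) with \( y_i \) drawn i.i.d. from the uniform or Chebyshev measure, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the domain is truncated to a finite dimension `d`, the Banach space \( V \) is replaced by \( \mathbb{R}^K \), the target `toy_f` is a hypothetical function whose weights decay with the variable index to mimic anisotropy, and the noise is Gaussian.

```python
import math
import random


def sample_points(m, d, measure="uniform", rng=None):
    """Draw m i.i.d. points from a product measure on [-1, 1]^d."""
    if rng is None:
        rng = random.Random(0)
    points = []
    for _ in range(m):
        if measure == "uniform":
            y = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        elif measure == "chebyshev":
            # If u ~ Uniform(0, 1), then cos(pi * u) follows the Chebyshev
            # (arcsine) law with density 1 / (pi * sqrt(1 - y^2)) on [-1, 1].
            y = [math.cos(math.pi * rng.random()) for _ in range(d)]
        else:
            raise ValueError(f"unknown measure: {measure!r}")
        points.append(y)
    return points


def toy_f(y, K=4):
    """Hypothetical target with values in R^K (a stand-in for V).

    The weights 1 / (j + 1)^2 make later variables matter less,
    which is one simple form of anisotropy.
    """
    s = sum(y_j / (j + 1) ** 2 for j, y_j in enumerate(y))
    # Since |s| < pi^2 / 6 < 2, every denominator k + 2 + s stays positive.
    return [1.0 / (k + 2.0 + s) for k in range(K)]


def make_training_data(m, d, noise_level=0.01, measure="uniform", seed=0):
    """Return pairs (y_i, d_i) with d_i = f(y_i) + n_i."""
    rng = random.Random(seed)
    data = []
    for y in sample_points(m, d, measure, rng):
        # n_i: assumed Gaussian noise, added componentwise.
        noisy = [v + rng.gauss(0.0, noise_level) for v in toy_f(y)]
        data.append((y, noisy))
    return data


if __name__ == "__main__":
    for y, d_i in make_training_data(m=3, d=5, measure="chebyshev"):
        print([round(t, 3) for t in y], [round(v, 3) for v in d_i])
```

A DNN \( \hat{f} \) would then be fit to the pairs \( (y_i, d_i) \) by minimizing a (regularized) \( \ell^2 \)-loss. That training step is omitted here because the architectures and regularization used in the paper are not described in this summary.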