Abstract: We propose derivative-informed neural operators (DINOs), a general family of neural networks that approximate operators as infinite-dimensional mappings from input function spaces to output function spaces or quantities of interest. After discretization, both inputs and outputs are high-dimensional. We aim to approximate not only the operators with improved accuracy but also their derivatives (Jacobians) with respect to the input function-valued parameter, in order to empower derivative-based algorithms in many applications, e.g., Bayesian inverse problems, optimization under parameter uncertainty, and optimal experimental design. The major difficulties include the computational cost of generating derivative training data and the high dimensionality of the problem, which leads to large training cost. To address these challenges, we exploit the intrinsic low-dimensionality of the derivatives and develop algorithms for compressing derivative information and efficiently imposing it in neural operator training, yielding derivative-informed neural operators. We demonstrate that these advances can significantly reduce the costs of both data generation and training for large classes of problems (e.g., nonlinear steady-state parametric PDE maps), making the costs marginal or comparable to the costs without using derivatives, and in particular independent of the discretization dimension of the input and output functions. Moreover, we show that the proposed DINO achieves significantly higher accuracy than neural operators trained without derivative information, for both function approximation and derivative approximation (e.g., Gauss-Newton Hessian), especially when the training data are limited.
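To make the idea of compressing derivative information and imposing it during training concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the one-hidden-layer tanh surrogate, the orthonormal bases `U` and `V`, the equal weighting of the two loss terms, and all function names are hypothetical choices made for this example.

```python
import numpy as np


def mlp_with_jacobian(params, m):
    """Hypothetical surrogate: one-hidden-layer tanh network.

    Returns the output and its Jacobian with respect to the input m.
    """
    W1, b1, W2, b2 = params
    h = np.tanh(W1 @ m + b1)
    out = W2 @ h + b2
    # Chain rule: d(out)/dm = W2 @ diag(1 - h^2) @ W1
    jac = (W2 * (1.0 - h ** 2)) @ W1
    return out, jac


def reduce_jacobian(jac, U, V):
    """Compress a (d_out, d_in) Jacobian to (r, r) with bases U and V."""
    return U.T @ jac @ V


def derivative_informed_loss(params, inputs, outputs, reduced_jacs, U, V):
    """Mean squared output misfit plus mean squared reduced-Jacobian misfit.

    The equal weighting of the two terms is an assumption of this sketch.
    """
    total = 0.0
    for m, u, jr in zip(inputs, outputs, reduced_jacs):
        out, jac = mlp_with_jacobian(params, m)
        total += np.sum((out - u) ** 2)
        total += np.sum((reduce_jacobian(jac, U, V) - jr) ** 2)
    return total / len(inputs)


def random_params(rng, d_in, d_out, hidden):
    """Random network weights, scaled to keep activations moderate."""
    return (
        rng.normal(size=(hidden, d_in)) / np.sqrt(d_in),
        rng.normal(size=hidden),
        rng.normal(size=(d_out, hidden)) / np.sqrt(hidden),
        rng.normal(size=d_out),
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d_in, d_out, hidden, r = 200, 150, 16, 4

    # Assumed orthonormal reduced bases (random here, purely for illustration).
    U = np.linalg.qr(rng.normal(size=(d_out, r)))[0]
    V = np.linalg.qr(rng.normal(size=(d_in, r)))[0]

    # Synthetic "truth" used only to generate training data for the demo.
    truth = random_params(rng, d_in, d_out, hidden)
    inputs = [rng.normal(size=d_in) for _ in range(8)]
    outputs, reduced_jacs = [], []
    for m in inputs:
        out, jac = mlp_with_jacobian(truth, m)
        outputs.append(out)
        reduced_jacs.append(reduce_jacobian(jac, U, V))  # stored as r x r

    guess = random_params(rng, d_in, d_out, hidden)
    print("stored derivative shape:", reduced_jacs[0].shape)
    print("loss at a random guess:",
          derivative_informed_loss(guess, inputs, outputs, reduced_jacs, U, V))
```

In this sketch each stored derivative sample has shape `(r, r)` regardless of `d_in` and `d_out`, which illustrates how compressed derivative data can be kept independent of the discretization dimension. The sketch forms the full surrogate Jacobian only for readability; a cost-conscious version would apply the bases without assembling it.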

Nonuniform random feature models using derivative information

Optimizing Quantized Neural Networks in a Weak Curvature Manifold

Universal approximation property of Banach space-valued random feature models including random neural networks

Sparse deep neural networks for nonparametric estimation in high-dimensional sparse regression

Deep Neural Networks are Adaptive to Function Regularity and Data Distribution in Approximation and Estimation

A Comparative Analysis of Optimization and Generalization Properties of Two-Layer Neural Network and Random Feature Models under Gradient Descent Dynamics

Optimal Rates of Approximation by Shallow ReLU Neural Networks and Applications to Nonparametric Regression

Derivative-Informed Neural Operator: An Efficient Framework for High-Dimensional Parametric Derivative Learning

Function Approximation with Randomly Initialized Neural Networks for Approximate Model Reference Adaptive Control

Nonparametric regression using over-parameterized shallow ReLU neural networks

The Geometric Occam's Razor Implicit in Deep Learning

Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks

Improving the Noise Estimation of Latent Neural Stochastic Differential Equations

Data Driven Threshold and Potential Initialization for Spiking Neural Networks

A Comparative Analysis of the Optimization and Generalization Property of Two-layer Neural Network and Random Feature Models Under Gradient Descent Dynamics

Function and derivative approximation by shallow neural networks

Data-driven Derivation of Partial Differential Equations Using Neural Network Model

Nonparametric Estimation via Partial Derivatives

Random ReLU Neural Networks as Non-Gaussian Processes

Robust Training and Initialization of Deep Neural Networks: An Adaptive Basis Viewpoint