Regularity-Conforming Neural Networks (ReCoNNs) for solving Partial Differential Equations

Jamie M. Taylor, David Pardo, Judit Muñoz-Matute
2024-05-23
Abstract: Whilst the Universal Approximation Theorem guarantees the existence of approximations to Sobolev functions -- the natural function spaces for PDEs -- by Neural Networks (NNs) of sufficient size, low-regularity solutions may lead to poor approximations in practice. For example, classical fully-connected feed-forward NNs fail to approximate continuous functions whose gradient is discontinuous when employing strong formulations like in Physics Informed Neural Networks (PINNs). In this article, we propose the use of regularity-conforming neural networks, where a priori information on the regularity of solutions to PDEs can be employed to construct proper architectures. We illustrate the potential of such architectures via a two-dimensional (2D) transmission problem, where the solution may admit discontinuities in the gradient across interfaces, as well as power-like singularities at certain points. In particular, we formulate the weak transmission problem in a PINNs-like strong formulation with interface and continuity conditions. Such architectures are partially explainable; discontinuities are explicitly described, allowing the introduction of novel terms into the loss function. We demonstrate via several model problems in one and two dimensions the advantages of using regularity-conforming architectures in contrast to classical architectures. The ideas presented in this article easily extend to problems in higher dimensions.
Numerical Analysis
What problem does this paper attempt to address?
The paper addresses the difficulties that arise when neural networks, especially Physics-Informed Neural Networks (PINNs), are used to solve Partial Differential Equations (PDEs) whose solutions have low regularity. Traditional Fully Connected Feedforward Neural Networks (FCNNs) may not approximate such solutions well. To tackle this, the paper proposes a new network architecture, Regularity-Conforming Neural Networks (ReCoNNs).

### Main Problems Addressed by the Paper

1. **Approximation of Low Regularity Solutions**:
   - When the solution of a PDE has discontinuous derivatives or singular behavior at certain points or surfaces, traditional neural network architectures (such as FCNNs with ReLU or tanh activation functions) may not approximate it effectively.
   - Such solutions typically appear in transmission problems with discontinuous material interfaces, where the solution may exhibit gradient jumps across the interfaces and power-law singularities at certain special points.
2. **Optimization Issues**:
   - During training, if the loss function includes first-order derivatives of the solution, using ReLU as the activation function may lead to numerical instability: the second-order derivative of ReLU is a Dirac $\delta$ function at $x=0$, which makes the gradient of the loss function difficult to estimate accurately.
   - Traditional neural networks with smooth activation functions (like tanh) can in theory approximate low regularity solutions, but in practice they may exhibit the Gibbs phenomenon: high-frequency oscillations near singularities that degrade the approximation quality.

### Solutions

1. **ReCoNNs Architecture Design**:
   - Use prior knowledge about the regularity and singular structure of the solution to design an appropriate network architecture.
   - For problems whose derivatives are discontinuous across interfaces, combinations of absolute value functions can be added to the network output to describe the gradient jumps.
   - For problems with power-law singularities, specific functions describing such singularities, like $r^\lambda \sin(\lambda (\theta-\omega))$, where $\lambda$ and $\omega$ depend on the geometry, can be added to the output.
2. **Partial Interpretability**:
   - ReCoNNs not only approximate singular solutions but also describe the singular behavior explicitly through the network output.
   - For example, for problems whose derivatives are discontinuous across interfaces, the difference between the derivatives on the two sides can be computed directly from the network parameters; for power-law singularities, the network can approximate the intensity factor of the singularity.
3. **Extensibility**:
   - The proposed method extends readily to higher-dimensional problems.

In summary, the paper addresses the difficulties that traditional neural networks face when solving PDEs with low regularity solutions by introducing the ReCoNNs architecture, and it demonstrates the advantages of this architecture over classical ones in approximating such solutions.
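
The two architectural ingredients described under Solutions can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The helper names (`smooth_part`, `reconn_1d`, `gradient_jump`, `corner_singularity`), the fixed tanh weights, the interface location `x0`, and the use of a single `c * |x - x0|` term with a constant coefficient are all assumptions made for illustration; the paper may combine absolute value functions differently. The sketch shows that adding `c * |x - x0|` to a smooth network output produces a gradient jump of `2c` at `x0`, which can be read off directly from the parameter `c`, and it evaluates the singular function $r^\lambda \sin(\lambda(\theta-\omega))$ in polar coordinates.

```python
import numpy as np


def smooth_part(x, w, b, a):
    """Tiny one-hidden-layer tanh network: sum_k a_k * tanh(w_k * x + b_k).

    Hypothetical stand-in for the smooth part of the architecture.
    x is a 1-D array of points; w, b, a are 1-D arrays of equal length.
    """
    return np.tanh(np.outer(x, w) + b) @ a


def reconn_1d(x, params, x0):
    """Smooth network output plus c * |x - x0| (assumed form of the jump term).

    The absolute value term is continuous but its derivative jumps
    from -c to +c at x0, so the gradient jump is 2 * c.
    """
    w, b, a, c = params
    return smooth_part(x, w, b, a) + c * np.abs(x - x0)


def gradient_jump(u, x0, h=1e-4):
    """Estimate u'(x0+) - u'(x0-) with one-sided finite differences."""
    left, mid, right = u(np.array([x0 - h, x0, x0 + h]))
    return (right - mid) / h - (mid - left) / h


def corner_singularity(x, y, lam, omega):
    """Evaluate r**lam * sin(lam * (theta - omega)) about the origin.

    Assumption: theta is taken from arctan2, i.e. in (-pi, pi]; a real
    geometry may need a different branch of the angle.
    """
    r = np.hypot(x, y)
    theta = np.arctan2(y, x)
    return r**lam * np.sin(lam * (theta - omega))


# Illustrative, hand-picked parameters (not trained, not from the paper).
params = (
    np.array([1.0, -2.0, 0.5]),   # w
    np.array([0.1, 0.3, -0.2]),   # b
    np.array([0.7, -0.4, 1.1]),   # a
    0.25,                         # c, coefficient of |x - x0|
)
x0 = 0.5

jump = gradient_jump(lambda x: reconn_1d(x, params, x0), x0)
print("estimated gradient jump:", jump, "| 2 * c =", 2 * params[3])

# Singular function at the point (0, 1) for lam = 2/3, omega = 0.
print("singular term:", corner_singularity(0.0, 1.0, 2.0 / 3.0, 0.0))
```

Because the jump is carried entirely by the explicit `|x - x0|` term, the smooth part never has to reproduce a kink, and the size of the jump can be read from a single parameter. This is the sense in which such an architecture is partially interpretable.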