Architectural Strategies for the optimization of Physics-Informed Neural Networks

Hemanth Saratchandran, Shin-Fang Chng, Simon Lucey
2024-02-05
Abstract: Physics-informed neural networks (PINNs) offer a promising avenue for tackling both forward and inverse problems in partial differential equations (PDEs) by incorporating deep learning with fundamental physics principles. Despite their remarkable empirical success, PINNs have garnered a reputation for their notorious training challenges across a spectrum of PDEs. In this work, we delve into the intricacies of PINN optimization from a neural architecture perspective. Leveraging the Neural Tangent Kernel (NTK), our study reveals that Gaussian activations surpass several alternate activations when it comes to effectively training PINNs. Building on insights from numerical linear algebra, we introduce a preconditioned neural architecture, showcasing how such tailored architectures enhance the optimization process. Our theoretical findings are substantiated through rigorous validation against established PDEs within the scientific literature.
Machine Learning
What problem does this paper attempt to address?
The paper addresses the training difficulties that arise when Physics-Informed Neural Networks (PINNs) are used to solve partial differential equations (PDEs), in particular unstable training on problems with high-frequency solutions. It analyzes PINN optimization from the perspective of neural architecture and makes two main contributions:

1. It shows that networks with Gaussian activation functions train more effectively as PINNs, and attributes this to a better lower bound on the minimum eigenvalue of their Neural Tangent Kernel (NTK). This gives a theoretical basis for the advantage of Gaussian activations, which the experiments further support.
2. It proposes a PINN architecture called Equilibrated PINNs, which applies matrix preconditioning from numerical linear algebra to improve the condition number of the network's weight matrices, making training with gradient-based optimizers more efficient. According to the reported experiments, this architecture outperforms existing PINN methods on various benchmark PDEs.

The paper also compares activation functions and optimization strategies, including wavelet and sine activations and preconditioning techniques. It discusses spectral bias in PINN training and how adjusting the network architecture can improve the smoothness of the loss function and training efficiency. The experiments indicate that Gaussian activations and the preconditioned architecture are advantageous for solving complex PDE problems.
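The two quantities this summary refers to, the minimum NTK eigenvalue and the condition number of a weight matrix, can be made concrete with a small NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation: the kernel below is the plain empirical NTK of a one-hidden-layer regression network (the PINN kernel would also involve the PDE residual operator), the Gaussian activation is taken as exp(-(z/s)^2) with an assumed width `s`, and row equilibration is assumed as one standard preconditioner because the summary does not say which one the paper uses. All function names are hypothetical.

```python
import numpy as np

def gaussian(z, s=1.0):
    # Gaussian activation; the width s is an assumed hyperparameter.
    return np.exp(-(z / s) ** 2)

def gaussian_grad(z, s=1.0):
    # Derivative of the Gaussian activation with respect to z.
    return -2.0 * z / s**2 * np.exp(-(z / s) ** 2)

def ntk_min_eig(x, act, act_grad, width=256, seed=0):
    """Smallest eigenvalue of the empirical NTK of
    f(x) = a . act(W x + b) / sqrt(width) at random initialization."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    W = rng.standard_normal((width, d))
    b = rng.standard_normal(width)
    a = rng.standard_normal(width)
    z = x @ W.T + b                              # pre-activations, (n, width)
    J_a = act(z) / np.sqrt(width)                # df/da, (n, width)
    J_b = a * act_grad(z) / np.sqrt(width)       # df/db, (n, width)
    # df/dW[j, k] = J_b[:, j] * x[:, k], flattened to (n, width * d)
    J_W = (J_b[:, :, None] * x[:, None, :]).reshape(n, -1)
    J = np.concatenate([J_a, J_b, J_W], axis=1)  # full Jacobian
    K = J @ J.T                                  # empirical NTK, (n, n)
    return np.linalg.eigvalsh(K)[0]              # eigenvalues are ascending

def row_equilibrate(W, eps=1e-12):
    # Scale every row to unit Euclidean norm (row equilibration).
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    return W / np.maximum(norms, eps)

# Minimum NTK eigenvalue for two activations on 20 points in [-1, 1].
x = np.linspace(-1.0, 1.0, 20).reshape(-1, 1)
lam_gauss = ntk_min_eig(x, gaussian, gaussian_grad)
lam_tanh = ntk_min_eig(x, np.tanh, lambda z: 1.0 - np.tanh(z) ** 2)
print("min NTK eigenvalue, Gaussian:", lam_gauss)
print("min NTK eigenvalue, tanh:    ", lam_tanh)

# Synthetic badly scaled matrix: orthogonal rows times scales 1e-3 .. 1e3.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))
W_bad = np.diag(np.logspace(-3, 3, 8)) @ Q
cond_before = np.linalg.cond(W_bad)                  # about 1e6
cond_after = np.linalg.cond(row_equilibrate(W_bad))  # about 1
print("condition number before:", cond_before)
print("condition number after: ", cond_after)
```

The test matrix is constructed so that row scaling is the only source of ill-conditioning, which makes the effect easy to see; on a general weight matrix, row equilibration usually lowers the condition number but is not guaranteed to make it optimal. The NTK values depend on the random seed, the network width, and the sample points, so a single run should not be read as confirming or refuting the paper's theoretical bound.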