Fine-Tuning DeepONets to Enhance Physics-informed Neural Networks for solving Partial Differential Equations

Sidi Wu
2024-10-18
Abstract: Physics-Informed Neural Networks (PINNs) have emerged as powerful tools for solving partial differential equations (PDEs). However, training PINNs from scratch is often computationally intensive and time-consuming. To address this problem, we propose a parameter-efficient approach that fine-tunes pre-trained DeepONet models within the PINN framework (FTO-PINN), enabling more efficient meshless PDE solving. Specifically, we freeze the weights of the pre-trained DeepONet model and fine-tune the output of the branch net by incorporating a small number of new trainable parameters, which can be quickly determined using least-squares techniques. Additionally, we introduce trunk net expansions and low-rank adaptation strategies to further enhance the performance of FTO-PINN. The effectiveness of our proposed method is demonstrated through a series of numerical experiments across various types of PDEs. FTO-PINN significantly reduces the training time of vanilla PINNs while maintaining comparable accuracy, and outperforms DeepONet, which is pre-trained on general function data, in both fidelity and generalization capabilities.
Numerical Analysis
What problem does this paper attempt to address?
The paper aims to make physics-informed neural networks (PINNs) more efficient and accurate at solving partial differential equations (PDEs). It proposes FTO-PINN, a parameter-efficient method that fine-tunes a pre-trained DeepONet model within the PINN framework for mesh-free PDE solving. The weights of the pre-trained DeepONet are frozen, and the output of the branch network is fine-tuned through a small number of new trainable parameters, which reduces the computational resources and time needed compared with training a PINN from scratch. The paper also introduces trunk network expansion and low-rank adaptation strategies to further improve the performance of FTO-PINN.

### Main Contributions

1. **Improve Computational Efficiency**: Compared with standard PINNs, FTO-PINN significantly reduces training time while maintaining comparable accuracy.
2. **Enhance Fidelity and Generalization Ability**: Compared with a DeepONet pre-trained on general function data, FTO-PINN achieves better fidelity and generalization.
3. **Parameter-Efficient**: FTO-PINN applies to various types of PDEs, including linear, nonlinear, and interface problems, with high parameter efficiency.

### Method Overview

- **Fine-Tune Pre-trained DeepONet**: Freeze the weights of the pre-trained model and fine-tune the output of the branch network by introducing a small number of new trainable parameters.
- **Quickly Determine Trainable Parameters**: Use the least-squares method to determine the trainable parameters quickly, instead of gradient-based optimizers that rely on backpropagation (such as Adam).
- **Trunk Network Expansion**: Further improve the accuracy of FTO-PINN by expanding the trunk network and introducing trainable matrices in its linear layers.
- **Low-Rank Adaptation**: Further fine-tune the trunk network for a specific task so that it better captures the relevant features of the target PDE.

### Numerical Experiments

The paper verifies the effectiveness of FTO-PINN through a series of numerical experiments covering linear, nonlinear, and interface problems. The results show that FTO-PINN performs well across these PDE types, improving computational efficiency while maintaining high accuracy.

### Formula Presentation

- **Least-Squares Problem for Linear PDEs**:

\[
\min_{\alpha} \text{Loss}(\alpha) = \left\| \rho A \alpha - \rho b \right\|_2^2
\]

where

\[
A = \begin{bmatrix} A_1 \\ A_2 \end{bmatrix}, \quad
A_1 = \begin{bmatrix}
L(t_1(x_d^1)) & \cdots & L(t_I(x_d^1)) \\
\vdots & \ddots & \vdots \\
L(t_1(x_d^{N_D})) & \cdots & L(t_I(x_d^{N_D}))
\end{bmatrix}, \quad
A_2 = \begin{bmatrix}
B(t_1(x_b^1)) & \cdots & B(t_I(x_b^1)) \\
\vdots & \ddots & \vdots \\
B(t_1(x_b^{N_B})) & \cdots & B(t_I(x_b^{N_B}))
\end{bmatrix}
\]

\[
b = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}, \quad
b_1 = \begin{bmatrix} f(x_d^1) \\ \vdots \\ f(x_d^{N_D}) \end{bmatrix}, \quad
b_2 = \begin{bmatrix} h(x_b^1) \\ \vdots \\ h(x_b^{N_B}) \end{bmatrix}
\]
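
To make the least-squares step concrete, the following is a minimal NumPy sketch, not the paper's implementation. It rests on several assumptions that do not come from the source: the frozen basis functions \(t_i\) are stood in for by random `tanh` features rather than a pre-trained DeepONet trunk, the PDE is the 1D Poisson problem \(-u'' = f\) on \((0, 1)\) with zero Dirichlet data (so \(L = -d^2/dx^2\) and \(B\) is the identity), and \(\rho\) is read as a diagonal row-weighting. The names `trunk_basis`, `solve_fto_least_squares`, and `boundary_weight` are hypothetical.

```python
import numpy as np


def trunk_basis(x, w, c):
    """Hypothetical stand-in for frozen trunk outputs t_i(x) = tanh(w_i x + c_i).

    Returns the basis values and -t_i''(x), i.e. L(t_i) for L = -d^2/dx^2.
    """
    z = np.outer(x, w) + c                      # shape (N, I)
    t = np.tanh(z)
    neg_d2t = 2.0 * w**2 * t * (1.0 - t**2)     # -(d^2/dx^2) tanh(w x + c)
    return t, neg_d2t


def solve_fto_least_squares(n_basis=50, n_interior=100, boundary_weight=1.0, seed=0):
    rng = np.random.default_rng(seed)
    # Frozen (never trained) feature parameters; an assumption for illustration.
    w = rng.uniform(-4.0, 4.0, n_basis)
    c = rng.uniform(-4.0, 4.0, n_basis)

    x_d = np.linspace(0.0, 1.0, n_interior + 2)[1:-1]   # interior points x_d^n
    x_b = np.array([0.0, 1.0])                          # boundary points x_b^n

    _, A1 = trunk_basis(x_d, w, c)      # rows: L(t_i(x_d^n))
    A2, _ = trunk_basis(x_b, w, c)      # rows: B(t_i(x_b^n)), B = identity
    A = np.vstack([A1, A2])

    b1 = np.pi**2 * np.sin(np.pi * x_d)  # f(x) chosen so that u(x) = sin(pi x)
    b2 = np.zeros(2)                     # h(x) = 0 on the boundary
    b = np.concatenate([b1, b2])

    # rho treated as per-row weights (assumption): 1 inside, boundary_weight on the boundary.
    rho = np.concatenate([np.ones(n_interior), boundary_weight * np.ones(2)])
    alpha, *_ = np.linalg.lstsq(rho[:, None] * A, rho * b, rcond=None)

    def u(x):
        t, _ = trunk_basis(np.asarray(x, dtype=float), w, c)
        return t @ alpha

    return u, alpha


u, alpha = solve_fto_least_squares()
x_test = np.linspace(0.0, 1.0, 201)
max_err = np.max(np.abs(u(x_test) - np.sin(np.pi * x_test)))
print(f"max abs error vs. sin(pi x): {max_err:.2e}")
```

The only trained quantity here is the coefficient vector `alpha`, obtained from a single linear solve with no gradient-based iterations, which is the property the summary attributes to the least-squares step. For a nonlinear PDE the system would no longer be linear in `alpha`, so this sketch covers only the linear case shown in the formula.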