Abstract: The paper proposes a deep learning method designed specifically for the forward and inverse problems of variable coefficient partial differential equations: the Variable Coefficient Physical Information Neural Network (VC-PINN). The shortcut connections (ResNet structure) introduced into the network alleviate the "vanishing gradient" problem and unify the treatment of linear and nonlinear coefficients. The method is applied to four equations: the variable coefficient Sine-Gordon equation (vSG), the generalized variable coefficient Kadomtsev-Petviashvili equation (gvKP), the variable coefficient Korteweg-de Vries equation (vKdV), and the variable coefficient Sawada-Kotera equation (vSK). Numerical results show that VC-PINN succeeds in the cases of high dimensionality, various types of variable coefficients (polynomial, trigonometric, fractional, and oscillation-attenuation coefficients), and the coexistence of multiple variable coefficients. We also analyze VC-PINN in depth by combining theory and numerical experiments, covering four aspects: the necessity of ResNet, the relationship between the convexity of variable coefficients and learning, anti-noise analysis, and the unity of forward and inverse problems together with the relationship to the standard PINN.
What problem does this paper attempt to address?
### Problems the paper attempts to solve
This paper proposes a deep-learning method designed specifically for the forward and inverse problems of partial differential equations (PDEs) with variable coefficients: the Variable Coefficient Physical Information Neural Network (VC-PINN). Specifically, the paper addresses the following problems:
1. **Handling of PDEs with variable coefficients**:
- The standard Physics-Informed Neural Network (PINN) has difficulty with PDEs with variable coefficients, especially when the coefficients depend on fewer independent variables than the equation itself. For example, in (1+1)-dimensional equations, variable coefficients that depend only on the time variable \(t\) are common, but the standard PINN cannot handle this situation effectively.
- The paper proposes a new framework, VC-PINN, which addresses this by adding a branch network to approximate the variable coefficients and by introducing the Residual Network (ResNet) structure to unify linear and nonlinear coefficients.
2. **Handling of high dimensionality and multiple variable coefficients**:
- The paper tests VC-PINN in high-dimensional settings, with various types of variable coefficients (polynomial, trigonometric, fractional, oscillatory-decay), and with multiple variable coefficients coexisting, and verifies its effectiveness under these conditions.
3. **Combination of theory and numerical experiments**:
- The paper conducts an in-depth analysis of VC-PINN from four aspects: the necessity of the ResNet structure, the relationship between the convexity of variable coefficients and learning, anti-noise analysis, and the unification of forward and inverse problems together with the relationship to the standard PINN.
- Theoretical derivation and numerical experiments are combined to verify the effectiveness and robustness of VC-PINN.
4. **Challenges in practical applications**:
- In engineering applications, knowing the full expressions of the variable coefficients is a demanding requirement. The paper therefore discusses forward and inverse problems in the discrete sense and explores how to use VC-PINN when the variable coefficients are only partially known or unknown.
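The two-network idea from item 1 (one network for the solution, a branch network for the coefficient, both with shortcut connections) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the layer widths, the `tanh` activation, the placement of the shortcuts, and all function names (`dense`, `residual_block`, `init_network`, `forward`) are hypothetical. The last few lines show one intuition for why shortcuts help with linear coefficients: when a block's weights are zero, the block reduces to the identity, so a linear coefficient such as \(c(t) = t\) passes through unchanged.

```python
import math
import random


def dense(weights, biases, inputs):
    """Affine layer: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def residual_block(weights, biases, inputs):
    """Shortcut connection: output = input + tanh(W @ input + b)."""
    hidden = dense(weights, biases, inputs)
    return [x + math.tanh(h) for x, h in zip(inputs, hidden)]


def init_layer(n_out, n_in, rng, scale=0.5):
    """Random weights and zero biases (hypothetical initialisation)."""
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def init_network(n_in, width, n_blocks, n_out, rng):
    """Linear lift -> `n_blocks` residual blocks -> linear readout."""
    return {
        "lift": init_layer(width, n_in, rng),
        "blocks": [init_layer(width, width, rng) for _ in range(n_blocks)],
        "readout": init_layer(n_out, width, rng),
    }


def forward(params, inputs):
    """Evaluate the network at a single input point."""
    h = dense(*params["lift"], inputs)
    for weights, biases in params["blocks"]:
        h = residual_block(weights, biases, h)
    return dense(*params["readout"], h)


rng = random.Random(0)
# Assumed layout: the solution network maps (x, t) -> u,
# the branch (coefficient) network maps t -> c(t).
solution_net = init_network(n_in=2, width=8, n_blocks=3, n_out=1, rng=rng)
coefficient_net = init_network(n_in=1, width=8, n_blocks=3, n_out=1, rng=rng)

u_value = forward(solution_net, [0.3, 0.7])[0]
c_value = forward(coefficient_net, [0.7])[0]

# With zero block weights every residual block is the identity map,
# so this width-1 network reproduces the linear coefficient c(t) = t.
linear_net = {
    "lift": ([[1.0]], [0.0]),
    "blocks": [([[0.0]], [0.0])] * 3,
    "readout": ([[1.0]], [0.0]),
}
linear_value = forward(linear_net, [0.7])[0]
```

In a real setting both networks would be trained jointly with automatic differentiation; the plain-Python forward pass here only illustrates the structure.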
### Mathematical formulas
The key formulas in the paper are the general form of the PDE and the definition of the loss function:
1. **Form of PDE with variable coefficients**:
\[
u_t = N[u]\cdot C[t]^T, \quad x\in\Omega, \quad t\in [T_0, T_1]
\]
where \(u = u(x, t)\) is the real-valued solution of the equation, \(\Omega\) is a subset of \(\mathbb{R}^N\) (here \(N\) is the spatial dimension, not to be confused with the operator vector \(N[\cdot]\)), and the spatial vector is \(x=(x_1, x_2,\ldots, x_N)\). \(N[\cdot]\) is an operator vector whose components \(N_i\) are operators, typically (but not only) linear or nonlinear differential operators. \(C[t]=(c_1(t), c_2(t),\ldots)\) is a coefficient vector whose components \(c_i(t)\) are analytic functions of the time variable \(t\).
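As an illustration of this notation, consider a variable coefficient KdV-type equation of the assumed form \(u_t + f(t)\,u u_x + g(t)\,u_{xxx} = 0\). The exact equations used in the paper are not reproduced in this summary, so \(f\) and \(g\) are placeholder coefficients. Moving the spatial terms to the right-hand side gives the operator-vector form:
\[
u_t = -f(t)\,u u_x - g(t)\,u_{xxx} = N[u]\cdot C[t]^T, \quad N[u] = (u u_x,\; u_{xxx}), \quad C[t] = (-f(t),\; -g(t))
\]
Here \(N_1[u] = u u_x\) is a nonlinear differential operator and \(N_2[u] = u_{xxx}\) is a linear one, and each is paired with its own time-dependent coefficient.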
2. **Loss function**:
\[
\text{Loss}(\theta)=\text{Loss}_I(\theta)+\text{Loss}_b(\theta)+\text{Loss}_f(\theta)+\text{Loss}_c(\theta)
\]
where:
- Initial-value constraint loss:
\[
\text{Loss}_I(\theta)=\frac{1}{n_I}\sum_{i = 1}^{n_I}|\tilde{u}(x_i^I, T_0; \theta_u)-g_0(x_i^I)|^2
\]
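The initial-value term above is a mean squared error between the network prediction at \(t = T_0\) and the initial data \(g_0\). A minimal sketch, assuming one spatial dimension and using hypothetical stand-ins (`exact_model`, `shifted_model`, `g0`) in place of a trained network and real initial data:

```python
import math


def initial_loss(u_model, g0, xs, t0):
    """Mean squared mismatch between u_model(x, t0) and g0(x) over the points xs."""
    residuals = [u_model(x, t0) - g0(x) for x in xs]
    return sum(r * r for r in residuals) / len(xs)


def g0(x):
    """Hypothetical initial condition."""
    return math.sin(x)


def exact_model(x, t):
    """Stand-in for a network that matches g0 exactly at t = 0."""
    return math.sin(x + t)


def shifted_model(x, t):
    """Stand-in for a network that is off by a constant 0.1."""
    return math.sin(x + t) + 0.1


xs = [-1.0 + 0.25 * i for i in range(9)]   # n_I = 9 sample points
loss_exact = initial_loss(exact_model, g0, xs, 0.0)
loss_shifted = initial_loss(shifted_model, g0, xs, 0.0)
```

A model that reproduces the initial data has zero loss, while a constant offset of 0.1 yields a loss of about 0.01, the square of the offset.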
- Boundary constraint loss:
\[
\text{Loss}_b(\theta)=\frac{1}{n_b}\sum_{i = 1}^{n_b}