Convolutional neural network based reduced order modeling for multiscale problems

Xuhan Zhang, Lijian Jiang
2024-06-24
Abstract: In this paper, we combine convolutional neural networks (CNNs) with reduced order modeling (ROM) for efficient simulations of multiscale problems. These problems are modeled by partial differential equations with high-dimensional random inputs. The proposed method, called CNN-based ROM, involves two separate CNNs: Basis CNNs and Coefficient CNNs (Coef CNNs), which correspond to the two main parts of ROM. The Basis CNNs learn input-specific basis functions from the snapshots of fine-scale solutions. An activation function, inspired by Galerkin projection, is utilized at the output layer to reconstruct fine-scale solutions from the basis functions. Numerical results show that the basis functions learned by the Basis CNNs resemble data, which helps to significantly reduce the number of basis functions. Moreover, CNN-based ROM is less sensitive to data fluctuation caused by numerical errors than traditional ROM. Since the tests of Basis CNNs still need the fine-scale stiffness matrix and load vector, they cannot be directly applied to nonlinear problems. The Coef CNNs can be applied to nonlinear problems and are designed to determine the coefficients for the linear combination of basis functions. In addition, two applications of CNN-based ROM are presented: predicting MsFEM basis functions within oversampling regions and building accurate surrogates for inverse problems.
Computational Engineering, Finance, and Science
What problem does this paper attempt to address?
### Problems the paper attempts to solve

This paper addresses the efficient simulation of multiscale problems. These problems are usually modeled by partial differential equations (PDEs) with high-dimensional random inputs. Traditional reduced-order modeling (ROM) methods have limitations on such problems, notably sensitivity to data fluctuations and low online computational efficiency. To overcome these limitations, the authors propose a reduced-order modeling method based on convolutional neural networks (CNNs), called CNN-based ROM.

### Main contributions

1. **Combining CNN and ROM**:
   - Two different CNNs are proposed: Basis CNNs and Coefficient CNNs, which correspond to the two main parts of ROM.
   - Basis CNNs learn input-specific basis functions from snapshots of fine-scale solutions and reconstruct the fine-scale solutions through an activation function at the output layer.
   - Coefficient CNNs determine the coefficients of the linear combination of basis functions and are suitable for nonlinear problems.
2. **Improving online computational efficiency**:
   - Basis CNNs can significantly reduce the number of basis functions, thereby improving online computational efficiency.
   - Coefficient CNNs further improve online computational efficiency, especially for nonlinear problems.
3. **Reducing sensitivity to data fluctuations**:
   - Compared with traditional ROM, CNN-based ROM is less sensitive to fluctuations caused by numerical errors.
4. **Application examples**:
   - The method is demonstrated in two applications: predicting MsFEM basis functions and constructing accurate surrogate models for inverse problems.
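
The reconstruction idea behind the contributions above, rebuilding a fine-scale solution from a few basis functions by Galerkin projection, can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' implementation: the function name `galerkin_reconstruct`, the 1D Poisson-type stiffness matrix, and the hand-built two-column basis are all assumptions standing in for the fine-scale system and the output of a trained Basis CNN.

```python
import numpy as np


def galerkin_reconstruct(P, A, F):
    """Project the fine-scale system A u = F onto span(P) and lift back.

    P : (n, N) basis matrix (hypothetical stand-in for a Basis CNN output)
    A : (n, n) fine-scale stiffness matrix
    F : (n,)   fine-scale load vector
    """
    A_N = P.T @ A @ P                 # reduced stiffness matrix, N x N
    F_N = P.T @ F                     # reduced load vector, length N
    u_N = np.linalg.solve(A_N, F_N)   # reduced coefficients
    return P @ u_N, u_N               # reconstructed fine-scale solution


# Toy fine-scale problem (assumption): tridiagonal 1D stiffness matrix.
n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
F = np.ones(n)
u_h = np.linalg.solve(A, F)           # reference fine-scale solution

# Two-column orthonormal basis whose span contains u_h, mimicking
# basis functions that "resemble the data".
rng = np.random.default_rng(0)
P, _ = np.linalg.qr(np.column_stack([u_h, rng.standard_normal(n)]))

u_hat, u_N = galerkin_reconstruct(P, A, F)
rel_err = np.linalg.norm(u_hat - u_h) / np.linalg.norm(u_h)
```

Because the toy stiffness matrix is symmetric positive definite and the basis is constructed so that its span contains `u_h`, the projection recovers the reference solution up to round-off with only two basis functions. A learned basis would generally only approximate this situation.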
### Mathematical models and methods

#### Problem setting

Consider the multiscale problem defined on the bounded domain \( S\subseteq\mathbb{R}^2\):

\[ \begin{cases} L(u(x);\kappa(x,\xi)) = f(u,x), & x\in S,\\ Du(x) = 0, & x\in\partial S, \end{cases} \]

where \( L\) is a nonlinear differential operator containing the random input \(\kappa(x,\xi)\), \(\kappa(x,\xi)\) is a random variable defined on the probability space \((\Omega,\mathcal{F},P)\), and \( f(u,x)\) is a nonlinear source term.

#### Basis CNNs

Basis CNNs learn basis functions \( P_N(K)\) from random input samples \( K\). To preserve spatial information, pooling layers are not used, and padding keeps the size of the feature map unchanged after each convolution. Finally, a 1×1 convolution layer adjusts the number of basis functions and fuses the features of different channels.

#### Activation functions

The Galerkin projection activation function \(\sigma_G\) is used in the output layer to reconstruct the high-fidelity solution \( u_h(K)\):

\[ \hat{u}_h(K)=P_N(K)u_N(K) \]

where \( u_N(K)\) is obtained by solving the algebraic equation:

\[ P_N(K)^T(A_h(K)P_N(K)u_N(K)-F_h(K)) = 0 \]

#### Loss function

The loss function is defined as follows:

\[ \text{Loss}_{\text{RB}}(\theta)=\frac{1}{M}\sum_{i = 1}^M\left[\|u_h(K_i)-\hat{u}_h(K_i)\|^2_{L^2}+\lambda_G\left(\text{cond}_F(A_N(K_i))\right)^2\right] \]

where \(\text{cond}_F(A)=\|A\|_F\|A^{-1}\|_F\) is Fro