Use of Multifidelity Training Data and Transfer Learning for Efficient Construction of Subsurface Flow Surrogate Models

Su Jiang, Louis J. Durlofsky

DOI: https://doi.org/10.1016/j.jcp.2022.111800

2022-04-24

Abstract: Data assimilation presents computational challenges because many high-fidelity models must be simulated. Various deep-learning-based surrogate modeling techniques have been developed to reduce the simulation costs associated with these applications. However, to construct data-driven surrogate models, several thousand high-fidelity simulation runs may be required to provide training samples, and these computations can make training prohibitively expensive. To address this issue, in this work we present a framework where most of the training simulations are performed on coarsened geomodels. These models are constructed using a flow-based upscaling method. The framework entails the use of a transfer-learning procedure, incorporated within an existing recurrent residual U-Net architecture, in which network training is accomplished in three steps. In the first step, where the bulk of the training is performed, only low-fidelity simulation results are used. The second and third steps, in which the output layer is trained and the overall network is fine-tuned, require a relatively small number of high-fidelity simulations. Here we use 2500 low-fidelity runs and 200 high-fidelity runs, which leads to about a 90% reduction in training simulation costs. The method is applied for two-phase subsurface flow in 3D channelized systems, with flow driven by wells. The surrogate model trained with multifidelity data is shown to be nearly as accurate as a reference surrogate trained with only high-fidelity data in predicting dynamic pressure and saturation fields in new geomodels. Importantly, the network provides results that are significantly more accurate than the low-fidelity simulations used for most of the training. The multifidelity surrogate is also applied for history matching using an ensemble-based procedure, where accuracy relative to reference results is again demonstrated.
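
The cost figure quoted in the abstract can be illustrated with simple arithmetic. The sketch below measures cost in units of one high-fidelity run. The run counts 2500 and 200 come from the abstract; the size of the all-high-fidelity reference training set (`n_hf_reference`) and the relative cost of a low-fidelity run (`lf_cost_ratio`) are not stated here, so the values used are placeholders chosen only for illustration, and the function name is hypothetical.

```python
def training_cost_reduction(n_lf, n_hf, n_hf_reference, lf_cost_ratio):
    """Fractional saving in simulation cost versus an all-high-fidelity training set.

    All costs are expressed in units of one high-fidelity run.
    """
    multifidelity_cost = n_lf * lf_cost_ratio + n_hf
    return 1.0 - multifidelity_cost / n_hf_reference


# 2500 and 200 are the run counts quoted in the abstract. The other two
# arguments are ASSUMED placeholders, not values taken from the source.
reduction = training_cost_reduction(
    n_lf=2500, n_hf=200, n_hf_reference=2500, lf_cost_ratio=0.02
)
print(f"{reduction:.0%}")
```

Under these assumed inputs the saving comes out at roughly 0.9. The high-fidelity runs dominate the multifidelity cost whenever `lf_cost_ratio` is small, which is why the saving is sensitive mainly to how many high-fidelity runs are retained.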

Machine Learning, Geophysics

What problem does this paper attempt to address?

The paper addresses the computational cost of data assimilation (history matching) in subsurface flow simulation, which requires a large number of high-fidelity model simulations. A variety of deep-learning-based surrogate modeling techniques have been developed to reduce this cost. However, constructing a data-driven surrogate model usually requires thousands of high-fidelity simulation runs to provide training samples, which makes the training itself very expensive.

To address this, the paper proposes a framework in which most of the training simulations are carried out on coarsened, low-fidelity geological models. These models are constructed with a flow-based upscaling method. The framework incorporates a transfer-learning procedure within an existing recurrent residual U-Net architecture, and training is divided into three steps:

1. **Step 1**: The main training phase, using only low-fidelity simulation results.
2. **Step 2**: Training the output layer, using a small number of high-fidelity simulation results.
3. **Step 3**: Fine-tuning the entire network, further using a small number of high-fidelity simulation results.

With this method, the authors significantly reduce the number of high-fidelity simulations required for training while maintaining high accuracy, thereby reducing the computational cost. Specifically, using 2,500 low-fidelity runs and 200 high-fidelity runs gives about a 90% reduction in training simulation cost. The method is applied to two-phase subsurface flow in a three-dimensional channelized system and demonstrates its accuracy in predicting dynamic pressure and saturation fields in new geological models.

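
The three training steps can be sketched on a toy problem. The code below is a minimal illustration, not the authors' implementation: the "network" is a single tanh feature followed by a linear output layer rather than a recurrent residual U-Net, the low- and high-fidelity targets are synthetic functions, and the learning rates, epoch counts, and all function names are hypothetical choices. Only the structure (train everything on low-fidelity data, then train the output layer alone on high-fidelity data, then fine-tune everything) follows the description.

```python
import math
import random


def forward(p, x):
    """Stand-in network: one tanh feature (a, b) feeding a linear output layer (c, d)."""
    h = math.tanh(p["a"] * x + p["b"])
    return h, p["c"] * h + p["d"]


def mse(p, data):
    """Mean squared error of the stand-in network on (x, y) pairs."""
    return sum((forward(p, x)[1] - y) ** 2 for x, y in data) / len(data)


def train(p, data, trainable, lr, epochs):
    """Full-batch gradient descent on the MSE, updating only the names in `trainable`."""
    p = dict(p)  # leave the caller's parameters untouched
    n = len(data)
    for _ in range(epochs):
        g = dict.fromkeys(p, 0.0)
        for x, y in data:
            h, y_hat = forward(p, x)
            err = 2.0 * (y_hat - y) / n
            dpre = err * p["c"] * (1.0 - h * h)  # gradient w.r.t. the tanh input
            g["a"] += dpre * x
            g["b"] += dpre
            g["c"] += err * h
            g["d"] += err
        for name in trainable:  # frozen parameters are simply skipped
            p[name] -= lr * g[name]
    return p


def three_step_training(seed=0):
    rng = random.Random(seed)
    xs_lf = [rng.uniform(-1.0, 1.0) for _ in range(2500)]
    xs_hf = [rng.uniform(-1.0, 1.0) for _ in range(200)]
    # ASSUMED synthetic targets: the low-fidelity response is a biased
    # version of the high-fidelity one.
    lf_data = [(x, 0.8 * math.tanh(1.5 * x)) for x in xs_lf]
    hf_data = [(x, math.tanh(1.5 * x) + 0.1) for x in xs_hf]

    init = {"a": 0.5, "b": 0.0, "c": 0.5, "d": 0.0}
    # Step 1: train all parameters on low-fidelity data only.
    step1 = train(init, lf_data, trainable=("a", "b", "c", "d"), lr=0.1, epochs=100)
    # Step 2: freeze the feature layer, train the output layer on high-fidelity data.
    step2 = train(step1, hf_data, trainable=("c", "d"), lr=0.1, epochs=200)
    # Step 3: fine-tune all parameters on high-fidelity data with a smaller step.
    step3 = train(step2, hf_data, trainable=("a", "b", "c", "d"), lr=0.02, epochs=200)
    return hf_data, step1, step2, step3


if __name__ == "__main__":
    hf_data, step1, step2, step3 = three_step_training()
    for name, p in (("step 1", step1), ("step 2", step2), ("step 3", step3)):
        print(name, "high-fidelity MSE:", mse(p, hf_data))
```

Passing the set of trainable parameter names explicitly keeps the freezing logic visible: in Step 2 the feature parameters `a` and `b` are left exactly as Step 1 produced them, and only the output layer adapts to the high-fidelity targets.
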
Compared with the reference surrogate model trained only with high-fidelity data, the surrogate trained with multifidelity data is nearly as accurate. More importantly, the results provided by the network are significantly more accurate than the low-fidelity simulation results used for most of the training. Finally, the multifidelity surrogate model is also applied to ensemble-based history matching, where its accuracy relative to the reference results is again demonstrated.

### Formula Summary

1. **Darcy Velocity Formula**:
\[ u_j = -\frac{k k_{rj}(S_j)}{\mu_j(p_j)} (\nabla p_j - \rho_j g \nabla z), \quad j = o, w \]
where \( k \) is the absolute permeability tensor, \( k_{rj} \) is the relative permeability, \( S_j \) is the saturation, \( \mu_j \) is the viscosity, \( p_j \) is the pressure, \( \rho_j \) is the density, \( g \) is the gravitational acceleration, and \( z \) is the depth.

2. **Well Flow Rate Formula**:
\[ (q^w_j)_i = \mathrm{WI}_i \left( \frac{k_{rj} \rho_j}{\mu_j} \right)_i (p_i - p_{w,i}) \]
where \( \mathrm{WI}_i \) is the well index, defined as:
\[ \mathrm{WI}_i = \frac{2 \pi k_i \Delta z}{\ln \left( \frac{r_0}{r_w} \right)} \]

3. **Loss Function**:
\[ \theta^* = \arg \min_{\theta} \left[ \frac{1}{n_{smp}} \frac{1}{n_t} \sum_{i=1}^{n_{smp}} \sum_{t=1}^{n_t} \| \hat{x}_{h,t}^i - x_{h,t}^i \|_2^2 + \lambda_w \frac{1}{n_{smp}} \frac{1}{n_t} \frac{1}{n_w} \sum_{i=1}^{n_{smp}} \sum_{t=1}^{n_t} \sum_{w=1}^{n_w} \| \hat{x}
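
The well index and well flow rate expressions in the formula summary can be evaluated directly. The sketch below is a minimal transcription of those two formulas for a single well block and a single phase. The function and argument names are hypothetical, consistent units are assumed throughout, and the sign follows the formula as written (positive when the block pressure exceeds the wellbore pressure).

```python
import math


def well_index(k, dz, r0, rw):
    """WI = 2*pi*k*dz / ln(r0 / rw) for one well block."""
    return 2.0 * math.pi * k * dz / math.log(r0 / rw)


def well_rate(wi, kr, rho, mu, p_block, p_well):
    """q = WI * (kr * rho / mu) * (p_block - p_well) for one phase in one well block."""
    return wi * (kr * rho / mu) * (p_block - p_well)


# Illustrative inputs only (ASSUMED, not taken from the source).
wi = well_index(k=1.0, dz=1.0, r0=0.1 * math.e, rw=0.1)
q = well_rate(wi, kr=0.5, rho=1.0, mu=1.0, p_block=10.0, p_well=4.0)
print(wi, q)
```

Because the radius ratio enters only through a logarithm, the well index is much less sensitive to `r0` and `rw` than to the block permeability `k` and thickness `dz`.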
