Machine learning surrogates for efficient hydrologic modeling: Insights from stochastic simulations of managed aquifer recharge

Timothy Dai, Kate Maher, Zach Perzan

2024-07-30

Abstract: Process-based hydrologic models are invaluable tools for understanding the terrestrial water cycle and addressing modern water resources problems. However, many hydrologic models are computationally expensive and, depending on the resolution and scale, simulations can take on the order of hours to days to complete. While techniques such as uncertainty quantification and optimization have become valuable tools for supporting management decisions, these analyses typically require hundreds of model simulations, which are too computationally expensive to perform with a process-based hydrologic model. To address this gap, we propose a hybrid modeling workflow in which a process-based model is used to generate an initial set of simulations and a machine learning (ML) surrogate model is then trained to perform the remaining simulations required for downstream analysis. As a case study, we apply this workflow to simulations of variably saturated groundwater flow at a prospective managed aquifer recharge (MAR) site. We compare the accuracy and computational efficiency of several ML architectures, including deep convolutional networks, recurrent neural networks, vision transformers, and networks with Fourier transforms. Our results demonstrate that ML surrogate models can achieve under 10% mean absolute percentage error and yield order-of-magnitude runtime savings over process-based models. We also offer practical recommendations for training hydrologic surrogate models, including implementing data normalization to improve accuracy, using a normalized loss function to improve training stability, and downsampling input features to decrease memory requirements.
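The abstract names an error metric (mean absolute percentage error) and three training recommendations without giving formulas. The following is a minimal NumPy sketch of one common form of each, not the authors' implementation. It assumes z-score normalization, a relative L2 loss as the "normalized loss function", and block averaging as the downsampling method. The paper may use different variants, and all function names here are hypothetical.

```python
import numpy as np

def mape(y_true, y_pred, eps=1e-12):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs(y_pred - y_true) / (np.abs(y_true) + eps))

def zscore_fit(x):
    """Mean and standard deviation, to be computed on training data only."""
    x = np.asarray(x, dtype=float)
    std = x.std()
    return x.mean(), (std if std > 0 else 1.0)

def zscore_apply(x, mean, std):
    """Normalize data with previously fitted statistics (assumed z-score form)."""
    return (np.asarray(x, dtype=float) - mean) / std

def relative_l2_loss(y_true, y_pred, eps=1e-12):
    """L2 error divided by the L2 norm of the target, so the loss is scale-free."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.linalg.norm(y_pred - y_true) / (np.linalg.norm(y_true) + eps)

def downsample_mean(field, factor):
    """Block-average a 2D field by an integer factor.

    Trailing rows/columns that do not fill a complete block are dropped.
    """
    field = np.asarray(field, dtype=float)
    ny, nx = field.shape
    ny2, nx2 = ny // factor, nx // factor
    trimmed = field[: ny2 * factor, : nx2 * factor]
    return trimmed.reshape(ny2, factor, nx2, factor).mean(axis=(1, 3))
```

For example, `downsample_mean(field, 2)` reduces a 4x4 input to 2x2, a fourfold reduction in the memory needed per input feature.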

Geophysics, Machine Learning

What problem does this paper attempt to address?

The paper addresses the computational cost of process-based hydrological modeling, particularly in water resource management applications such as managed aquifer recharge (MAR). Process-based hydrological models can accurately simulate the terrestrial water cycle, but at high resolutions and large scales a single simulation may take hours or even days. That cost becomes prohibitive for analyses that require hundreds of simulations, such as uncertainty quantification, parameter estimation, and optimization.

To solve this problem, the researchers propose a hybrid modeling workflow that combines the accuracy of process-based models with the computational efficiency of machine learning models:

1. **Initial Stage**: Use a process-based hydrological model (e.g., ParFlow-CLM) to generate an initial set of simulations.
2. **Training Stage**: Use this dataset to train a machine learning surrogate model.
3. **Application Stage**: Once training is complete, use the surrogate model to quickly generate the remaining simulations needed for downstream analysis, such as uncertainty quantification or resource optimization.

As a case study, the authors applied this hybrid workflow to simulate changes in saturated zone storage at a MAR site in Tulare County, California. By comparing different machine learning architectures (deep convolutional networks, recurrent neural networks, vision transformers, and networks with Fourier transforms), they demonstrated that surrogate models can achieve less than 10% mean absolute percentage error while running orders of magnitude faster than process-based models.

Additionally, the study provides practical recommendations for training hydrological surrogate models: normalize the data to improve accuracy, use a normalized loss function to improve training stability, and downsample high-dimensional input features to reduce memory requirements. These recommendations are directly useful to practitioners who want to use machine learning to accelerate hydrological simulations.
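The three-stage workflow described above can be illustrated with a toy example. This is only a sketch under loud assumptions: `process_model` is a hypothetical, cheap stand-in for an expensive simulator (it is not ParFlow-CLM), the surrogate is a least-squares fit on hand-picked features rather than any of the neural architectures the paper compares, and the sample sizes are arbitrary.

```python
import numpy as np

def process_model(params):
    """Hypothetical stand-in for an expensive process-based simulator.

    Maps two input parameters to one scalar output. Not ParFlow-CLM.
    """
    k, r = params[:, 0], params[:, 1]
    return 5.0 + 2.0 * k + 3.0 * r + 0.5 * k * r

def features(params):
    """Hand-picked features for the toy surrogate (an assumption of this sketch)."""
    k, r = params[:, 0], params[:, 1]
    return np.column_stack([np.ones(len(params)), k, r, k * r])

rng = np.random.default_rng(0)

# Stage 1: run the process-based model on an initial set of parameter samples.
train_params = rng.uniform(0.0, 1.0, size=(50, 2))
train_out = process_model(train_params)

# Stage 2: train the surrogate (least squares here, standing in for an ML model).
coef, *_ = np.linalg.lstsq(features(train_params), train_out, rcond=None)

def surrogate(params):
    """Fast approximation of process_model."""
    return features(params) @ coef

# Stage 3: use the surrogate for the remaining simulations.
new_params = rng.uniform(0.0, 1.0, size=(1000, 2))
pred = surrogate(new_params)

# Validation against the stand-in simulator. This is cheap only because the
# simulator is a toy; with a real model one would hold out a small test set.
truth = process_model(new_params)
mape_percent = 100.0 * np.mean(np.abs(pred - truth) / np.abs(truth))
print(f"surrogate MAPE: {mape_percent:.2e} %")
```

The error is near zero here only because the toy simulator lies exactly in the span of the chosen features. A real surrogate would show nonzero error, and the paper reports under 10% MAPE for its case study.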
