Abstract: This paper introduces a \textit{Process-Guided Learning (Pril)} framework that integrates physical models with recurrent neural networks (RNNs) to enhance the prediction of dissolved oxygen (DO) concentrations in lakes, which is crucial for sustaining water quality and ecosystem health. Unlike traditional RNNs, which may deliver high accuracy but often lack physical consistency and broad applicability, the \textit{Pril} method incorporates differential DO equations for each lake layer, modeling it as a first-order linear solution using a forward Euler scheme with a daily timestep. However, this method is sensitive to numerical instabilities. When drastic fluctuations occur, the numerical integration is neither mass-conservative nor stable. Especially during stratified conditions, exogenous fluxes into each layer cause significant within-day changes in DO concentrations. To address this challenge, we further propose an \textit{Adaptive Process-Guided Learning (April)} model, which dynamically adjusts timesteps from daily to sub-daily intervals with the aim of mitigating the discrepancies caused by variations in entrainment fluxes. \textit{April} uses a generator-discriminator architecture to identify days with significant DO fluctuations and employs a multi-step Euler scheme with sub-daily timesteps to effectively manage these variations. We have tested our methods on a wide range of lakes in the Midwestern USA, and demonstrated robust capability in predicting DO concentrations even with limited training data. While primarily focused on aquatic ecosystems, this approach is broadly applicable to diverse scientific and engineering disciplines that utilize process-based models, such as power engineering, climate science, and biomedicine.
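
The stability issue described in the abstract can be illustrated on a toy problem. The sketch below is not the paper's DO model: it assumes a single first-order linear equation dC/dt = -k (C - c_eq), where the rate `k`, the equilibrium value `c_eq`, and all function names are hypothetical. For this equation, forward Euler with step h is stable only when k·h ≤ 2, so a daily step (h = 1) diverges once k exceeds 2 per day, while splitting the day into sub-steps restores a decaying solution. The `choose_substeps` rule is a simple step-size heuristic standing in for April's learned generator-discriminator, which the abstract does not specify in detail.

```python
import math

def euler_day(c, k, c_eq, n_sub=1):
    """Advance dC/dt = -k * (C - c_eq) by one day with n_sub forward Euler sub-steps."""
    h = 1.0 / n_sub  # sub-step length in days
    for _ in range(n_sub):
        c = c + h * (-k * (c - c_eq))
    return c

def simulate(c0, k, c_eq, days, n_sub=1):
    """Return the daily trajectory [C_0, C_1, ..., C_days]."""
    traj = [c0]
    for _ in range(days):
        traj.append(euler_day(traj[-1], k, c_eq, n_sub))
    return traj

def choose_substeps(k, max_kh=0.5):
    """Hypothetical rule: pick enough sub-steps that k * h <= max_kh.

    This replaces April's learned discriminator with a fixed heuristic.
    """
    return max(1, math.ceil(k / max_kh))

# Fast exchange (k = 2.5 per day): the daily step violates k * h <= 2.
daily = simulate(c0=10.0, k=2.5, c_eq=8.0, days=5, n_sub=1)
sub_daily = simulate(c0=10.0, k=2.5, c_eq=8.0, days=5, n_sub=24)
adaptive = simulate(c0=10.0, k=2.5, c_eq=8.0, days=5, n_sub=choose_substeps(2.5))

print("daily:    ", daily)      # oscillates with growing amplitude, goes negative
print("sub-daily:", sub_daily)  # decays smoothly toward c_eq
print("adaptive: ", adaptive)   # also decays, using fewer sub-steps
```

With the daily step, each update multiplies the deviation from `c_eq` by (1 - k) = -1.5, so the iterate overshoots further every day and eventually produces a negative concentration. Sub-stepping keeps the per-step multiplier between 0 and 1, which is the behavior a sub-daily Euler scheme is meant to recover.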

What problem does this paper attempt to address?

This paper aims to improve both the accuracy and the physical consistency of dissolved oxygen (DO) concentration predictions in lakes. Purely data-driven models such as recurrent neural networks (RNNs) can deliver high-precision predictions, but they often lack physical consistency and broad applicability. Physics-based process models follow basic principles such as mass conservation, but they are prone to numerical instability in difficult regimes, such as drastic fluctuations under stratified lake conditions. To address both issues, the authors introduce a new framework, Process-Guided Learning (Pril), and then extend it to Adaptive Process-Guided Learning (April). Both methods combine the strengths of physical models and machine-learning models to improve DO prediction in lakes.

### Main problems and solutions

1. **Physical consistency and wide applicability**:
   - **Pril model**: By incorporating the DO differential equation for each lake layer into the loss function, it pushes predictions to conform to known physical relationships (such as mass conservation), which improves their physical consistency and generalization ability.
   - **April model**: Building on Pril, it adds an adaptive timestep mechanism to handle drastic fluctuations under stratified conditions. A generator-discriminator architecture identifies days with significant fluctuations, and a multi-step Euler scheme then simulates those days with finer, sub-daily timesteps.
2. **Numerical stability**:
   - Numerical integration with the forward Euler method can become neither mass-conservative nor stable when drastic changes occur. This is especially true under summer stratification in deep lakes, where the volume of the lower layer changes rapidly and DO concentrations fluctuate strongly within a short time.
   - The April model mitigates the discrepancies caused by variations in entrainment fluxes by dynamically adjusting the timestep from daily to sub-daily, with the aim of keeping the numerical calculation stable and mass-conservative.
3. **Data sparsity and generalization ability**:
   - Adding physical constraints reduces the need for large amounts of training data, so the model can maintain high prediction performance even when data are scarce.
   - In the experiments, the authors apply these methods to multiple lakes in the Midwestern USA and report that they retain strong predictive ability with limited training data while remaining sensitive to subtle changes.

In conclusion, by introducing the Pril and April models, this paper addresses the insufficient physical consistency, numerical instability, and data sparsity that affect traditional approaches to predicting lake DO concentrations, and it offers a more reliable tool for monitoring and managing aquatic ecosystems.
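
The idea of putting a process equation into the loss function can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' loss: it assumes a single layer, a known net DO flux per day (`flux`, in concentration units per day), a daily forward Euler update `pred[t+1] ≈ pred[t] + flux[t]`, and a hypothetical weight `lam` on the physics term. The function name and its arguments are likewise invented for the example.

```python
def process_guided_loss(pred, obs, flux, lam=1.0):
    """Supervised error plus a penalty for violating a daily Euler mass balance.

    pred: predicted DO for each day
    obs:  observed DO for each day, or None where no observation exists
    flux: assumed net DO flux for each day (concentration units per day)
    lam:  hypothetical weight on the physics term
    """
    # Data term: mean squared error over the days that have observations.
    errs = [(p - o) ** 2 for p, o in zip(pred, obs) if o is not None]
    data_loss = sum(errs) / len(errs) if errs else 0.0

    # Physics term: mean squared residual of the forward Euler update
    # with a one-day step, evaluated on every day, observed or not.
    res = [(pred[t + 1] - (pred[t] + flux[t])) ** 2 for t in range(len(pred) - 1)]
    phys_loss = sum(res) / len(res) if res else 0.0

    return data_loss + lam * phys_loss

obs = [8.0, None, 7.0]          # the middle day is unobserved
flux = [-0.5, -0.5, -0.5]       # steady oxygen loss of 0.5 per day

consistent = process_guided_loss([8.0, 7.5, 7.0], obs, flux)
inconsistent = process_guided_loss([8.0, 9.0, 7.0], obs, flux)
print(consistent, inconsistent)
```

Both prediction sequences match the two available observations exactly, so the data term alone cannot tell them apart. Only the physics term penalizes the second sequence, whose unobserved middle day jumps in a way the assumed flux cannot explain. This is how such a constraint can help when observations are sparse.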

Adaptive Process-Guided Learning: An Application in Predicting Lake DO Concentrations

Evaluating a process‐guided deep learning approach for predicting dissolved oxygen in streams

Predicting lake surface water phosphorus dynamics using process-guided machine learning

Artificial Neural Network Modelling of Concentrations of Nitrogen, Phosphorus and Dissolved Oxygen in a Non‐point Source Polluted River in Zhejiang Province, Southeast China

Multi‐Model Machine Learning Approach Accurately Predicts Lake Dissolved Oxygen With Multiple Environmental Inputs

Enhanced predictive modeling of dissolved oxygen concentrations in riverine systems using novel hybrid temporal pattern attention deep neural networks

PID4LaTe: a physics-informed deep learning model for lake multi-depth temperature prediction

From Hydrometeorology to River Water Quality: Can a Deep Learning Model Predict Dissolved Oxygen at the Continental Scale?

A deep learning interpretable model for river dissolved oxygen multi-step and interval prediction based on multi-source data fusion

Physics-Guided Machine Learning for Scientific Discovery: An Application in Simulating Lake Temperature Profiles

A hybrid XGBoost-ISSA-LSTM model for accurate short-term and long-term dissolved oxygen prediction in ponds

Physics-guided spatio–temporal neural network for predicting dissolved oxygen concentration in rivers

Large-Scale Prediction of Stream Water Quality Using an Interpretable Deep Learning Approach

Predicting abrupt depletion of dissolved oxygen in Chaohu lake using CNN-BiLSTM with improved attention mechanism

Water Quality Prediction Based on Hybrid Deep Learning Algorithm

Interpretable prediction, classification and regulation of water quality: A case study of Poyang Lake, China

A process-driven deep learning hydrological model for daily rainfall-runoff simulation

Prediction of riverine daily minimum dissolved oxygen concentrations using hybrid deep learning and routine hydrometeorological data

Estimation of the Biogeochemical and Physical Properties of Lakes Based on Remote Sensing and Artificial Intelligence Applications

Differentiable, learnable, regionalized process-based models with physical outputs can approach state-of-the-art hydrologic prediction accuracy

Dissolved oxygen prediction using regularized extreme learning machine with clustering mechanism in a black bass aquaculture pond