Adaptive Process-Guided Learning: An Application in Predicting Lake DO Concentrations

Runlong Yu, Chonghao Qiu, Robert Ladwig, Paul C. Hanson, Yiqun Xie, Yanhua Li, Xiaowei Jia
2024-11-20
Abstract: This paper introduces a Process-Guided Learning (Pril) framework that integrates physical models with recurrent neural networks (RNNs) to enhance the prediction of dissolved oxygen (DO) concentrations in lakes, which is crucial for sustaining water quality and ecosystem health. Unlike traditional RNNs, which may deliver high accuracy but often lack physical consistency and broad applicability, the Pril method incorporates differential DO equations for each lake layer, modeling it as a first-order linear solution using a forward Euler scheme with a daily timestep. However, this method is sensitive to numerical instabilities. When drastic fluctuations occur, the numerical integration is neither mass-conservative nor stable. Especially during stratified conditions, exogenous fluxes into each layer cause significant within-day changes in DO concentrations. To address this challenge, we further propose an Adaptive Process-Guided Learning (April) model, which dynamically adjusts timesteps from daily to sub-daily intervals with the aim of mitigating the discrepancies caused by variations in entrainment fluxes. April uses a generator-discriminator architecture to identify days with significant DO fluctuations and employs a multi-step Euler scheme with sub-daily timesteps to effectively manage these variations. We have tested our methods on a wide range of lakes in the Midwestern USA, and demonstrated robust capability in predicting DO concentrations even with limited training data. While primarily focused on aquatic ecosystems, this approach is broadly applicable to diverse scientific and engineering disciplines that utilize process-based models, such as power engineering, climate science, and biomedicine.
Machine Learning
What problem does this paper attempt to address?
This paper aims to improve both the accuracy and the physical consistency of dissolved oxygen (DO) concentration predictions for lakes. Traditional machine-learning models such as recurrent neural networks (RNNs) can deliver high-precision predictions, but they often lack physical consistency and broad applicability. Traditional physics-based process models follow basic principles such as mass conservation, but they are prone to numerical instability in complex situations, for example during drastic fluctuations under stratified conditions. To address these problems, the authors introduce a new framework, Process-Guided Learning (Pril), and then propose Adaptive Process-Guided Learning (April). Both methods aim to improve DO prediction by combining the strengths of physical models and machine-learning models.

### Main problems and solutions

1. **Physical consistency and wide applicability**:
   - **Pril model**: By incorporating the DO differential equation for each lake layer into the loss function, Pril keeps its predictions consistent with known physical relationships (such as mass conservation), which improves both physical consistency and generalization.
   - **April model**: Building on Pril, April adds an adaptive timestep mechanism to handle the drastic fluctuations that can occur under stratified conditions. A generator-discriminator architecture identifies days with significant DO fluctuations, and a multi-step Euler scheme then simulates those days with finer, sub-daily timesteps.
2. **Numerical stability**:
   - Traditional numerical methods (such as the forward Euler method) can become unstable when the system changes drastically; the integration is then neither mass-conservative nor stable. This is especially true under summer stratification in deep lakes, where the volume of the lower water layer changes rapidly and DO concentrations fluctuate strongly within a short time.
   - The April model mitigates the discrepancies caused by variations in entrainment fluxes by dynamically adjusting the timestep from daily to sub-daily, which keeps the numerical integration stable and mass-conservative.
3. **Data sparsity and generalization ability**:
   - Physical constraints reduce the need for large amounts of training data, so the model can maintain high prediction performance even when data are scarce.
   - In the experiments, the authors apply these methods to multiple lakes in the Midwestern United States and show that they retain strong predictive ability with limited training data and remain sensitive to subtle changes in DO concentration.

In conclusion, by introducing the Pril and April models, this paper addresses the insufficient physical consistency, numerical instability, and data sparsity that affect traditional methods for predicting lake DO concentrations, and it provides more reliable tools for monitoring and managing aquatic ecosystems.
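
The difference between a daily and a sub-daily forward Euler step, as described above, can be illustrated with a toy example. The sketch below is not the paper's model: it assumes a single-layer linear ODE `dc/dt = -k * c + f`, and it replaces April's learned generator-discriminator with a hypothetical threshold rule on `k`. The names `euler_day`, `simulate`, `threshold`, and `n_sub`, as well as all numeric values, are illustrative assumptions.

```python
def euler_day(c, k, f, n_sub=1):
    """Advance concentration c by one day using n_sub forward Euler substeps.

    Toy ODE (an assumption, not the paper's equations): dc/dt = -k * c + f,
    with k in 1/day and f in mg/L/day.
    """
    h = 1.0 / n_sub  # substep length in days
    for _ in range(n_sub):
        c = c + h * (-k * c + f)
    return c


def simulate(c0, ks, f, threshold=1.0, n_sub=24):
    """Step through one k value per day, refining the step on 'fast' days.

    A day is refined into n_sub substeps when k > threshold, i.e. when a
    single daily Euler step would overshoot the equilibrium f / k.
    This rule is a hypothetical stand-in for a learned detector.
    """
    c = c0
    trajectory = []
    for k in ks:
        steps = n_sub if k > threshold else 1
        c = euler_day(c, k, f, steps)
        trajectory.append(c)
    return trajectory


# Two calm days, two days with a fast sink, then a calm day.
ks = [0.1, 0.1, 2.5, 2.5, 0.1]

# An infinite threshold disables refinement: every day uses one daily step.
fixed = simulate(10.0, ks, f=0.5, threshold=float("inf"))
adaptive = simulate(10.0, ks, f=0.5)

print("fixed daily step:", fixed)
print("adaptive substeps:", adaptive)
```

With the fixed daily step, the two fast days push the concentration below zero and then make it oscillate, which is physically impossible for a concentration. With sub-daily steps on those days, the trajectory stays positive and relaxes toward the equilibrium `f / k`.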