Improving Discharge Predictions in Ungauged Basins: Harnessing the Power of Disaggregated Data Modeling and Machine Learning

Aggrey Muhebwa, Colin J. Gleason, Dongmei Feng, Jay Taneja
DOI: https://doi.org/10.1029/2024wr037122
IF: 5.4
2024-09-20
Water Resources Research
Abstract: Current machine learning methods for discharge prediction often employ aggregated basin‐wide hydrometeorological data (lumped modeling) for parametric and non‐parametric training. This approach may overlook the spatial heterogeneity of river systems and their impact on discharge patterns. We hypothesize that integrating spatiotemporal hydrologic knowledge into the data modeling process (distributed/disaggregated modeling) can improve the performance of discharge prediction models. To test this hypothesis, we designed experiments comparing the performance of identical Long Short‐Term Memory Recurrent Neural Network (LSTM‐RNN) models forced with either lumped or distributed features. We gather meteorological forcing and static attributes for the Mackenzie basin in Canada, a large and unique basin. Importantly, discharge performance is assessed out‐of‐sample with k‐fold replication across gauges. Training LSTMs with disaggregated data significantly improved model accuracy. Specifically, there was a 9.6% increase in the mean Nash‐Sutcliffe Efficiency and a 4.6% increase in the mean Kling‐Gupta Efficiency, indicating a better agreement between predicted and actual observations in terms of mean, variability, and correlation. These experiments and results demonstrate the importance of integrating topologically guided geomorphologic and hydrologic information (distributed modeling) in data‐driven discharge predictions.
environmental sciences, water resources, limnology
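
The two skill scores named in the abstract can be illustrated with a minimal Python sketch. It assumes the standard Nash‐Sutcliffe definition and the original Kling‐Gupta formulation built from correlation, variability ratio, and mean ratio; the abstract does not state which variant the authors used. The function names and the sample discharge values are hypothetical and are not taken from the paper.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 minus squared error over observed variance."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def kge(obs, sim):
    """Kling-Gupta Efficiency (assumed original formulation).

    Combines correlation (r), variability ratio (alpha), and mean ratio (beta).
    """
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    std_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim) / n)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)) / n
    r = cov / (std_obs * std_sim)   # linear correlation
    alpha = std_sim / std_obs       # ratio of standard deviations
    beta = mean_sim / mean_obs      # ratio of means
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Made-up discharge values for illustration only (not data from the paper).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
simulated = [1.1, 1.9, 3.2, 3.8, 5.3]
print("NSE:", nse(observed, simulated))
print("KGE:", kge(observed, simulated))
```

Both scores equal 1 for a perfect simulation. An NSE of 0 means the model is no better than predicting the observed mean, which is why gains in the mean score across held-out gauges are a meaningful measure of improvement.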