Improving LSTM Hydrological Modeling with Spatiotemporal Deep Learning and Multi-Task Learning: A Case Study of Three Mountainous Areas on the Tibetan Plateau

Bu Li, Ruidong Li, Ting Sun, Aofan Gong, Fuqiang Tian, Mohd Yawar Ali Khan, Guangheng Ni
DOI: https://doi.org/10.1016/j.jhydrol.2023.129401
IF: 6.4
2023-01-01
Journal of Hydrology
Abstract: Long short-term memory (LSTM) networks have demonstrated excellent capability in processing long temporal dynamics and have proven effective in precipitation-runoff modeling. However, current LSTM hydrological models lack multi-task learning and spatial information, which limits their ability to make full use of meteorological and hydrological data. To address this issue, this study proposes a spatiotemporal deep-learning (DL)-based hydrological model that couples a two-dimensional convolutional neural network (CNN) with an LSTM and introduces actual evaporation (Ea) as an additional training target. The proposed CNN-LSTM model is tested on three large mountainous basins on the Tibetan Plateau, and the results are compared to those obtained from the LSTM-only model. Additionally, a probe method is used to decipher the internal embedding layers of the proposed DL models. The results indicate that both the LSTM and CNN-LSTM hydrological models perform well in simulating runoff (Q) and Ea, with Nash-Sutcliffe efficiency coefficients (NSEs) higher than 0.82 and 0.95, respectively. The higher NSEs suggest that introducing spatial information into LSTM-only models can improve the overall and peak model performance. Moreover, multi-task simulation with LSTM-only models shows better accuracy in the estimation of Q volume and performance, with NSEs increasing by approximately 0.02. The probe method also reveals that the CNN can capture the basin-averaged meteorological values in CNN-LSTM models, while LSTM Q (Ea) models contain information about the known Ea (Q) process. Overall, this study demonstrates the value of spatial information and multi-task learning in LSTM hydrological modeling and provides a perspective for interpreting the internal embedding layers of DL models.
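The abstract reports model skill as Nash-Sutcliffe efficiency (NSE). As a minimal sketch of how that metric is conventionally computed, the snippet below uses the standard definition, NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2). The function name and the sample series are hypothetical and are not taken from the paper; the authors' own evaluation code may differ.

```python
def nash_sutcliffe_efficiency(observed, simulated):
    """Return the Nash-Sutcliffe efficiency of a simulated series.

    1.0 is a perfect match; 0.0 means the simulation is no better
    than always predicting the mean of the observations.
    """
    if len(observed) != len(simulated):
        raise ValueError("observed and simulated must have the same length")
    if len(observed) == 0:
        raise ValueError("series must not be empty")

    mean_obs = sum(observed) / len(observed)
    # Sum of squared errors between observation and simulation.
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Sum of squared deviations of the observations from their mean.
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1.0 - residual / variance


# Hypothetical runoff values, for illustration only (not data from the paper).
observed_q = [1.0, 2.0, 3.0, 4.0, 5.0]
simulated_q = [1.1, 1.9, 3.2, 3.8, 5.1]
nse_q = nash_sutcliffe_efficiency(observed_q, simulated_q)
print(f"NSE = {nse_q:.3f}")
```

Under this definition, the reported thresholds (NSE above 0.82 for Q and above 0.95 for Ea) mean the squared simulation error is less than 18% and 5%, respectively, of the variance of the observations.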