Floodplain Lake Water Level Prediction with Strong River-Lake Interaction Using the Ensemble Learning LightGBM

Min Gan, Xijun Lai, Yan Guo, Yongping Chen, Shunqi Pan, Yinghao Zhang
DOI: https://doi.org/10.1007/s11269-024-03915-8
IF: 4.426
2024-01-01
Water Resources Management
Abstract: A LightGBM-based water level prediction model was developed for floodplain lakes. The RMSE values of the model’s one-day-ahead prediction range from 0.09 to 0.10 m. The ranking of the driving factors of water level change in Poyang Lake was identified. Timely and accurate prediction of water levels is crucial for managing floodplain lakes that provide important ecosystem services, especially for flood prevention. Floodplain lakes are hydrologically complex and highly variable systems. Their water levels respond nonlinearly to external disturbances, which complicates prediction. Machine learning methods, especially ensemble learning, are well suited to such nonlinear problems. Based on the Light Gradient Boosting Machine (LightGBM) method, a water level prediction model was developed for Poyang Lake, the largest freshwater lake in China, which features seasonal flooding and drying under strong river-lake interaction. Daily lake water levels at typical stations and discharges of the Yangtze River and the five main inflowing rivers from 2003 to 2019 were collected to train and test the proposed model. The combination of input variables strongly affects model performance. Past water levels in Poyang Lake improve the prediction accuracy significantly. Compared with predictions that do not use past water level information, the root mean square error (RMSE) of one-day-ahead predictions that do use it decreases from 0.49–0.62 m to 0.09–0.10 m. The proposed model with optimized input variables can therefore be an effective tool for predicting the water levels of floodplain lakes. Note that the prediction accuracy decreases as the prediction period lengthens.
Importance rank analysis shows that the water levels at different typical stations of Poyang Lake respond differently to the Yangtze River and the five main inflowing rivers, providing new insights into river-lake interaction. It also implies that regulation of such a large lake with a complex flow regime should be planned deliberately.
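The role of past water levels as model inputs, and the RMSE used to score the predictions, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `make_lagged_samples` and `rmse`, the parameters `n_lags` and `lead`, and the choice to use only same-day discharges are assumptions made here for illustration, and the abstract does not state which lags or input combinations were actually used.

```python
import math


def make_lagged_samples(levels, discharges, n_lags=1, lead=1):
    """Build (features, target) pairs for lead-day-ahead prediction.

    levels:     daily lake water levels (m), one value per day.
    discharges: one list of river discharges per day, same length as levels.
    Features for day t: the n_lags most recent water levels (ending at day t)
    followed by the discharges observed on day t.
    Target: the water level on day t + lead.
    All names and defaults here are hypothetical, chosen for illustration.
    """
    if len(levels) != len(discharges):
        raise ValueError("levels and discharges must cover the same days")
    features, targets = [], []
    for t in range(n_lags - 1, len(levels) - lead):
        past_levels = levels[t - n_lags + 1 : t + 1]
        features.append(list(past_levels) + list(discharges[t]))
        targets.append(levels[t + lead])
    return features, targets


def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

The resulting feature rows and targets could then be passed to a gradient boosting regressor, for example `lightgbm.LGBMRegressor().fit(features, targets)` if the `lightgbm` package is installed. Dropping the lagged water level columns from the features gives the "without past water level information" configuration that the abstract compares against.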