Hybrid Iterative and Tree-Based Machine Learning Algorithms for Lake Water Level Forecasting

Elham Fijani, Khabat Khosravi
DOI: https://doi.org/10.1007/s11269-023-03613-x
IF: 4.426
2023-09-17
Water Resources Management
Abstract: Accurate forecasting of lake water level (WL) fluctuations is essential for the effective development and management of water resource systems. This study applies the Random Tree (RT) algorithm and the Iterative Classifier Optimizer (ICO), which is based on the Alternating Model Tree (AMT) as an iterative regressor, to forecast WL up to three months ahead for Lake Superior and Lake Michigan. To enhance the accuracy of these machine learning (ML) algorithms, each is combined with an ensemble algorithm, either Bagging (BA) or Additive Regression (AR), resulting in the BA-RT, BA-ICO, AR-RT, and AR-ICO models. The most effective inputs for WL forecasting are determined using a nonlinear input variable selection method called partial mutual information selection (PMIS), considering lagged WL values up to 24 months. Forecasting models for each lake are developed using a training subset spanning 1918 to 1988. The models' parameters are tuned using a validation subset covering 1989 to 2003. Finally, model performance is evaluated using a testing subset from 2004 to 2018. Statistical metrics and visual analysis of the testing data are used to validate the performance of the developed algorithms. Additionally, results from Seasonal Autoregressive Integrated Moving Average (SARIMA) time series models serve as benchmarks for comparison with the ML results. The findings demonstrate that the ML models outperform the SARIMA models in terms of error: RMSPE ranges from 3.9% to 11.3% for Lake Michigan and from 2.3% to 9.2% for Lake Superior. Furthermore, both hybrid ensemble algorithms improve the performance of the individual ML algorithms; however, the BA algorithm achieves better overall performance than the AR algorithm. As a novel approach to forecasting problems, the ICO algorithm based on AMT shows great potential for generating accurate multistep forecasts of lake WL. It generalizes better and shows lower variance than the RT model.
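The data preparation described in the abstract (lagged WL inputs, a chronological train/validation/test split, and a percentage error metric) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names are hypothetical, the lag set passed in is a placeholder rather than the output of PMIS, and RMSPE is assumed here to be the root mean square percentage error, which the abstract does not define.

```python
import math


def make_lagged(series, lags, horizon=1):
    """Build (index, inputs, target) rows for an h-step-ahead forecast.

    For a target at position i, the newest value available at forecast
    time is series[i - horizon]; lag k then refers to
    series[i - horizon - (k - 1)]. The lag set is a placeholder; the
    paper selects lags with PMIS, which is not implemented here.
    """
    max_back = horizon + max(lags) - 1
    rows = []
    for i in range(max_back, len(series)):
        inputs = [series[i - horizon - (k - 1)] for k in lags]
        rows.append((i, inputs, series[i]))
    return rows


def split_by_year(rows, years, train_end=1988, val_end=2003):
    """Chronological split using the periods stated in the abstract.

    years[i] is the calendar year of observation i.
    """
    train = [r for r in rows if years[r[0]] <= train_end]
    val = [r for r in rows if train_end < years[r[0]] <= val_end]
    test = [r for r in rows if years[r[0]] > val_end]
    return train, val, test


def rmspe(observed, predicted):
    """Root mean square percentage error, in percent (assumed definition)."""
    squared = [((o - p) / o) ** 2 for o, p in zip(observed, predicted)]
    return 100.0 * math.sqrt(sum(squared) / len(squared))
```

Splitting by calendar year rather than at random keeps the test period strictly after the training period, which matches the three consecutive subsets the abstract describes.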
water resources, engineering, civil