Explainable artificial intelligence for the interpretation of ensemble learning performance in algal bloom estimation

Jungsu Park, Byeongchan Seong, Yeonjeong Park, Woo Hyoung Lee, Tae‐Young Heo
DOI: https://doi.org/10.1002/wer.11140
2024-10-11
Water Environment Research
Abstract: XAI quantifies the effects of environmental factors on algal bloom prediction models. Selecting high‐importance variables through XAI analysis yields stable modeling performance without using all variables. Chlorophyll‐a (Chl‐a) concentrations, a key indicator of algal blooms, were estimated using the XGBoost machine learning model with 23 variables, including water quality and meteorological factors. The model performance was evaluated using three indices: root mean square error (RMSE), RMSE‐observation standard deviation ratio (RSR), and Nash–Sutcliffe efficiency. Nine datasets were created by averaging 1‐hour data to cover time frequencies ranging from 1 hour to 1 month. Model performance remained stable for the datasets with relatively high observation frequencies (1–24 h), with an RSR ranging between 0.61 and 0.65. However, the model's performance declined significantly for datasets with weekly and monthly intervals. Shapley value (SHAP) analysis, an explainable artificial intelligence method, was further applied to provide a quantitative understanding of how environmental factors in the watershed affect the model's performance and to enhance the practical applicability of the model in the field. The number of input variables for model construction was increased sequentially from 1 to 23, starting with the variable with the highest SHAP value and ending with the variable with the lowest. The model's performance plateaued once five or more variables were included, demonstrating that stable performance could be achieved using only a small number of variables, including relatively easily measured data collected by real‐time sensors, such as pH, dissolved oxygen, and turbidity. This result highlights the practicality of employing machine learning models and real‐time sensor‐based measurements for effective on‐site water quality management.
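The three evaluation indices named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names are hypothetical, and the sketch assumes the common convention in which the observation standard deviation uses the same divisor (n) as the RMSE, so that NSE = 1 - RSR^2. The paper itself may define the standard deviation differently.

```python
import math


def rmse(obs, sim):
    """Root mean square error between observed and simulated values."""
    n = len(obs)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / n)


def rsr(obs, sim):
    """RMSE divided by the standard deviation of the observations.

    Assumption: the standard deviation uses divisor n (population form),
    matching the divisor used in rmse().
    """
    mean_obs = sum(obs) / len(obs)
    sd_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / len(obs))
    return rmse(obs, sim) / sd_obs


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of the squared-error sum
    to the sum of squared deviations of observations from their mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - s) ** 2 for o, s in zip(obs, sim))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


# Illustrative values only, not data from the paper.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
perfect = list(observed)   # a model that reproduces every observation
baseline = [3.0] * 5       # a model that always predicts the observed mean
```

Under this convention a perfect model gives RMSE = 0, RSR = 0, and NSE = 1, while a model that always predicts the observed mean gives RSR = 1 and NSE = 0, so lower RSR and higher NSE both indicate better agreement with the observations.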
Practitioner Points: XAI quantifies the effects of environmental factors on algal bloom prediction models. The effects of input variable frequency and seasonality were analyzed using XAI. XAI analysis on key variables ensures cost‐effective model development.
environmental sciences; engineering, environmental; water resources; limnology