Coastal water quality prediction based on machine learning with feature interpretation and spatio-temporal analysis

Luka Grbčić, Siniša Družeta, Goran Mauša, Tomislav Lipić, Darija Vukić Lušić, Marta Alvir, Ivana Lučin, Ante Sikirica, Davor Davidović, Vanja Travaš, Daniela Kalafatovic, Kristina Pikelj, Hana Fajković, Toni Holjević, Lado Kranjčević
DOI: https://doi.org/10.1016/j.envsoft.2022.105458
2022-07-17
Abstract: Coastal water quality management is a public health concern, as water of poor quality can potentially harbor dangerous pathogens. In this study, we employ routine monitoring data of Escherichia coli and enterococci across 15 beaches in the city of Rijeka, Croatia, to build machine learning models for predicting E. coli and enterococci based on environmental features. Cross-validation analysis showed that the CatBoost algorithm performed best among the evaluated algorithms, with R² values of 0.71 and 0.69 for predicting E. coli and enterococci, respectively. The SHapley Additive exPlanations (SHAP) technique showed that salinity is the most important feature for forecasting both E. coli and enterococci levels. Furthermore, for low water quality sites, the spatial predictive models achieved R² values of 0.85 and 0.83, while the temporal models achieved R² values of 0.74 and 0.67. The temporal model achieved moderate R² values of 0.44 and 0.46 at a site with high water quality.
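The cross-validated R² figures quoted in the abstract can be read with the following minimal sketch in mind. It is not the authors' pipeline: a one-feature least-squares line stands in for CatBoost, the data are synthetic, and the negative salinity trend, the variable names, and the helper functions (`r2_score`, `fit_line`, `cross_val_r2`) are assumptions made purely for illustration.

```python
import random

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def cross_val_r2(x, y, k=5, seed=0):
    """Return the R^2 obtained on each of k held-out folds."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint index subsets
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        # Fit on the training folds only, then score on the held-out fold.
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        preds = [a + b * x[i] for i in held_out]
        scores.append(r2_score([y[i] for i in held_out], preds))
    return scores

# Synthetic, illustrative data only (not the Rijeka monitoring data).
# Assumption: log-scaled bacterial counts fall linearly with salinity.
rng = random.Random(42)
salinity = [rng.uniform(20.0, 38.0) for _ in range(200)]
log_count = [4.0 - 0.08 * s + rng.gauss(0.0, 0.2) for s in salinity]

scores = cross_val_r2(salinity, log_count, k=5)
mean_r2 = sum(scores) / len(scores)
print(f"mean cross-validated R^2: {mean_r2:.2f}")
```

An R² of 1 means the held-out values are predicted exactly, while 0 means the model does no better than predicting the held-out mean, which gives a scale for values such as 0.71 or 0.44.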
environmental sciences; engineering, environmental; water resources; computer science, interdisciplinary applications