Simulating wastewater treatment plants for heavy metals using machine learning models

Marwan Kheimi, Mohammad A. Almadani, Mohammad Zounemat-Kermani
DOI: https://doi.org/10.1007/s12517-022-10736-9
2022-08-26
Arabian Journal of Geosciences
Abstract: To achieve better prediction accuracy and robustness, three ensemble machine learning approaches (bagging, boosting, and XGBoost) are developed and appraised for predicting effluent heavy metals at wastewater treatment plants. Nine potential independent influent parameters were considered for predicting the dissolved concentrations of Cr, Cu, and Zn. The predicted heavy metal effluent values were evaluated by (i) statistical measures including deviance criteria (RMSE and MAE), a bias criterion (MBE), and efficiency criteria (PCC and NSE), (ii) analysis of residuals, and (iii) the paired t-test at α = 0.05. The outcomes of the feature selection method suggest that turbidity, flow rate, and hexane extractable material are the factors that mainly affect the predictive results. In contrast, influent qualitative factors such as BOD, TSS, pH, and temperature (T) do not have any impact on effluent heavy metal parameters. The ensemble XGBoost model outperformed the other machine learning models as well as the statistical multiple linear regression (MLR) model for all three target parameters (average RMSE improvement = 25% for Cr, 24% for Cu, and 14% for Zn). Although the bagging technique performed better than the traditional MLR model, this ensemble method did not enhance the performance of the base model. The outcome suggests that a bagging approach does not necessarily yield more accurate performance than the base model. However, the bagging technique gave more resilient and robust predicted values. Overall, in terms of accuracy and robustness, it is preferable to select an appropriate boosting model (here, XGBoost) for dealing with complicated aquatic quality parameters.
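The five statistical measures named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `evaluation_metrics`, the sample values, and the sign convention for MBE (predicted minus observed, so a positive value means overprediction) are assumptions made for illustration, and the paper may define the terms slightly differently.

```python
import math


def evaluation_metrics(observed, predicted):
    """Return RMSE, MAE, MBE, PCC, and NSE for paired observed/predicted values.

    Hypothetical helper for illustration; MBE is taken as mean(predicted - observed).
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal length")

    errors = [p - o for o, p in zip(observed, predicted)]
    sq_err = sum(e * e for e in errors)

    # Deviance criteria
    rmse = math.sqrt(sq_err / n)
    mae = sum(abs(e) for e in errors) / n

    # Bias criterion (assumed sign: positive means overprediction)
    mbe = sum(errors) / n

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))

    # Efficiency criteria: Pearson correlation and Nash-Sutcliffe efficiency
    # (both undefined when the observed series is constant)
    pcc = cov / math.sqrt(ss_o * ss_p)
    nse = 1.0 - sq_err / ss_o

    return {"RMSE": rmse, "MAE": mae, "MBE": mbe, "PCC": pcc, "NSE": nse}


if __name__ == "__main__":
    # Made-up effluent concentrations, for demonstration only
    obs = [1.0, 2.0, 3.0]
    pred = [2.0, 3.0, 4.0]
    print(evaluation_metrics(obs, pred))
```

In this toy case every prediction is offset by one unit, so PCC is 1 while NSE is negative, which shows why the abstract reports correlation alongside error-based and bias-based criteria rather than relying on one measure.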
geosciences, multidisciplinary