Modeling the optimal dosage of coagulants in water treatment plants using various machine learning models
Mohammed Achite, Saeed Farzin, Nehal Elshaboury, Mahdi Valikhan Anaraki, Mohammed Amamra, Abderrezak Kamel Toubal
DOI: https://doi.org/10.1007/s10668-022-02835-0
2022-12-29
Abstract: The jar test is one of the main methods for determining coagulant dosage (CD). However, it is expensive and time-consuming, and it requires laboratory equipment. Machine learning, especially hybrid machine learning, is therefore well suited to estimating the CD without a jar test. To this end, a hybrid model combining the M5 model tree with the gorilla troops optimizer (GTO) algorithm was introduced as the M5-GTO model. Nine parameters were used as inputs for CD modeling: raw water production (RWP), water turbidity, conductivity, TDS, salinity, pH, water temperature (WT), SM, and O2. The proposed model was compared with multiple linear regression, multiple nonlinear regression, an artificial neural network, multivariate adaptive regression splines, the M5 model tree, k-nearest neighbor, the least-squares support vector machine (LSSVM), a general regression neural network, and random forest (RF), and it proved more accurate than all of them in CD modeling. The M5-GTO also reproduced the distribution of the observed data well. For example, the mean absolute error, root mean square error (RMSE), relative RMSE, normalized RMSE, and correlation coefficient for the M5-GTO were 0.562, 1.172, 0.257, 0.170, and 0.967, respectively, which are up to 73%, 54.9%, 54.8%, 55%, and 4% better than the corresponding values for the LSSVM (the worst-performing algorithm). The M5 and RF algorithms ranked second and third, respectively. The partial dependence plots showed that the RWP and WT had the most significant effects on the CD: increasing the RWP reduced the CD, while increasing the WT increased it. The algorithm introduced in the present study has high potential for modeling various parameters in water treatment plants.
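The evaluation criteria named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. The abstract does not define relative RMSE or normalized RMSE, so the sketch assumes two common conventions: RMSE divided by the mean of the observations, and RMSE divided by the range of the observations. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def mae(obs, pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def relative_rmse(obs, pred):
    """ASSUMPTION: RMSE divided by the mean of the observations."""
    return rmse(obs, pred) / (sum(obs) / len(obs))


def normalized_rmse(obs, pred):
    """ASSUMPTION: RMSE divided by the observed range (max - min)."""
    return rmse(obs, pred) / (max(obs) - min(obs))


def pearson_r(obs, pred):
    """Pearson correlation coefficient between observations and predictions."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    ss_o = sum((o - mean_o) ** 2 for o in obs)
    ss_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(ss_o * ss_p)


if __name__ == "__main__":
    # Made-up coagulant dosages for illustration only (not data from the study).
    observed = [10.0, 12.0, 14.0, 16.0]
    predicted = [11.0, 12.0, 13.0, 18.0]
    for name, fn in [("MAE", mae), ("RMSE", rmse),
                     ("relative RMSE", relative_rmse),
                     ("normalized RMSE", normalized_rmse),
                     ("R", pearson_r)]:
        print(f"{name}: {fn(observed, predicted):.3f}")
```

Lower values are better for the four error criteria, while a correlation coefficient closer to 1 is better, which is why the reported gain for the correlation coefficient is much smaller in percentage terms than the gains for the error criteria.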