Prediction of the effluent chemical oxygen demand and volatile fatty acids for anaerobic treatment based on different feature selections machine-learning methods from lab-scale to pilot-scale
Gang Ye,Jinquan Wan,Yuwei Bai,Yan Wang,Bin Zhu,Zhifei Zhang,Zhicheng Deng
DOI: https://doi.org/10.1016/j.jclepro.2024.140679
IF: 11.1
2024-01-14
Journal of Cleaner Production
Abstract: The effluent chemical oxygen demand (COD) and volatile fatty acids (VFA) are important indicators for assessing anaerobic wastewater treatment systems. Through lab and pilot experiments on the anaerobic treatment of high-salt wastewater, the feature variables were classified according to different prediction requirements. Effluent-COD and VFA prediction models were established with five machine-learning methods: back propagation neural network (BPNN), BPNN optimized by a genetic algorithm (GA-BP), support vector machine (SVM), least squares support vector machine (LSSVM), and random forest (RF). The modeling results on three datasets (lab, pilot, and mixed) showed that the SVM and LSSVM methods, which are based on statistical learning, had the best prediction performance on the lab and pilot datasets, while the RF method performed well on the mixed dataset. Furthermore, a genetic algorithm was adopted to optimize the RF model. After optimization on the mixed dataset, the effluent-COD prediction (R² = 0.966, RMSE/MAE = 61.227/48.584 mg/L) and the VFA prediction (R² = 0.639, RMSE/MAE = 26.838/21.831 mg/L) both improved compared with the original RF model (effluent-COD: R² = 0.918, RMSE/MAE = 86.232/65.03 mg/L; VFA: R² = 0.539, RMSE/MAE = 35.489/26.726 mg/L). This research can provide a reference for predicting the anaerobic treatment of actual large-scale industrial wastewater.
environmental sciences,green & sustainable science & technology,engineering, environmental
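The abstract reports model quality as R², RMSE, and MAE. As a minimal sketch of how these three metrics are commonly computed, assuming the standard definitions (R² as the coefficient of determination, 1 - SS_res/SS_tot), the following may help when reading the reported values. The function name `regression_metrics` and the sample concentrations are illustrative only and are not taken from the paper, which may define R² differently (for example, as a squared correlation coefficient).

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE and MAE for paired observed/predicted values.

    Assumes the common textbook definitions; the paper's exact
    formulas are not given in the abstract.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)           # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares

    rmse = math.sqrt(ss_res / n)                      # same unit as the data
    mae = sum(abs(r) for r in residuals) / n          # same unit as the data
    # R^2 is undefined when all observed values are identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"r2": r2, "rmse": rmse, "mae": mae}


# Hypothetical effluent-COD values in mg/L, for illustration only.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

RMSE and MAE carry the unit of the target (mg/L in the abstract), so they are comparable only between models predicting the same quantity, whereas R² is dimensionless.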