Auto-tuning data-driven model for biogas yield prediction from anaerobic digestion of sewage sludge at the South-Tehran wastewater treatment plant: Feature selection and hyperparameter population-based optimization

Farzad Farzin, Shabnam Sadri Moghaddam, Majid Ehteshami
DOI: https://doi.org/10.1016/j.renene.2024.120554
IF: 8.7
2024-05-03
Renewable Energy
Abstract: In this study, two data-driven models, artificial neural networks and support vector regression (SVR), have been trained and optimized to predict the biogas yield from anaerobic digesters at the South-Tehran municipal wastewater treatment plant. The auto-tuning approach, including feature selection and hyperparameter population-based optimization, was applied through the genetic algorithm (GA) and particle swarm optimization to improve the training models' performance and help them obtain the best input parameters. The Shapley Additive Explanations (SHAP) analysis was also done to interpret models effectively and assign credit for a model's prediction to each feature. The findings demonstrated that biogas prediction using SVR-GA achieved the highest accuracy, with R² values of 0.725 and 0.773, and RMSE values (regarding normalized datasets) of 0.477 and 0.492 for the train and test, respectively, while requiring the least computational time compared to other models. The auto-tuning technique, by removing the less important inputs, was able to show that temperature, pH, effluent and influent dry solids, effluent volatile solids (VS), and influent VS of waste sludge were the best input parameters for optimal biogas production modeling. The SHAP analysis revealed that VS_eff and temperature were two of the most important features affecting biogas production, exhibiting an inverse impact.
energy & fuels, green & sustainable science & technology
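
The auto-tuning idea in the abstract (one population-based search that picks input features and SVR hyperparameters together) can be illustrated with a minimal genetic-algorithm sketch. This is not the authors' implementation. The chromosome layout, the search bounds, the GA settings, and the `toy_fitness` function are all assumptions made for illustration; `toy_fitness` is a hypothetical stand-in so the sketch runs with the standard library only. In a real pipeline the fitness would more plausibly be a cross-validated score of an SVR trained on the masked inputs.

```python
import random

N_FEATURES = 8                  # hypothetical number of candidate inputs
LOG_C_RANGE = (-1.0, 3.0)       # assumed search bounds for log10(C)
LOG_GAMMA_RANGE = (-4.0, 1.0)   # assumed search bounds for log10(gamma)


def random_individual(rng):
    """Chromosome = binary feature mask + two log-scaled SVR hyperparameters."""
    mask = [rng.random() < 0.5 for _ in range(N_FEATURES)]
    if not any(mask):                         # keep at least one input
        mask[rng.randrange(N_FEATURES)] = True
    return (mask, rng.uniform(*LOG_C_RANGE), rng.uniform(*LOG_GAMMA_RANGE))


def toy_fitness(ind):
    """Hypothetical stand-in score (higher is better).

    It rewards three arbitrarily chosen 'useful' inputs, penalizes extras,
    and prefers log10(C) near 1 and log10(gamma) near -2. With scikit-learn
    one could instead return something like
    cross_val_score(SVR(C=10**log_c, gamma=10**log_gamma),
                    X[:, mask], y, scoring="r2").mean().
    """
    mask, log_c, log_gamma = ind
    useful = {0, 2, 5}
    hits = sum(1 for i in useful if mask[i])
    extras = sum(1 for i in range(N_FEATURES) if mask[i] and i not in useful)
    return hits - 0.5 * extras - (log_c - 1.0) ** 2 - (log_gamma + 2.0) ** 2


def tournament(pop, scores, rng, k=3):
    """Pick the best of k randomly drawn individuals."""
    contenders = rng.sample(range(len(pop)), k)
    return pop[max(contenders, key=lambda i: scores[i])]


def crossover(a, b, rng):
    """Uniform crossover on the mask, random blend on the real-valued genes."""
    mask = [x if rng.random() < 0.5 else y for x, y in zip(a[0], b[0])]
    w = rng.random()
    return (mask, w * a[1] + (1 - w) * b[1], w * a[2] + (1 - w) * b[2])


def clip(value, bounds):
    return min(max(value, bounds[0]), bounds[1])


def mutate(ind, rng, p_flip=0.1, sigma=0.3):
    """Flip mask bits with probability p_flip; add Gaussian noise to the reals."""
    mask = [(not bit) if rng.random() < p_flip else bit for bit in ind[0]]
    if not any(mask):
        mask[rng.randrange(N_FEATURES)] = True
    log_c = clip(ind[1] + rng.gauss(0, sigma), LOG_C_RANGE)
    log_gamma = clip(ind[2] + rng.gauss(0, sigma), LOG_GAMMA_RANGE)
    return (mask, log_c, log_gamma)


def run_ga(fitness, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    history = []                              # best score per generation
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        best_i = max(range(pop_size), key=lambda i: scores[i])
        history.append(scores[best_i])
        children = [pop[best_i]]              # elitism: carry the best over unchanged
        while len(children) < pop_size:
            child = crossover(tournament(pop, scores, rng),
                              tournament(pop, scores, rng), rng)
            children.append(mutate(child, rng))
        pop = children
    scores = [fitness(ind) for ind in pop]
    best_i = max(range(pop_size), key=lambda i: scores[i])
    history.append(scores[best_i])
    return pop[best_i], scores[best_i], history


best, best_score, history = run_ga(toy_fitness)
selected = [i for i, bit in enumerate(best[0]) if bit]
print("selected feature indices:", selected)
print("C = %.3g, gamma = %.3g" % (10 ** best[1], 10 ** best[2]))
print("best fitness:", round(best_score, 4))
```

Encoding the mask and the hyperparameters in one chromosome lets the search trade off input selection against model settings in a single run. Because the best individual is carried over unchanged each generation, the recorded best score never decreases.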