Enhancing AI model robustness in organic pollutant adsorption forecasting: Insights from feature analysis

Ana Carolina Ferreira Piazzi Fuhr, Matias Schadeck Netto, Yasmin Vieira, Guilherme Luiz Dotto, Nina Paula Gonçalves Salau
DOI: https://doi.org/10.1016/j.seppur.2024.130497
IF: 8.6
2024-11-24
Separation and Purification Technology
Abstract: Despite the significant increase in the use of artificial intelligence (AI) models in adsorption, there is still limited understanding of how the input variables affect the outputs of each model. In this study, we used data from the standard removal of two organic pollutants by pyrolyzed soybean hulls to demonstrate why analyzing the correlation between features is essential. The experiments were conducted at 298 K and pH 6, generating 270 experimental points. The input variables for the models were chosen after a correlation analysis that excluded highly correlated features. Three models, Gradient Boosting (GB), Artificial Neural Networks (ANN), and Adaptive Neuro-Fuzzy Inference System (ANFIS), were evaluated to determine which performed best in prediction and generalization. All models showed good statistical performance, with R² 0.9906 – 0.9991, MSE 0.0666 – 0.0099, and MAPE 4.7927 – 0.6939 %. Performance improved as model complexity increased, and ANFIS was the model that best predicted the data. The feature importance analysis showed that time and initial concentration are the essential variables. Thus, this study shows how the input information can alter the prediction and allow a model to surpass the capabilities of models fed with indiscriminate data, highlighting the importance of carefully selecting parameters to ensure that even the simplest model can be robust and reproducible.
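The correlation-based feature screening mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the correlation measure, the cutoff, or the full list of candidate features, so the Pearson coefficient, the 0.9 threshold, the greedy keep-first rule, and the toy columns (`time_min`, `time_s`, `initial_conc`) below are all assumptions made for illustration and are not the authors' procedure or data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear correlation
    return cov / sqrt(var_x * var_y)


def drop_correlated(columns, threshold=0.9):
    """Greedily keep a feature only if |r| with every kept feature is below threshold.

    `columns` maps feature name -> list of values. The threshold is an
    assumed value, not one taken from the paper.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: time recorded in two units is perfectly correlated,
# so only the first of the two columns survives the screening.
data = {
    "time_min": [0, 10, 20, 30, 40, 50],
    "time_s": [0, 600, 1200, 1800, 2400, 3000],
    "initial_conc": [50, 200, 100, 50, 200, 100],
}
selected = drop_correlated(data, threshold=0.9)
print(selected)
```

With this kind of screening, which of two redundant features is kept depends on column order, so in practice the choice would normally be guided by domain knowledge about the adsorption system.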
engineering, chemical