An interpretable surrogate model for H₂S solubility forecasting in ionic liquids based on machine learning
Yanjiang He, Ao Yang, Changjun Zou, Tianyou Fan, Qikui Lan, Yu He, Meng Wang, Jaka Sunarso, Zong Yang Kong
DOI: https://doi.org/10.1016/j.seppur.2024.130061
IF: 8.6
2024-10-19
Separation and Purification Technology
Abstract: Here we investigated four different ML-based models, i.e., Gaussian process regression (GPR), extreme gradient boosting (XGBoost), random forest (RF), and support vector machine (SVM), for predicting the solubility of H₂S in various ionic liquids (ILs). The dataset was divided into training and testing sets in an 80:20 ratio, and the performance of all models was evaluated using the coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE). Overall, all models effectively predicted H₂S solubility, albeit with varying degrees of performance. The GPR model provided the best performance, with an R² of 0.9918, MAE of 0.0090, and RMSE of 0.0147. It was followed by the XGBoost model, with an R² of 0.9827, MAE of 0.0155, and RMSE of 0.0213. The RF model displayed lower performance, with an R² of 0.9395, MAE of 0.0261, and RMSE of 0.0398, while the lowest performance was demonstrated by the SVM model, with an R² of 0.9036, MAE of 0.0402, and RMSE of 0.0508. Using SHAP analysis for model interpretation, we identified pressure, temperature, Estate_VSA3, Estate_VSA5, and MinEStateIndex as the five most dominant input features. In summary, this work presents new insights into the molecular characteristics that affect the solubility of H₂S in ILs, paving the way for future research in this field.
engineering, chemical
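
The evaluation protocol named in the abstract (an 80:20 train/test split scored with R², MAE, and RMSE) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the shuffling seed, and the toy numbers below are assumptions chosen for illustration, and the abstract does not state how the split was randomized. The sketch uses the standard definitions of the three metrics and only the Python standard library.

```python
import math
import random


def split_80_20(rows, seed=0):
    """Shuffle a copy of rows and split it 80:20 (hypothetical helper)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seed is an illustrative assumption
    cut = (len(shuffled) * 4) // 5         # integer arithmetic for the 80% mark
    return shuffled[:cut], shuffled[cut:]


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


if __name__ == "__main__":
    # Toy solubility values, not data from the paper.
    y_true = [1.0, 2.0, 3.0, 4.0]
    y_pred = [1.0, 2.0, 3.0, 5.0]
    print(r2(y_true, y_pred), mae(y_true, y_pred), rmse(y_true, y_pred))
```

In practice the same three metrics would be computed on the held-out 20% for each of the four models, which is what allows the ranking reported in the abstract (GPR, then XGBoost, RF, and SVM).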