Predicting Absolute Adsorption of CO2 on Jurassic Shale Using Machine Learning

Changhui Zeng, Shams Kalam, Haiyang Zhang, Lei Wang, Yi Luo, Haizhu Wang, Zongjie Mu, Muhammad Arif
DOI: https://doi.org/10.1016/j.fuel.2024.133050
IF: 7.4
2025-01-01
Fuel
Abstract: The injection of carbon dioxide (CO2) into shales has the potential to enhance shale gas production and to store CO2 within shale repositories. The key mechanism for CO2 storage in shales is the adsorption of CO2 in organic-rich pores and, in part, on clay minerals. While adsorption of CO2 in shales has been extensively studied via laboratory experiments and molecular simulations, robust methods for accurately predicting CO2 adsorption in shales are still lacking. This paper proposes a novel machine learning method to predict the adsorption behavior of CO2 in shales. A total of 194 datasets of pure CO2 adsorption in shales were collected from the literature. Random forest regression (RF), support vector regression (SVR), XGBoost, and multilayer perceptron (MLP) models were trained and validated on these data. The input variables of the predictive models include pressure, temperature, total organic carbon (TOC), and inorganic mineral content (e.g., quartz, feldspar, illite, kaolinite, and pyrite), while the output variable is the absolute adsorption of CO2. On the training dataset, the SVR model achieved an R² value of 0.9998 and had the lowest MSE (0.0026), RMSE (0.0510), and MAE (0.0217). The predictive accuracy of the models, ranked from high to low, is SVR > MLP > XGBoost > RF. Adsorption isotherm modeling was also conducted and compared with the proposed SVR model. The Dubinin-Astakhov adsorption isotherm provided the best fit for all shale samples. Predictions from the SVR model were found to be comparable to those from the Dubinin-Astakhov adsorption model. Compared with time-consuming laboratory experiments, the developed SVR model significantly reduces the time needed to accurately predict CO2 adsorption on shales. The proposed SVR model can be conveniently updated for broader applications as additional data become available.
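The four error metrics quoted in the abstract (R², MSE, RMSE, MAE) follow standard definitions, and a minimal sketch of how they might be computed is given here. The function name `regression_metrics` and the sample values are hypothetical, chosen only for illustration; they are not taken from the paper's data or code.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, and R^2 for two equal-length sequences.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    # Mean squared error and its square root
    ss_res = sum(r * r for r in residuals)
    mse = ss_res / n
    rmse = math.sqrt(mse)

    # Mean absolute error
    mae = sum(abs(r) for r in residuals) / n

    # Coefficient of determination: 1 - SS_res / SS_tot
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}


# Hypothetical measured vs. predicted adsorption values (illustrative only)
measured = [0.5, 1.0, 1.5, 2.0]
predicted = [0.6, 0.9, 1.6, 1.9]
metrics = regression_metrics(measured, predicted)
print(metrics)
```

Because RMSE is the square root of MSE, the reported values can be cross-checked against each other: the square root of 0.0026 is approximately 0.0510, which matches the RMSE stated in the abstract.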