Estimating total organic carbon of potential source rocks in the Espírito Santo Basin, SE Brazil, using XGBoost

Fellippe R.A. Bione, Igor M. Venancio, Thiago P. Santos, Andre L. Belem, Bernardo R. Rangel, Igor V.A.F. Souza, Andre L.D. Spigolon, Ana Luiza S. Albuquerque
DOI: https://doi.org/10.1016/j.marpetgeo.2024.106765
IF: 5.361
2024-02-21
Marine and Petroleum Geology
Abstract: Identifying and constraining source rocks is critical for petroleum system modeling and risk assessment. Traditional well-log-based methods for estimating total organic carbon (TOC) have limitations, which has led to the emergence of machine learning techniques such as XGBoost. This study compiled a comprehensive data set of well log and geochemical data from the Espírito Santo Basin, SE Brazil, and used XGBoost, integrated with pySpark, to run multiple machine learning models for TOC prediction. Parameter tuning was performed by randomly combining model configurations over multiple replication data frames. XGBoost effectively predicted TOC, yielding a coefficient of determination (R²) of 0.71, an RMSE of 0.55 and an MAE of 0.30, based on the average of all 10-fold cross-validation test sets. Heteroscedasticity was observed, possibly related to outliers in the target TOC variable, which may be linked to variable organic-matter deposition and preservation processes through geological time, such as during Oceanic Anoxic Events (OAEs). The results indicate the potential of machine learning for TOC prediction in large, heterogeneous data sets: it outperformed the traditional ΔlogR method and offers a promising tool for using available public data sets in similar applications, such as the exploration phase or field reassessment in the oil and gas (O&G) industry.
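To make the reported metrics concrete, the sketch below shows one way R², RMSE and MAE can be averaged over the test sets of a 10-fold cross-validation. It is only an illustration and not the authors' pipeline: the data are synthetic, the one-feature least-squares model `fit_line` is a hypothetical stand-in for XGBoost, and every function name is an assumption introduced here.

```python
import math
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k disjoint test folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(x, y):
    """Hypothetical stand-in model (NOT XGBoost): one-feature least squares."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda v: slope * v + intercept


def cross_validate(x, y, fit, k=10, seed=0):
    """Train on k-1 folds, score the held-out fold, then average over folds."""
    scores = {"r2": [], "rmse": [], "mae": []}
    for test_idx in k_fold_indices(len(x), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held_out]
        model = fit([x[i] for i in train_idx], [y[i] for i in train_idx])
        y_true = [y[i] for i in test_idx]
        y_pred = [model(x[i]) for i in test_idx]
        scores["r2"].append(r2_score(y_true, y_pred))
        scores["rmse"].append(rmse(y_true, y_pred))
        scores["mae"].append(mae(y_true, y_pred))
    return {name: sum(vals) / len(vals) for name, vals in scores.items()}


# Synthetic data for illustration only; these are not well-log or TOC values.
rng = random.Random(42)
x = [rng.uniform(0.0, 10.0) for _ in range(200)]
y = [0.3 * xi + rng.gauss(0.0, 0.5) for xi in x]

cv_scores = cross_validate(x, y, fit_line, k=10)
print(cv_scores)
```

Any regressor that can be fitted on the training folds and queried on the held-out fold could replace `fit_line`; the averaging step stays the same. Because RMSE squares the residuals, it is never smaller than MAE on the same fold, so a wide gap between the two is a sign that a few large errors, such as outliers in the target, dominate the RMSE.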
geosciences, multidisciplinary