Chemometric Classification of Crude Oils in Complex Petroleum Systems Using T-Distributed Stochastic Neighbor Embedding Machine Learning Algorithm

Keyu Tao, Jian Cao, Yuce Wang, Julei Mi, Wanyun Ma, Chunhua Shi
DOI: https://doi.org/10.1021/acs.energyfuels.0c01333
IF: 4.6541
2020-01-01
Energy & Fuels
Abstract: Determining the origin of crude oils is fundamental to the study of petroleum systems, but it is difficult in complex systems because traditional geochemical proxies are influenced by multiple factors (e.g., oil mixing, secondary alteration), which makes the data challenging to interpret. To develop new potential approaches, a pilot study using the t-distributed stochastic neighbor embedding (t-SNE) machine learning algorithm was performed, based on a case study of the saline and alkaline lake petroleum systems in the lower Permian Mahu Sag, northwestern Junggar Basin, China. The algorithm revealed three main types of alkaline lacustrine-related source rocks in the studied Fengcheng Formation: (i) argillaceous rocks deposited in brackish water and a weakly reducing environment; (ii) dolomitic mudstones deposited in saline water and a reducing environment; (iii) argillaceous dolomites deposited in hypersaline water and a strongly reducing environment. These organic facies are not time equivalent and vary temporally and spatially in the context of the alkaline lake evolution. Analysis of 43 crude oil samples showed that 5, 48, and 42% of the samples were derived from argillaceous, dolomitic mudstone, and argillaceous dolomite source rocks, respectively, while the remaining 5% of the oil samples had a mixed origin from the former two end members. This suggests that hydrocarbon generation in the Fengcheng petroleum systems is dominated by large-scale oil generation from dolomitic source rocks. The biological precursors in the dolomitic rocks are dominated by haloduric algae, and the oil generation window is prolonged through organic-inorganic interactions during hydrocarbon generation. This might be favorable for the preservation of an oil phase during deep burial and at high maturity. In general, this represents a shale oil accumulation system, as the source rocks and oils are both within the Fengcheng sequence. Our data suggest that the machine learning algorithm can find further application in this field with promising prospects.
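For readers unfamiliar with t-SNE, the "t-distributed" part of the name refers to the heavy-tailed Student-t kernel used to measure similarity between points in the low-dimensional map. The sketch below illustrates only that kernel in plain Python; it is not the authors' workflow, and the function name `student_t_affinities` and the coordinates are hypothetical values chosen for illustration, not data from the study.

```python
def student_t_affinities(points):
    """Return q[i][j], the normalised Student-t similarity of map points i and j."""
    n = len(points)
    # Unnormalised kernel with one degree of freedom: 1 / (1 + squared distance).
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # self-similarity is defined as zero
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                weights[i][j] = 1.0 / (1.0 + d2)
    # Normalise over all pairs so the affinities sum to one.
    total = sum(sum(row) for row in weights)
    return [[w / total for w in row] for row in weights]


# Hypothetical 2-D map coordinates for four samples (made-up values).
embedding = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
q = student_t_affinities(embedding)
```

Here `q[0][1]` is larger than `q[0][2]`, because samples 0 and 1 lie close together in the map. The full algorithm moves the map points until these affinities match Gaussian-based similarities computed from the original high-dimensional measurements.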