A Tri-Training method for lithofacies identification under scarce labeled logging data

Xinyi Zhu, Hongbing Zhang, Quan Ren, Dailu Zhang, Fanxing Zeng, Xinjie Zhu, Lingyuan Zhang
DOI: https://doi.org/10.1007/s12145-023-00986-w
2023-03-08
Earth Science Informatics
Abstract: Lithofacies identification is critical to energy exploration and reservoir evaluation. Machine learning provides a way to use logging data for intelligent lithofacies identification. However, labeled logging data are usually scarce, which makes the commonly used supervised algorithms less effective, so semi-supervised methods have attracted attention from researchers. In this paper, we propose applying Tri-Training to lithofacies identification. The framework uses Random Forest (RF), Gradient-Boosted Decision Trees (GBDT), and Support Vector Machine (SVM) as the baseline supervised classifiers and builds on the ideas of inductive semi-supervised learning and ensemble learning. The baseline classifiers are iteratively retrained using unlabeled data to improve their performance, and the final results are output in an ensemble paradigm. We used seven logging parameters from two wells as input and randomly divided the data 10 times for training and testing. With only five labeled samples of each lithology, the prediction accuracy improved by an average of 2.1% and 14.5% in the two wells, respectively, compared to the baseline methods. We also compared Tri-Training with two commonly used semi-supervised methods, the label propagation algorithm (LPA) and Co-Training. The experimental results confirm that Tri-Training has better and more stable performance. The Tri-Training method in this paper can be effectively applied to lithofacies identification under scarce labeled logging data.
geosciences, multidisciplinary; computer science, interdisciplinary applications
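
The training loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it is a simplified form of Tri-Training in which each classifier is retrained on the labeled samples plus the unlabeled samples that the other two classifiers agree on, and it omits the error-rate checks of the full algorithm. To stay dependency-free, two tiny stand-in classifiers (`NearestCentroid`, `KNearest`) replace RF, GBDT, and SVM; the class names, the lithology labels "sand" and "shale", and the two-feature synthetic data are all assumptions made for illustration only.

```python
import random
from collections import Counter


class NearestCentroid:
    """Stand-in classifier: predicts the class whose mean vector is closest."""

    def fit(self, X, y):
        groups = {}
        for row, label in zip(X, y):
            groups.setdefault(label, []).append(row)
        self.centroids_ = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict(self, X):
        def dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))
        return [min(self.centroids_, key=lambda c: dist(row, self.centroids_[c]))
                for row in X]


class KNearest:
    """Stand-in classifier: majority vote among the k closest training rows."""

    def __init__(self, k=1):
        self.k = k

    def fit(self, X, y):
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict(self, X):
        out = []
        for row in X:
            ranked = sorted(
                zip(self.X_, self.y_),
                key=lambda pair: sum((p - q) ** 2 for p, q in zip(row, pair[0])),
            )
            votes = Counter(label for _, label in ranked[: self.k])
            out.append(votes.most_common(1)[0][0])
        return out


def tri_train(models, X_lab, y_lab, X_unlab, max_rounds=10):
    """Simplified Tri-Training: each model learns from pseudo-labels
    on which the other two models agree."""
    for m in models:
        m.fit(X_lab, y_lab)
    previous = None
    for _ in range(max_rounds):
        preds = [list(m.predict(X_unlab)) for m in models]
        if preds == previous:  # pseudo-labels stopped changing
            break
        previous = preds
        for i, m in enumerate(models):
            j, k = [idx for idx in range(3) if idx != i]
            agreed = [(x, a) for x, a, b in zip(X_unlab, preds[j], preds[k]) if a == b]
            # Retrain on the original labels plus the agreed pseudo-labels.
            m.fit(list(X_lab) + [x for x, _ in agreed],
                  list(y_lab) + [label for _, label in agreed])
    return models


def vote(models, X):
    """Ensemble output: majority vote over the three models."""
    per_model = [list(m.predict(X)) for m in models]
    return [Counter(col).most_common(1)[0][0] for col in zip(*per_model)]


# Synthetic demo: two well-separated clusters, five labeled samples per class.
rng = random.Random(0)


def blob(cx, cy, n):
    return [[rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)] for _ in range(n)]


X_lab = blob(0, 0, 5) + blob(5, 5, 5)
y_lab = ["sand"] * 5 + ["shale"] * 5          # hypothetical lithology labels
X_unlab = blob(0, 0, 50) + blob(5, 5, 50)     # unlabeled samples

models = tri_train([NearestCentroid(), KNearest(1), KNearest(3)],
                   X_lab, y_lab, X_unlab)
print(vote(models, [[0.2, -0.1], [4.8, 5.3]]))
```

Because `tri_train` and `vote` only call `fit` and `predict`, scikit-learn's `RandomForestClassifier`, `GradientBoostingClassifier`, and `SVC` should be usable in place of the stand-ins, with seven logging parameters per sample instead of two synthetic features.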