Abstract: Due to high demand for energy, oil and gas companies have started drilling wells in remote areas and unconventional environments. This has increased the complexity of drilling operations that were already challenging. To adapt, drilling companies expanded their use of the real-time operation center (RTOC) concept, in which real-time drilling data are transmitted from remote sites to company headquarters. In an RTOC, groups of subject matter experts monitor drilling live and provide real-time advice to improve operations. As drilling operations have grown, the volume of generated data has exceeded what humans can process, limiting the RTOC's impact on certain components of drilling operations. To overcome this limitation, artificial intelligence and machine learning (AI/ML) technologies were introduced to monitor and analyze the real-time drilling data, discover hidden patterns, and provide fast decision-support responses. AI/ML technologies are data-driven, so the quality of their output depends on the quality of the input data: good input data yields good output, and poor input data yields poor output. Unfortunately, because of the harsh environments at drilling sites and the transmission setups, not all drilling data are of good quality, which negatively affects AI/ML results. The objective of this paper is to use AI/ML technologies to improve the quality of real-time drilling data. A large real-time drilling dataset of over 150,000 raw data points was fed into Artificial Neural Network (ANN), Support Vector Machine (SVM), and Decision Tree (DT) models. The models were trained on both valid and not-valid data points. A confusion matrix was used to evaluate the models, including different internal architectures. Despite its slowness, the ANN achieved the best result, with an accuracy of 78%, compared to 73% for DT and 41% for SVM.
The paper concludes by presenting a process for using AI technology to improve real-time drilling data quality, which can serve as a guide for practitioners. To the author's knowledge, based on literature in the public domain, this paper is one of the first to compare multiple AI/ML techniques for improving the quality of real-time drilling data.
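
The evaluation step mentioned in the abstract (a confusion matrix over valid and not-valid data points, summarized as accuracy) can be illustrated with a minimal sketch. This is not the paper's code: the function names, the label strings, and the six sample predictions are hypothetical and chosen only to show how accuracy is derived from the matrix.

```python
from collections import Counter

# Hypothetical label names; the paper only says "valid" and "not-valid".
LABELS = ("valid", "not_valid")


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Return a nested dict: matrix[actual][predicted] -> count."""
    counts = Counter(zip(y_true, y_pred))
    return {a: {p: counts[(a, p)] for p in labels} for a in labels}


def accuracy(matrix):
    """Fraction of points on the diagonal (correctly classified)."""
    correct = sum(matrix[label][label] for label in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total if total else 0.0


# Made-up ground truth and model output, for illustration only.
y_true = ["valid", "valid", "not_valid", "not_valid", "valid", "not_valid"]
y_pred = ["valid", "not_valid", "not_valid", "valid", "valid", "not_valid"]

cm = confusion_matrix(y_true, y_pred)
acc = accuracy(cm)
print(cm)
print(f"accuracy = {acc:.2f}")
```

The same matrix would be computed once per model (ANN, SVM, DT) on a shared held-out set, so that the reported accuracies are directly comparable.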

Improving Data Quality through Deep Learning and Statistical Models

Assessing Data Quality Within Available Context

From Data Quality to Model Quality: an Exploratory Study on Deep Learning

A Theoretical Framework for AI-driven data quality monitoring in high-volume data environments

Statistical Learning to Operationalize a Domain Agnostic Data Quality Scoring

Overview and Importance of Data Quality for Machine Learning Tasks

Data Quality for Deep Learning of Judgment Documents: an Empirical Study

Big Data Quality Prediction in the Process Industry: A Distributed Parallel Modeling Framework

Data collection and quality challenges in deep learning: a data-centric AI perspective

The Effects of Data Quality on Machine Learning Performance

Data collection and quality challenges for deep learning

Towards Explainable Automated Data Quality Enhancement without Domain Knowledge

A Survey on Data Quality Dimensions and Tools for Machine Learning

AI-Driven Frameworks for Enhancing Data Quality in Big Data Ecosystems: Error Detection, Correction, and Metadata Integration

Data Evaluation and Enhancement for Quality Improvement of Machine Learning

DQI: Measuring Data Quality in NLP

Data Quality Matters: A Case Study of Obsolete Comment Detection

Research and Application of Active Quality Improvement Model Based on Data Decision

Improving Real-Time Drilling Data Quality Using Artificial Intelligence and Machine Learning Techniques

A Data-centric Framework for Improving Domain-specific Machine Reading Comprehension Datasets