Abstract: With a moderate- to low-oil-price environment now the norm, improving process efficiency to recover hydrocarbons at lower cost has become essential. The oil and gas industry generates vast amounts of data that, if properly leveraged, can yield insights leading to lower recovery costs, better safety records, lower costs associated with equipment downtime, and a reduced environmental footprint. Data analytics and machine-learning techniques offer tremendous potential for leveraging that data. An analysis of papers in OnePetro from 2014 to 2020 shows a steep year-over-year increase in the number of machine-learning-related papers. The analysis also identifies reservoir characterization, formation evaluation, and drilling as the domains with the highest number of papers on the application of machine-learning techniques.
Reservoir characterization in particular has seen an explosion of machine-learning papers, including the use of convolutional neural networks for fault detection and for seismic imaging and inversion, and the use of classical machine-learning algorithms such as random forests for lithofacies classification. Formation evaluation has also gained considerable traction, with applications such as classical machine-learning techniques like support vector regression to predict rock mechanical properties and deep-learning techniques such as long short-term memory networks to predict synthetic logs in unconventional reservoirs. Drilling is another domain where a tremendous amount of work has been done, with papers on optimizing drilling parameters using techniques such as genetic algorithms, using automated machine-learning frameworks for bit-dull-grade prediction, and applying natural language processing for stuck-pipe prevention and reduction of nonproductive time.
As the application of machine learning to problems in the upstream oil and gas industry proliferates, explainable artificial intelligence, or machine-learning interpretability, becomes critical for data scientists and business decision-makers alike. Data scientists need to be able to explain machine-learning models to executives and stakeholders in order to verify hypotheses and build trust in the models. One of the three highlighted papers used Shapley additive explanations, a game-theory-based approach to explaining machine-learning outputs, to add a layer of interpretability to its machine-learning model for identification of geomechanical facies along horizontal wells.
A cautionary note: While there is significant promise in applying these techniques, many challenges remain in capitalizing on the data: lack of common data models in the industry, data silos, data stored in on-premises resources, slow migration of data to the cloud, legacy databases and systems, lack of digitization of older/legacy reports and well logs, and lack of standardization in data-collection methodologies across different facilities and geomarkets, to name a few.
I would like to invite readers to review the selection of papers to get an idea of the various applications in the upstream oil and gas space where machine-learning methods have been leveraged. The highlighted papers cover fatigue damage of marine risers; well-performance optimization; and identification of frackable, brittle, and producible rock along horizontal wells using drilling data.
Recommended additional reading at OnePetro: www.onepetro.org.
SPE 201597 - Improved Robustness in Long-Term Pressure-Data Analysis Using Wavelets and Deep Learning by Dante Orta Alemán, Stanford University, et al.
SPE 202379 - A Network Data Analytics Approach to Assessing Reservoir Uncertainty and Identification of Characteristic Reservoir Models by Eugene Tan, the University of Western Australia, et al.
OTC 30936 - Data-Driven Performance Optimization in Section Milling by Shantanu Neema, Chevron, et al.

Technology Focus: Data Analytics (October 2021)
