Abstract: Experimental data from different sources present challenges due to variability and noise from various experimental conditions, apparatuses, and environmental factors. In this work, we propose a general method to address these challenges to build a consistent data set employing different thermal conductivity experimental data sets of methane from the liquid, vapor, and supercritical phases. Methane is a key hydrocarbon with extensive industrial and environmental applications. The method is based on machine learning (ML) techniques, which are used to consistently integrate data from various experimental sources compiled by the National Institute of Standards and Technology (NIST) database. Different ML algorithms are used for this purpose. Our findings indicate that ML models trained on raw experimental data yield predictions closer to NIST's processed data than the original raw experimental data, thus demonstrating the models' ability to generalize from heterogeneous, noisy, and untreated data sets. The proposed ML approach is general and efficient in handling complex and heterogeneous data to deliver reliable predictions without extensive preprocessing.

What problem does this paper attempt to address?

This paper investigates how machine learning (ML) can be used to build a consistent data set from thermal conductivity measurements reported by different experiments. The authors focus on methane, a hydrocarbon central to many industrial and environmental applications, in the liquid, vapor, and supercritical phases. Because measurements from different sources vary with experimental conditions, apparatuses, and environmental factors, and carry noise as a result, the authors propose a general ML-based approach for integrating the experimental data compiled in the National Institute of Standards and Technology (NIST) database. Several ML algorithms are used, and a decision tree model classifies the phase of each experimental data point (liquid, vapor, or supercritical) from its temperature and pressure. Although the raw experimental data are noisy, the trained models produce predictions that lie closer to the NIST-processed data than the raw data do, which indicates that the models learn the underlying trends from heterogeneous, untreated data sets. The results suggest that the method handles complex, heterogeneous data reliably without extensive preprocessing, and that it offers a route to consistent thermal property data sets for applications that depend on accurate thermal management.
In summary, the paper addresses how to use machine learning to build a consistent and reliable thermal conductivity data set from experimental data drawn from different sources, particularly for substances such as methane that matter for industrial process optimization, energy efficiency improvement, and environmental impact assessment.
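
The phase-classification step mentioned above (a decision tree that labels each data point from its temperature and pressure) can be illustrated with a small sketch. This is not the authors' implementation: it is a deliberately tiny CART-style tree written in plain Python so that it runs without dependencies, whereas in practice a library class such as scikit-learn's `DecisionTreeClassifier` would be the usual choice. The state points in `SAMPLES` and the helper names (`gini`, `best_split`, `build_tree`, `predict`) are assumptions made for illustration. The labels were assigned by hand to points chosen well away from the saturation curve, with the supercritical points lying above methane's critical point (roughly 190.6 K and 4.6 MPa), and "vapor" here also covers low-pressure gas above the critical temperature.

```python
from collections import Counter

# Illustrative (temperature in K, pressure in MPa) state points with hand-assigned
# phase labels. These are NOT measurements from the paper or from NIST.
SAMPLES = [
    ((100.0, 1.0), "liquid"),
    ((110.0, 2.0), "liquid"),
    ((150.0, 5.0), "liquid"),
    ((180.0, 4.0), "liquid"),
    ((120.0, 0.05), "vapor"),
    ((150.0, 0.1), "vapor"),
    ((180.0, 1.0), "vapor"),
    ((200.0, 2.0), "vapor"),
    ((300.0, 0.1), "vapor"),
    ((250.0, 8.0), "supercritical"),
    ((300.0, 10.0), "supercritical"),
    ((400.0, 20.0), "supercritical"),
]


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(points, labels):
    """Return the (feature, threshold) with the lowest weighted Gini, or None."""
    best = None
    for feature in range(len(points[0])):
        values = sorted({p[feature] for p in points})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2.0
            left = [lab for p, lab in zip(points, labels) if p[feature] <= threshold]
            right = [lab for p, lab in zip(points, labels) if p[feature] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, feature, threshold)
    return None if best is None else (best[1], best[2])


def build_tree(points, labels, depth=0, max_depth=6):
    """Grow the tree recursively; a leaf is a label, a node is a 4-tuple."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or depth >= max_depth:
        return majority
    split = best_split(points, labels)
    if split is None:  # no feature has two distinct values left
        return majority
    feature, threshold = split
    left = [(p, lab) for p, lab in zip(points, labels) if p[feature] <= threshold]
    right = [(p, lab) for p, lab in zip(points, labels) if p[feature] > threshold]
    return (
        feature,
        threshold,
        build_tree([p for p, _ in left], [lab for _, lab in left], depth + 1, max_depth),
        build_tree([p for p, _ in right], [lab for _, lab in right], depth + 1, max_depth),
    )


def predict(tree, point):
    """Walk from the root to a leaf and return the leaf's phase label."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = left if point[feature] <= threshold else right
    return tree


POINTS = [p for p, _ in SAMPLES]
LABELS = [lab for _, lab in SAMPLES]
TREE = build_tree(POINTS, LABELS)

# Classify a few state points that are not in the training set.
for query in [(300.0, 15.0), (105.0, 3.0), (350.0, 0.2)]:
    print(query, "->", predict(TREE, query))
```

Because each split is an axis-aligned threshold on temperature or pressure, a tree like this can only approximate the curved liquid-vapor saturation line with a staircase of cuts. A real data set would therefore need many more points near phase boundaries than this toy example contains.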

A systematic and general machine learning approach to build a consistent data set from different experiments

A consistent set of thermophysical properties of methane curated with machine learning

Estimating Air Methane and Total Hydrocarbon Concentrations in Alberta, Canada Using Machine Learning

Exploring Machine Learning Techniques for Accurate Prediction of Methane Hydrate Formation Temperature in Brine: A Comparative Study

Addressing Low-Cost Methane Sensor Calibration Shortcomings with Machine Learning

Machine learning prediction of methane, nitrogen, and natural gas mixture viscosities under normal and harsh conditions

Twofold Machine-Learning and Molecular Dynamics: A Computational Framework

Machine Learning for Methane Detection and Quantification from Space -- A survey

Gas permeability, diffusivity, and solubility in polymers: Simulation-experiment data fusion and multi-task machine learning

Machine learning approach to map the thermal conductivity of over 2,000 neoteric solvents for green energy storage applications

Solubility of Methane in Ionic Liquids for Gas Removal Processes Using a Single Multilayer Perceptron Model

A hybrid molecular dynamics/machine learning framework to calculate the viscosity and thermal conductivity of Ar, Kr, Xe, O and N

Machine Learning for Accurate Methane Concentration Predictions: Short-Term Training, Long-Term Results

An Automated Machine Learning architecture for the accelerated prediction of Metal-Organic Frameworks performance in energy and environmental applications

Machine learning model for non-equilibrium structures and energies of simple molecules

Machine learning‐aided process design using limited experimental data: A microwave‐assisted ammonia synthesis case study

Machine Learning Predictions of Methane Storage in MOFs: Diverse Materials, Multiple Operating Conditions, and Reverse Models

Transferability of datasets between Machine-Learning Interaction Potentials

Prediction model for methanation reaction conditions based on a state transition simulated annealing algorithm optimized extreme learning machine

A Universal Machine Learning Algorithm for Large-Scale Screening of Materials

Prediction of Key Parameters in the Design of CO2 Miscible Injection via the Application of Machine Learning Algorithms