Advancing Vapor Pressure Prediction: A Machine Learning Approach with Directed Message Passing Neural Networks

Yen-Hsiang Lin, Yi-Pei Li, Hsin-Hao Liang, Shiang-Tai Lin
DOI: https://doi.org/10.26434/chemrxiv-2024-nmnlk
2024-03-15
Abstract: Knowledge of the vapor pressure of a chemical as a function of temperature is important in many chemical and environmental engineering applications. This study introduces a novel approach utilizing a machine learning model based on the directed message passing neural network (D-MPNN) architecture to predict the vapor pressure of organic molecules over a broad temperature range. We investigate various strategies for incorporating temperature effects into our models, a key factor for accurate vapor pressure predictions. Our results show that the D-MPNN model markedly surpasses the traditional PR + COSMOSAC method, achieving a significantly lower average absolute relative deviation (AARD) of 0.617 (vs. 1.36 for PR + COSMOSAC) on an extensive dataset of 19,081 molecules. This improvement is notable because it does not require additional critical property measurements or quantum mechanical calculations for the molecules. This study underscores the potential of machine learning to accurately capture complex molecular features for reliable vapor pressure prediction, presenting a robust alternative to traditional methods dependent on critical property data or quantum mechanical calculations. This advance is especially useful for assessing the properties of novel or under-characterized chemical species.
Chemistry
What problem does this paper attempt to address?
This paper addresses how to predict the vapor pressure of chemical substances more accurately, an important problem in fields such as chemistry and environmental engineering. The research team proposes a machine learning method based on the Directed Message Passing Neural Network (D-MPNN) to predict the vapor pressure of organic molecules over a wide range of temperatures, and explores different strategies for incorporating temperature effects into the model to improve prediction accuracy. Traditional prediction methods, such as cubic equations of state and quantum mechanical calculations, may rely on experimental data or specific chemical properties, which limits their applicability to newly discovered compounds or compounds that are unstable at high temperatures. The D-MPNN model significantly outperforms the traditional PR + COSMOSAC method, reaching an Average Absolute Relative Deviation (AARD) of 0.617 compared with 1.36 for PR + COSMOSAC, without requiring additional critical property measurements or quantum mechanical calculations. The paper also introduces two model architectures that integrate temperature effects: the Equation Embedding (EE) model, which uses machine learning to predict the coefficients of empirical equations, and the Temperature Concatenation (TC) model, which directly combines temperature information with molecular fingerprints. Using these two approaches, the researchers evaluate how best to use both molecular structure and temperature data for accurate vapor pressure prediction. The results show that the D-MPNN model outperforms the other methods across different temperature conditions, particularly for new molecular structures, without relying on experimental data or complex quantum calculations.
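The difference between the EE and TC architectures can be illustrated with a minimal sketch. Several things here are assumptions rather than details from the paper: the Antoine form log10(P) = A - B / (T + C) stands in for the unspecified "empirical equations", a random vector stands in for the D-MPNN molecular fingerprint, and the untrained toy networks (`mlp`, `init`, `predict_ee`, `predict_tc`) are hypothetical helpers that only show where temperature enters each model.

```python
import numpy as np

rng = np.random.default_rng(0)
FP_DIM = 16  # hypothetical fingerprint length; the real D-MPNN size is not given here


def init(sizes):
    """Random (untrained) weights for a small feed-forward network."""
    return [(rng.normal(scale=0.1, size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def mlp(x, weights):
    """ReLU hidden layers followed by a linear output layer."""
    for w, b in weights[:-1]:
        x = np.maximum(x @ w + b, 0.0)
    w, b = weights[-1]
    return x @ w + b


# EE: the network sees only the molecule and outputs equation coefficients.
ee_weights = init([FP_DIM, 32, 3])
# TC: the network sees the molecule plus one extra input, the temperature.
tc_weights = init([FP_DIM + 1, 32, 1])


def predict_ee(fingerprint, temperature_k):
    """Equation Embedding: temperature enters through the empirical equation.

    Assumption: Antoine form, log10(P) = A - B / (T + C).
    """
    a, b, c = mlp(fingerprint, ee_weights)
    return a - b / (temperature_k + c)


def predict_tc(fingerprint, temperature_k):
    """Temperature Concatenation: temperature is appended to the fingerprint."""
    # Scaling T by 1000 is an arbitrary normalization chosen for this sketch.
    x = np.concatenate([fingerprint, [temperature_k / 1000.0]])
    return mlp(x, tc_weights)[0]


fingerprint = rng.normal(size=FP_DIM)  # placeholder for a learned fingerprint
for t in (300.0, 400.0):
    print(t, predict_ee(fingerprint, t), predict_tc(fingerprint, t))
```

In the EE variant the temperature dependence is fixed by the chosen equation, so the network only has to learn molecule-specific coefficients. In the TC variant the network must learn the temperature dependence itself from the data.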
This work highlights the potential of machine learning to capture complex molecular properties and offers a powerful and efficient option for vapor pressure prediction, especially for evaluating the properties of new or under-characterized chemical species.
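For reference, the AARD metric quoted above can be computed as in the sketch below. It assumes the common definition, the mean of |P_pred - P_exp| / P_exp over all data points, expressed as a fraction rather than a percentage. The paper's exact averaging scheme (for example, per molecule versus per data point) is not stated in this summary, and the function name `aard` is a hypothetical choice.

```python
import numpy as np


def aard(p_pred, p_exp):
    """Average absolute relative deviation, as a fraction.

    Assumed definition: mean(|p_pred - p_exp| / p_exp) over all points.
    """
    p_pred = np.asarray(p_pred, dtype=float)
    p_exp = np.asarray(p_exp, dtype=float)
    return float(np.mean(np.abs(p_pred - p_exp) / p_exp))


# Illustrative values only, not data from the paper.
print(aard([110.0, 90.0], [100.0, 100.0]))
```

Because each deviation is divided by the experimental value, an overprediction by a large factor can contribute more than 1 to the average, which is how an AARD above 1 (such as the 1.36 reported for PR + COSMOSAC) can arise.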