Molecular descriptor-enhanced graph neural network for energetic molecular property prediction

Tianyu Gao, Yujin Ji, Cheng Liu, Youyong Li
DOI: https://doi.org/10.1007/s40843-023-2848-8
2024-03-20
Science China Materials
Abstract: Energetic molecules (EMs) play an important role in both military and civilian applications. Traditionally, determining the physicochemical parameters of EMs requires a heavy experimental workload and carries inherent risks, and emerging machine learning (ML) methods are a promising way to address this challenge. In this work, we report a molecular descriptor-enhanced graph neural network (MD-enhanced GNN) model that quickly and accurately predicts three detonation parameters of EMs. This model integrates sequence-based molecular descriptors and structure-based graph vectors, offering a comprehensive framework that does not require custom descriptors. We construct a dataset of 18,991 CHNO EMs and compare our model with methods that use molecular fingerprints/descriptors alone or a GNN alone. By combining two complementary kinds of features, the proposed MD-enhanced GNN achieves superior accuracy, with R² over 0.93, and a learning speed improvement of over 20%. This highlights the potential of our model for EM design, promising substantial improvements in both efficiency and effectiveness within this critical field.
materials science, multidisciplinary
What problem does this paper attempt to address?
This paper uses molecular descriptor-enhanced graph neural networks (MD-enhanced GNNs) to predict three key detonation parameters of energetic molecules (EMs): detonation heat, detonation velocity, and detonation pressure. Traditional methods for determining these parameters require extensive experiments, while machine learning (ML) offers a promising alternative. The authors constructed a database of 18,991 CHNO EMs and compared three approaches: models using only molecular fingerprints or molecular descriptors (MDs), models using only GNNs, and the MD-enhanced GNN that combines both types of features. The combined model achieves higher accuracy (R² exceeding 0.93) and an improvement in learning speed of more than 20%.
The paper first introduces the importance of energetic molecules in military and civilian applications, as well as the limitations of traditional design methods. It then describes the data collection process in detail: screening molecules from a database, computing their physicochemical parameters, and calculating detonation performance parameters from the Kamlet-Jacobs equation. It also covers the feature representations used, namely molecular fingerprints, MDs, and GNN-derived graph representations.
In the methodology section, the paper explains how molecules are converted into graph structures and how models are trained on them, using both conventional ML algorithms (such as random forests and gradient boosting decision trees) and GNNs (such as GCN, GATv2, and CGCNN). The proposed MD-enhanced GNN model combines molecular fingerprints and MDs with graph neural networks to improve prediction performance.
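To make the idea of combining descriptor features with graph features concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the authors' architecture: the function names (`message_passing`, `readout`, `fuse`), the one-hot atom features, the mean aggregation, and the placeholder descriptor values are all hypothetical. A real model such as GATv2 uses learned, attention-weighted aggregation, and this summary does not specify how the fusion is actually done.

```python
# Toy illustration only: one mean-aggregation message-passing step, a mean-pool
# readout, and concatenation with a descriptor vector. All names and values
# here are assumptions made for the example, not taken from the paper.

def message_passing(node_feats, edges):
    """Average each atom's features with those of its bonded neighbours."""
    n = len(node_feats)
    neighbours = {i: [i] for i in range(n)}  # self-loop keeps the atom's own features
    for a, b in edges:                       # bonds are undirected
        neighbours[a].append(b)
        neighbours[b].append(a)
    dim = len(node_feats[0])
    return [
        [sum(node_feats[j][k] for j in neighbours[i]) / len(neighbours[i])
         for k in range(dim)]
        for i in range(n)
    ]

def readout(node_feats):
    """Mean-pool atom features into one fixed-length graph vector."""
    n = len(node_feats)
    return [sum(column) / n for column in zip(*node_feats)]

def fuse(graph_vec, descriptors):
    """Concatenate the graph vector with molecule-level descriptors."""
    return list(graph_vec) + list(descriptors)

# Hypothetical three-atom O-N-O fragment, atoms one-hot encoded as [C, H, N, O].
atoms = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 0, 0, 1]]
bonds = [(0, 1), (1, 2)]
descriptors = [46.0, 2.0]  # made-up placeholder descriptor values

hidden = message_passing(atoms, bonds)
graph_vec = readout(hidden)
fused = fuse(graph_vec, descriptors)
print(len(fused), fused)
```

In a full model, the fused vector would be passed to a trainable regression head that outputs the three detonation parameters. Simple concatenation is used here only because it is the most direct way to show two feature sources feeding one predictor.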
In the results section, the paper compares the performance of different models and feature combinations, finding that the MD-enhanced GATv2 model gives the best predictions for all three parameters, with the lowest root mean square error (RMSE) and the highest coefficient of determination (R²). Furthermore, SHAP value analysis indicates that MDs are critical to improving the prediction performance of GNNs. In summary, this paper addresses the problem of accurately and efficiently predicting the detonation performance of energetic molecules by integrating multiple features with GNNs, providing new tools and insights for the design of energetic materials.
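For readers unfamiliar with the two metrics used in the model comparison, the sketch below writes out their standard definitions: RMSE = sqrt(mean((y - ŷ)²)) and R² = 1 - SS_res / SS_tot. The function names and the sample values are made up for illustration and are not results from the paper.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error: lower is better, same units as the target."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1.0 means a perfect fit."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1.0 - ss_res / ss_tot

# Made-up reference and predicted values, for illustration only.
y_true = [8.0, 8.5, 9.0, 9.5]
y_pred = [8.1, 8.4, 9.1, 9.4]

print(f"RMSE = {rmse(y_true, y_pred):.3f}")
print(f"R2   = {r2(y_true, y_pred):.3f}")
```

Reporting both is useful because RMSE reflects the absolute size of the errors for a given parameter, while R² is unitless and can be compared across parameters with different scales.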