Accurate Prediction of NMR Chemical Shifts: Integrating DFT Calculations with Three-Dimensional Graph Neural Networks

Chao Han, Dongdong Zhang, Song Xia, Yingkai Zhang
DOI: https://doi.org/10.1021/acs.jctc.4c00422
2024-06-07
Journal of Chemical Theory and Computation
Abstract: Computer prediction of NMR chemical shifts plays an increasingly important role in molecular structure assignment and elucidation for organic molecule studies. Density functional theory (DFT) and gauge-including atomic orbital (GIAO) have established a framework to predict NMR chemical shifts but often at a significant computational expense with a limited prediction accuracy. Recent advancements in deep learning methods, especially graph neural networks (GNNs), have shown promise in improving...
chemistry, physical; physics, atomic, molecular & chemical
What problem does this paper attempt to address?
This paper addresses the accurate prediction of nuclear magnetic resonance (NMR) chemical shifts, which are central to molecular structure determination and elucidation in organic molecule research. Density functional theory (DFT) calculations provide a predictive framework, but they are computationally expensive and of limited accuracy. In recent years, deep learning, especially graph neural networks (GNNs), has shown potential for improving the prediction of experimental chemical shifts. The researchers developed a new 3D GNN model, CSTShift, which combines atomic features with DFT-calculated shielding tensor descriptors (the CST descriptors) to capture both isotropic and anisotropic shielding effects. Starting from the NMRShiftDB2 dataset, they performed DFT optimization and GIAO calculations at the B3LYP/6-31G(d) level to build NMRShiftDB2-DFT, a dataset of high-quality 3D structures and shielding tensors. CSTShift achieves state-of-the-art predictive performance on the NMRShiftDB2-DFT test set and on the external CHESHIRE dataset. By comparing different models, including a 3D GNN without the CST descriptors, the paper demonstrates that incorporating molecular geometry information is necessary for improving predictive accuracy. CSTShift also shows advantages in structure elucidation tasks, such as identifying the correct structure among constitutional isomers. Finally, the study notes that the cost of the DFT calculations will likely need to be reduced before the model can be applied more broadly and to larger datasets.
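To make "isotropic and anisotropic shielding effects" concrete, the sketch below derives a few standard scalar quantities from a single 3×3 shielding tensor such as a GIAO calculation produces. It is a minimal illustration under stated assumptions, not the descriptor set used by CSTShift, which this summary does not specify. The function names, the example tensor, and the reference shielding value are all hypothetical.

```python
import numpy as np


def shielding_descriptors(sigma):
    """Return simple scalar descriptors of a 3x3 shielding tensor (in ppm).

    Illustrative only: the exact descriptors fed to CSTShift are not
    given in the summary above.
    """
    sigma = np.asarray(sigma, dtype=float)
    if sigma.shape != (3, 3):
        raise ValueError("expected a 3x3 shielding tensor")
    # Principal components are taken from the symmetric part of the tensor.
    sym = 0.5 * (sigma + sigma.T)
    # eigvalsh returns eigenvalues in ascending order: s11 <= s22 <= s33.
    s11, s22, s33 = np.linalg.eigvalsh(sym)
    return {
        "iso": (s11 + s22 + s33) / 3.0,          # isotropic shielding
        "span": s33 - s11,                       # largest minus smallest component
        "anisotropy": s33 - 0.5 * (s11 + s22),   # one common anisotropy definition
    }


def shielding_to_shift(sigma_iso, sigma_ref):
    """Convert an isotropic shielding to a chemical shift: delta = sigma_ref - sigma_iso."""
    return sigma_ref - sigma_iso


if __name__ == "__main__":
    # Hypothetical tensor for one nucleus; values are made up for illustration.
    tensor = [[100.0, 0.0, 0.0],
              [0.0, 120.0, 0.0],
              [0.0, 0.0, 170.0]]
    desc = shielding_descriptors(tensor)
    # sigma_ref is a placeholder, not a computed reference shielding.
    shift = shielding_to_shift(desc["iso"], sigma_ref=190.0)
    print(desc, shift)
```

For the diagonal example tensor, the isotropic shielding is 130 ppm, the span 70 ppm, and the anisotropy 60 ppm. With the placeholder reference of 190 ppm, the shift comes out at 60 ppm. In practice the reference shielding would be computed at the same level of theory as the molecule of interest.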