Elucidating Structures of Complex Organic Compounds Using a Machine Learning Model Based on the 13C NMR Chemical Shifts
Anan Wu, Qing Ye, Xiaowei Zhuang, Qiwen Chen, Jinkun Zhang, Jianming Wu, Xin Xu
DOI: https://doi.org/10.1021/prechem.3c00005
2023-01-01
Abstract: We present a protocol, denoted SVM-M (i.e., SVM for magnetic properties), that combines a support vector machine (SVM) model with accurate 13C chemical shift calculations at the xOPBE/6-311+G(2d,p) level of theory. We show that the SVM-M protocol is a versatile tool for assigning the structure and stereochemistry of complex organic compounds with high confidence. Of particular significance, by exploiting the dual role of the decision values in SVM, the protocol provides an accurate yet efficient way to handle both the classification problem (i.e., "is a given structure correct or incorrect?") and the comparison problem (i.e., "which of several candidate structures is more likely to be correct or incorrect?"). The protocol reaches a success rate of approximately 100% on a set of 760 sample molecules with 15928 13C chemical shifts, which makes it a powerful tool for routine structural and stereochemical assignment, as well as for detecting misassignments, in complex organic compounds, including natural products.
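The dual role of the decision values can be illustrated with a minimal Python sketch: the sign of a decision value answers the classification question, while its magnitude orders competing candidates. Everything specific here is an assumption made for illustration and is not taken from the paper: the two error features (mean and maximum absolute deviation between calculated and experimental shifts), the linear form of the decision function, the values of `WEIGHTS` and `BIAS`, and the toy shift data. The abstract does not state the features or kernel of the trained SVM-M model, and the calculated shifts would in practice come from xOPBE/6-311+G(2d,p) calculations.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical linear decision function, NOT the trained SVM-M model.
# Larger errors (in ppm) push the decision value down.
WEIGHTS: Tuple[float, float] = (-1.0, -0.25)  # assumed weights for (MAE, max error)
BIAS: float = 3.5                             # assumed offset


def error_features(calc: Sequence[float], expt: Sequence[float]) -> Tuple[float, float]:
    """Return (mean absolute error, maximum absolute error) in ppm."""
    if len(calc) != len(expt) or not calc:
        raise ValueError("calc and expt must be non-empty and of equal length")
    errors = [abs(c - e) for c, e in zip(calc, expt)]
    return sum(errors) / len(errors), max(errors)


def decision_value(features: Sequence[float]) -> float:
    """Signed score from the assumed linear decision function."""
    return sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS


def is_predicted_correct(value: float) -> bool:
    """Classification role: the sign of the decision value gives the label."""
    return value > 0.0


def rank_candidates(
    candidates: Dict[str, Sequence[float]], expt: Sequence[float]
) -> List[Tuple[str, float]]:
    """Comparison role: order candidates by decision value, highest first."""
    scored = [
        (name, decision_value(error_features(calc, expt)))
        for name, calc in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (invented): experimental shifts and calculated shifts for two candidates.
EXPERIMENTAL = [20.0, 35.0, 70.0, 130.0]
CANDIDATES = {
    "A": [21.0, 34.0, 71.0, 129.0],
    "B": [20.0, 39.0, 78.0, 130.0],
}

ranking = rank_candidates(CANDIDATES, EXPERIMENTAL)
for name, value in ranking:
    label = "correct" if is_predicted_correct(value) else "incorrect"
    print(f"{name}: decision value {value:+.2f} -> predicted {label}")
```

With these invented numbers, candidate A receives a positive decision value and ranks first, while candidate B receives a negative one. A trained SVM would replace the hand-set weights, and a single model would then serve both the yes/no question and the ranking among candidates.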