Improved Prediction of Carbonless NMR Spectra by the Machine Learning of Theoretical and Fragment Descriptors for Environmental Mixture Analysis

Kengo Ito,Xiangru Xu,Jun Kikuchi
DOI: https://doi.org/10.1021/acs.analchem.1c00756
2021-05-11
Abstract:As the first multidimensional NMR approach, 2D J-resolved (2DJ) spectroscopy is distinguished by signal resolution and detection sensitivity with remarkable advantages for the exhaustive evaluation of complex mixtures and environmental samples due to its carbonless feature without the requirement of 13C connectivity. Generally, the 2DJ signal assignment of metabolic mixtures is problematic in spite of references to experimental NMR databases, owing to the existence of metabolic "dark matter." In this study, a new method to predict 2DJ spectra was developed with a combination of quantum mechanical (QM) computation and machine learning (ML). The predictive accuracy of J-coupling constants was evaluated using validated data. The root-mean-square deviation (RMSD) for QM computation was 3.52 Hz, while the RMSD for QM + ML was 1.21 Hz, indicating a substantial increase in predictive accuracy. The proposed model was applied to predict the 2DJ spectra of 60 standard substances and 55 components of seawater. Furthermore, two practical environmental samples were used to evaluate the robustness of the constructed predictive model. A J-coupling tree and J-split spectra produced from QM + ML of aliphatic moieties had good consistency with the experimental data, as compared with the theoretical data produced by QM computation. The predicted J-coupling tree for the J-coupling multiplet analysis of freely rotating bonds in the complex mixture, which is traditionally difficult, was interpretable. In addition, in silico identification of the J-split 1H NMR signals, which was independent of experimental databases, aided in the discovery of new components in a mixture.
What problem does this paper attempt to address?