Abstract: Nuclear magnetic resonance (NMR) spectroscopy plays an essential role in deciphering molecular structure and dynamic behaviors. While AI-enhanced NMR prediction models hold promise, challenges still persist in tasks such as molecular retrieval, isomer recognition, and peak assignment. In response, this paper introduces a novel solution, Multi-Level Multimodal Alignment with Knowledge-Guided Instance-Wise Discrimination (K-M3AID), which establishes correspondences between two heterogeneous modalities: molecular graphs and NMR spectra. K-M3AID employs a dual-coordinated contrastive learning architecture with three key modules: a graph-level alignment module, a node-level alignment module, and a communication channel. Notably, K-M3AID introduces knowledge-guided instance-wise discrimination into contrastive learning within the node-level alignment module. In addition, K-M3AID demonstrates that skills acquired during node-level alignment have a positive impact on graph-level alignment, acknowledging meta-learning as an inherent property. Empirical validation underscores K-M3AID's effectiveness in multiple zero-shot tasks.
What problem does this paper attempt to address?
This paper targets the peak assignment problem in nuclear magnetic resonance (NMR) spectroscopy, particularly in 13C NMR. The authors point out that although existing AI-enhanced NMR prediction models show potential for generating spectra, they still struggle with molecule retrieval, isomer identification, and peak assignment. The main limitations are:
1. **High error tolerance**: Existing models tolerate large errors in peak assignment and lack precise point-to-point guidance.
2. **Lack of quantitative ranking**: Existing models cannot adequately rank candidate isomers quantitatively.
3. **Requirement for prior knowledge**: Existing models depend on prior knowledge of the molecular structure, but in practice, especially when identifying unknown compounds, detailed structural information is often unavailable.
To address these problems, the authors propose Multi-Level Multimodal Alignment with Knowledge-Guided Instance-Wise Discrimination (K-M3AID). By establishing correspondences between molecular graphs and NMR spectra, K-M3AID aims to improve the accuracy of molecule retrieval, candidate ranking, and peak assignment, and it performs particularly well on zero-shot tasks.
### Specific problems
1. **Molecule retrieval**: How to quickly and accurately retrieve the target molecule from a large molecular library without detailed structural information.
2. **Isomer identification**: How to accurately identify the target compound among multiple structurally similar isomers.
3. **Peak assignment**: How to accurately identify the atomic position corresponding to each peak in the NMR spectrum, especially in complex molecules and isomers.
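As a rough illustration of the retrieval and isomer-ranking problems listed above: once spectra and molecules are embedded in a shared space, zero-shot retrieval typically reduces to ranking candidates by similarity. The sketch below assumes such embeddings already exist. The function names, the toy vectors, and the choice of cosine similarity are illustrative assumptions, not details taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def rank_candidates(spectrum_emb, candidate_embs):
    """Return candidate names sorted from most to least similar to the spectrum."""
    scored = [(cosine(spectrum_emb, emb), name) for name, emb in candidate_embs.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored]


# Hypothetical embeddings: one query spectrum and three candidate isomers.
spectrum = [0.9, 0.1, 0.2]
library = {
    "isomer_A": [0.8, 0.2, 0.1],
    "isomer_B": [0.1, 0.9, 0.3],
    "isomer_C": [0.2, 0.1, 0.9],
}

ranking = rank_candidates(spectrum, library)
print(ranking)
```

The same ranking gives a quantitative ordering of candidate isomers, since every candidate receives a similarity score rather than a yes/no match.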
### Solutions
K-M3AID addresses these problems through three key modules:
1. **Graph-level alignment module**: Establishes the correspondence between the molecular graph and the 13C NMR spectrum, using a cross-entropy loss for contrastive learning.
2. **Node-level alignment module**: Aligns each carbon atom in the molecular graph with its signal peak in the spectrum, and introduces a knowledge-guided instance-wise discrimination mechanism.
3. **Communication channel**: Dynamically adjusts the gradient flow between the node encoder and the graph encoder to promote collaborative training of the two modules.
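The graph-level objective can be sketched as a symmetric cross-entropy contrastive loss over a batch of (molecule, spectrum) pairs, in the style commonly used for cross-modal alignment. This is a minimal sketch under stated assumptions: the symmetric form, the `temperature` value, and all function names are hypothetical, and the embeddings are assumed to be L2-normalized. The paper's exact formulation may differ.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax of a list of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def similarity_logits(graph_embs, spec_embs, temperature=0.1):
    """Pairwise dot products scaled by a temperature (assumed hyperparameter)."""
    return [
        [sum(a * b for a, b in zip(g, s)) / temperature for s in spec_embs]
        for g in graph_embs
    ]


def symmetric_contrastive_loss(graph_embs, spec_embs, temperature=0.1):
    """Average of graph-to-spectrum and spectrum-to-graph cross-entropy.

    Pair i in graph_embs is assumed to match pair i in spec_embs.
    """
    logits = similarity_logits(graph_embs, spec_embs, temperature)
    n = len(logits)
    # Graph -> spectrum: the correct class for row i is column i.
    g2s = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Spectrum -> graph: the same computation on the transposed matrix.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    s2g = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return 0.5 * (g2s + s2g)


# Toy batch: matched pairs share the same direction in embedding space.
graphs = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = symmetric_contrastive_loss(graphs, [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = symmetric_contrastive_loss(graphs, [[0.0, 1.0], [1.0, 0.0]])
print(aligned_loss < swapped_loss)
```

The loss is small when each molecule is most similar to its own spectrum and large when the pairing is swapped, which is the behavior a graph-level alignment objective needs.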
### Main contributions
1. **Conceptual level**: Integrates graph-level and node-level cross-modal alignment in the K-M3AID framework, improving learning efficiency on zero-shot tasks.
2. **Methodological level**: Introduces a knowledge-guided instance-wise discrimination mechanism that uses continuous, domain-specific features to turn the discrete comparison of contrastive learning into a continuous one.
3. **Empirical level**: Verifies the effectiveness of K-M3AID on several zero-shot tasks, including molecule retrieval, isomer identification, and peak assignment.
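One way to picture the move from discrete to continuous comparison is to replace the one-hot targets of standard instance discrimination with soft targets derived from a continuous domain feature. The sketch below is an assumption-heavy illustration: it uses the difference in chemical shift (ppm) between peaks as that feature and an exponential kernel with a hypothetical `scale` parameter. Neither choice is confirmed by the summary above, and the function names are invented for this example.

```python
import math


def soft_targets(shifts_ppm, scale=10.0):
    """Row-normalized target weights from pairwise chemical-shift distances.

    Assumption: peaks with similar shifts are treated as "softer" negatives.
    A very small scale recovers ordinary one-hot instance discrimination.
    """
    targets = []
    for s_i in shifts_ppm:
        weights = [math.exp(-abs(s_i - s_j) / scale) for s_j in shifts_ppm]
        total = sum(weights)
        targets.append([w / total for w in weights])
    return targets


def soft_cross_entropy(logits, targets):
    """Mean cross-entropy between softmax(logits) and soft target rows."""
    loss = 0.0
    for row, target in zip(logits, targets):
        m = max(row)
        lse = m + math.log(sum(math.exp(x - m) for x in row))
        loss += -sum(t * (x - lse) for t, x in zip(target, row))
    return loss / len(logits)


# Hypothetical 13C shifts: two aliphatic carbons and one carbonyl carbon.
shifts = [20.0, 25.0, 170.0]
targets = soft_targets(shifts)
for row in targets:
    print([round(t, 3) for t in row])
```

With these targets, confusing the two aliphatic carbons is penalized less than confusing either with the carbonyl carbon, whereas one-hot targets would penalize every mismatch equally.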
### Experimental results
The experimental results show that K-M3AID has clear advantages on several zero-shot tasks. In molecule retrieval and isomer identification in particular, it outperforms the baseline models. In the peak assignment task, K-M3AID also shows high accuracy and stability.
In conclusion, the paper addresses the peak assignment problem in NMR spectroscopy by proposing the K-M3AID framework, and it demonstrates strong performance on several zero-shot tasks.