Explainable AI to Facilitate Understanding of Neural Network-Based Metabolite Profiling Using NMR Spectroscopy

Hayden Johnson, Aaryani Tipirneni-Sajja
DOI: https://doi.org/10.3390/metabo14060332
IF: 4.1
2024-06-15
Metabolites
Abstract: Neural networks (NNs) are emerging as a rapid and scalable method for quantifying metabolites directly from nuclear magnetic resonance (NMR) spectra, but the nonlinear nature of NNs precludes understanding of how a model makes predictions. This study implements an explainable artificial intelligence algorithm called integrated gradients (IG) to elucidate which regions of input spectra are the most important for the quantification of specific analytes. The approach is first validated in simulated mixture spectra of eight aqueous metabolites and then investigated in experimentally acquired lipid spectra of a reference standard mixture and a murine hepatic extract. The IG method revealed that, like a human spectroscopist, NNs recognize and quantify analytes based on an analyte's respective resonance line-shapes, amplitudes, and frequencies. NNs can compensate for peak overlap and prioritize specific resonances most important for concentration determination. Further, we show how modifying a NN training dataset can affect how a model makes decisions, and we provide examples of how this approach can be used to debug issues with model performance. Overall, results show that the IG technique facilitates a visual and quantitative understanding of how model inputs relate to model outputs, potentially making NNs a more attractive option for targeted and automated NMR-based metabolomics.
biochemistry & molecular biology
What problem does this paper attempt to address?
This paper addresses the lack of interpretability in neural networks (NNs) that quantify metabolites from nuclear magnetic resonance (NMR) spectra. NNs offer rapid, scalable quantification, but their nonlinear nature makes it difficult to understand how a model arrives at its predictions. To address this, the authors apply an explainable artificial intelligence (XAI) algorithm called integrated gradients (IG) to determine which regions of an input spectrum matter most for quantifying a specific analyte. The method is first validated on simulated mixture spectra of eight aqueous metabolites and then applied to experimentally acquired lipid spectra of a reference standard mixture and a murine hepatic extract. Using IG, the study finds that the NN, like a human spectroscopist, recognizes and quantifies analytes from their resonance line shapes, amplitudes, and frequencies. The NN can also compensate for peak overlap and prioritize the resonances most important for concentration determination. The paper further shows how modifying the training dataset changes the way a model makes decisions, and it gives examples of using IG to debug model performance issues. Overall, the results show that IG facilitates a visual and quantitative understanding of how model inputs relate to model outputs, potentially making NNs a more attractive option for targeted and automated NMR-based metabolomics.
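To make the idea of IG concrete, the following is a minimal sketch in plain Python, not the authors' implementation. Integrated gradients attributes a prediction to each input point by averaging the gradient of the model output along a straight path from a baseline input to the actual input, then scaling by the difference between input and baseline. Everything specific here is an assumption for illustration: the toy softplus model, its weights and bias, the eight-point "spectrum", and the all-zero baseline are hypothetical and do not come from the paper.

```python
import math

def model(x, w, b):
    """Toy nonlinear 'quantifier' (hypothetical): softplus of a weighted sum."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return math.log1p(math.exp(z))

def model_grad(x, w, b):
    """Analytic gradient of the toy model with respect to each input point."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    s = 1.0 / (1.0 + math.exp(-z))  # derivative of softplus is the sigmoid
    return [s * wi for wi in w]

def integrated_gradients(x, baseline, w, b, steps=200):
    """Approximate IG with a midpoint Riemann sum along the straight path
    from `baseline` to `x`."""
    n = len(x)
    avg_grad = [0.0] * n
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b0 + alpha * (xi - b0) for xi, b0 in zip(x, baseline)]
        g = model_grad(point, w, b)
        for i in range(n):
            avg_grad[i] += g[i] / steps
    # Scale the path-averaged gradient by the input-minus-baseline difference.
    return [(xi - b0) * gi for xi, b0, gi in zip(x, baseline, avg_grad)]

# Hypothetical 8-point "spectrum"; the toy model only looks at points 2 and 3.
spectrum = [0.0, 0.1, 0.9, 1.0, 0.2, 0.0, 0.5, 0.1]
weights = [0.0, 0.0, 1.5, 2.0, 0.0, 0.0, 0.0, 0.0]
bias = -1.0
baseline = [0.0] * len(spectrum)  # assumed all-zero baseline spectrum

attributions = integrated_gradients(spectrum, baseline, weights, bias)

# Completeness check: attributions should sum to F(input) - F(baseline).
completeness_gap = abs(
    sum(attributions)
    - (model(spectrum, weights, bias) - model(baseline, weights, bias))
)

for i, a in enumerate(attributions):
    print(f"point {i}: attribution {a:.4f}")
print(f"completeness gap: {completeness_gap:.2e}")
```

In this toy case, the attributions are nonzero only at the two points the model actually uses, even though point 6 has a sizeable intensity. This mirrors, in a very simplified way, how an attribution map over a spectrum can show which resonances a model relies on for a given analyte.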