MMETHANE: interpretable AI for predicting host status from microbial composition and metabolomics data

Jennifer J Dawkins, Georg K Gerber
DOI: https://doi.org/10.1101/2024.12.13.628441
2024-12-14
Abstract: Metabolite production, consumption, and exchange are intimately involved with host health and disease, as well as being key drivers of host-microbiome interactions. Despite the increasing prevalence of datasets that jointly measure microbiome composition and metabolites, computational tools for linking these data to the status of the host remain limited. To address these limitations, we developed MMETHANE, an open-source software package that implements a purpose-built deep learning model for predicting host status from paired microbial sequencing and metabolomic data. MMETHANE incorporates prior biological knowledge, including phylogenetic and chemical relationships, and is intrinsically interpretable, outputting an English-language set of rules that explains its decisions. Using a compendium of six datasets with paired microbial composition and metabolomics measurements, we showed that MMETHANE always performed at least on par with existing methods, including black-box machine learning techniques, and outperformed other methods on >80% of the datasets evaluated. We additionally demonstrated, through two case studies analyzing inflammatory bowel disease gut microbiome datasets, that MMETHANE uncovers biologically meaningful links between microbes, metabolites, and disease status.
Microbiology