DeepPhylo: Phylogeny‐Aware Microbial Embeddings Enhanced Predictive Accuracy in Human Microbiome Data Analysis

Bin Wang, Yulong Shen, Jingyan Fang, Xiaoquan Su, Zhenjiang Zech Xu
DOI: https://doi.org/10.1002/advs.202404277
IF: 15.1
2024-10-17
Advanced Science
Abstract: Microbial data analysis poses significant challenges due to its high dimensionality, sparsity, and compositionality. Recent advances have shown that integrating abundance and phylogenetic information is an effective strategy for uncovering robust patterns and enhancing predictive performance in microbiome studies. However, existing methods primarily focus on the hierarchical structure of phylogenetic trees, overlooking the evolutionary distances embedded within them. This study introduces DeepPhylo, a novel method that employs phylogeny‐aware amplicon embeddings to effectively integrate abundance and phylogenetic information. DeepPhylo improves both the unsupervised discriminatory power and supervised predictive accuracy of microbiome data analysis. Compared with existing methods, DeepPhylo performs better and yields biologically relevant insights across five real‐world microbiome use cases, including clustering of skin microbiomes, prediction of host chronological age and gender, diagnosis of inflammatory bowel disease (IBD) across 15 studies, and multilabel disease classification.
materials science, multidisciplinary; nanoscience & nanotechnology; chemistry
What problem does this paper attempt to address?