PhyloFunc: Phylogeny-informed Functional Distance as a New Ecological Metric for Metaproteomic Data Analysis

Luman Wang,Caitlin M.A. Simopoulos,Joeselle M. Serrana,Zhibin Ning,Yutong Li,Boyan Sun,Jinhui Yuan,Daniel Figeys,Leyuan Li
DOI: https://doi.org/10.1101/2024.05.28.596184
2024-09-23
Abstract:Background: Beta-diversity is a fundamental ecological metric for exploring dissimilarities between microbial communities. On the functional dimension, metaproteomics data can be used to quantify beta-diversity to understand how microbial community functional profiles vary under different environmental conditions. Conventional approaches to metaproteomic functional beta diversity often treat protein functions as independent features, ignoring the evolutionary relationships among microbial taxa from which different proteins originate. A more informative functional distance metric that incorporates evolutionary relatedness is needed to better understand microbiome functional dissimilarities. Results: Here, we introduce PhyloFunc, a novel functional beta-diversity metric that incorporates microbiome phylogeny to inform on metaproteomic functional distance. Leveraging the phylogenetic framework of weighted UniFrac distance, PhyloFunc innovatively utilizes branch lengths to weigh between-sample functional distances for each taxon, rather than differences in taxonomic abundance as in weighted UniFrac. Proof-of-concept using a simulated toy dataset and a real dataset from mouse inoculated with a synthetic gut microbiome and fed different diets show that PhyloFunc successfully captured functional compensatory effects between phylogenetically related taxa. We further tested a third dataset of complex human gut microbiomes treated with five different drugs to compare PhyloFunc's performance with other traditional distance methods. PCoA and machine learning-based classification algorithms revealed higher sensitivity of PhyloFunc in microbiome responses to paracetamol. We offer PhyloFunc as an open-source Python package (https://pydigger.com/pypi/PhyloFunc) that efficiently calculates functional beta-diversity distances between sample pairs and generates distance matrices for all samples. Conclusions: Unlike traditional approaches that consider metaproteomics features as independent and unrelated, PhyloFunc acknowledges the role of phylogenetic context in shaping the functional landscape in metaproteomes. In particular, we report that PhyloFunc accounts for the functional compensatory effect of taxonomically related species. It is effective, ecologically significant, and has better sensitivity, as evidenced by the particular applications we presented.
Bioinformatics
What problem does this paper attempt to address?