CAT: a conditional association test for microbiome data using a leave-out approach

Yushu Shi, Liangliang Zhang, Kim-Anh Do, Robert R. Jenq, Christine B. Peterson
DOI: https://doi.org/10.48550/arXiv.2309.08109
2023-09-15
Abstract: In microbiome analysis, researchers often seek to identify taxonomic features associated with an outcome of interest. However, microbiome features are intercorrelated and linked by phylogenetic relationships, making it challenging to assess the association between an individual feature and an outcome. Researchers have developed global tests for the association of microbiome profiles with outcomes using beta diversity metrics which offer robustness to extreme values and can incorporate information on the phylogenetic tree structure. Despite the popularity of global association testing, most existing methods for follow-up testing of individual features only consider the marginal effect and do not provide relevant information for the design of microbiome interventions. This paper proposes a novel conditional association test, CAT, which can account for other features and phylogenetic relatedness when testing the association between a feature and an outcome. CAT adopts a leave-out method, measuring the importance of a feature in predicting the outcome by removing that feature from the data and quantifying how much the association with the outcome is weakened through the change in the coefficient of determination. By leveraging global tests including PERMANOVA and MiRKAT-based methods, CAT allows association testing for continuous, binary, categorical, count, survival, and correlated outcomes. Our simulation and real data application results illustrate the potential of CAT to inform the design of microbiome interventions aimed at improving clinical outcomes.
Methodology
What problem does this paper attempt to address?
This paper addresses how to test the association between a single microbiome feature and an outcome while accounting for the other features and for phylogenetic relatedness. Researchers often want to identify taxonomic features associated with an outcome of interest, but microbiome features are intercorrelated and linked by phylogenetic relationships, which makes it hard to isolate the contribution of any one feature. Most existing follow-up methods consider only marginal effects and therefore provide little guidance for designing microbiome interventions. The paper proposes a conditional association test, CAT, that uses a leave-out approach: it removes a feature from the data and quantifies how much the association with the outcome weakens, measured as the change in the coefficient of determination. Because it builds on global tests such as PERMANOVA and MiRKAT-based methods, CAT can handle continuous, binary, categorical, count, survival, and correlated outcomes. Simulation studies and a real data application illustrate its potential to inform the design of microbiome interventions.
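The leave-out idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a continuous outcome, a Bray-Curtis distance on the raw abundance table, a PERMANOVA-style pseudo-R² as the coefficient of determination, and a plain outcome permutation as a stand-in for inference, since the summary above does not specify how CAT obtains its p-values. The function names `bray_curtis`, `pseudo_r2`, and `cat_sketch` are hypothetical.

```python
import numpy as np

def bray_curtis(X):
    """Pairwise Bray-Curtis dissimilarity between rows of a non-negative table."""
    X = np.asarray(X, dtype=float)
    num = np.abs(X[:, None, :] - X[None, :, :]).sum(axis=2)
    den = (X[:, None, :] + X[None, :, :]).sum(axis=2)
    # Pairs of all-zero samples get distance 0 instead of dividing by zero.
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)

def pseudo_r2(D, y):
    """PERMANOVA-style pseudo-R^2 of outcome y for distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    G = -0.5 * J @ (D ** 2) @ J                # Gower-centered matrix
    Z = np.column_stack([np.ones(n), y])       # intercept plus outcome
    H = Z @ np.linalg.pinv(Z.T @ Z) @ Z.T      # hat (projection) matrix
    return np.trace(H @ G @ H) / np.trace(G)

def cat_sketch(X, y, j, n_perm=199, seed=0):
    """Drop in pseudo-R^2 when feature j is left out, with a permutation p-value.

    Assumption: permuting y is used here only for illustration; the paper's
    own inference procedure may differ.
    """
    rng = np.random.default_rng(seed)
    D_full = bray_curtis(X)
    D_drop = bray_curtis(np.delete(X, j, axis=1))  # leave feature j out

    def delta(y_vec):
        return pseudo_r2(D_full, y_vec) - pseudo_r2(D_drop, y_vec)

    observed = delta(y)
    null = np.array([delta(rng.permutation(y)) for _ in range(n_perm)])
    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return observed, p_value
```

Calling `cat_sketch(X, y, j)` on a samples-by-features table `X` returns the change in pseudo-R² attributable to feature `j` and a one-sided permutation p-value. A phylogeny-aware distance such as UniFrac could replace `bray_curtis` to bring in the tree structure, at the cost of needing a tree as additional input.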