Model-Based Microbiome Data Ordination: A Variational Approximation Approach.

Yanyan Zeng, Hongyu Zhao, Tao Wang
DOI: https://doi.org/10.1080/10618600.2021.1882467
2021-01-01
Journal of Computational and Graphical Statistics
Abstract: The coevolution between humans and the bacteria colonizing the human body has profound implications for health and development, with a growing body of evidence linking altered microbiome composition with a wide array of disease states. Yet dimension reduction and visualization analysis of microbiome data are still in their infancy, and many challenges exist. In this article, we introduce a general framework, zero-inflated probabilistic principal component analysis (ZIPPCA), for dimension reduction and data ordination of multivariate abundance data, and propose an efficient variational approximation method for estimation, inference, and prediction. Extensive simulations show that the proposed method outperforms algorithm-based methods and compares favorably with existing model-based methods. We further apply our method to a gut microbiome dataset for visualization analysis of community composition across age and geography. The method is implemented in R and available at https://github.com/YanyZeng/ZIPPCA.