Two-sample Tests of High-Dimensional Means for Compositional Data

Yuanpei Cao, Wei Lin, Hongzhe Li
DOI: https://doi.org/10.1093/biomet/asx060
IF: 3.0279
2017-01-01
Biometrika
Abstract: Compositional data are ubiquitous in many scientific endeavours. Motivated by microbiome and metagenomic research, we consider a two-sample testing problem for high-dimensional compositional data and formulate a testable hypothesis of compositional equivalence for the means of two latent log basis vectors. We propose a test through the centred log-ratio transformation of the compositions. The asymptotic null distribution of the test statistic is derived and its power against sparse alternatives is investigated. A modified test for paired samples is also considered. Simulations show that the proposed tests can be significantly more powerful than tests that are applied to the raw and log-transformed compositions. The usefulness of our tests is illustrated by applications to gut microbiome composition in obesity and Crohn's disease.
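The centred log-ratio transformation mentioned in the abstract can be sketched in a few lines. The abstract does not state the form of the test statistic, so the second function below is only an illustrative assumption: a maximum of squared, variance-standardized differences between the two groups' mean transformed components. The function names `clr` and `max_standardized_difference` are hypothetical, and no null distribution or p-value is computed here.

```python
import numpy as np


def clr(compositions):
    """Centred log-ratio transform applied to each row.

    Each row is a composition with strictly positive entries; the result is
    the log of each entry minus the mean log of its row.
    """
    x = np.asarray(compositions, dtype=float)
    if np.any(x <= 0):
        raise ValueError("CLR requires strictly positive entries")
    log_x = np.log(x)
    return log_x - log_x.mean(axis=1, keepdims=True)


def max_standardized_difference(x, y):
    """Illustrative max-type statistic on CLR-transformed samples.

    Assumption for illustration only: the abstract does not give the
    statistic, so this is not presented as the paper's exact definition.
    """
    cx, cy = clr(x), clr(y)
    n1, n2 = cx.shape[0], cy.shape[0]
    diff = cx.mean(axis=0) - cy.mean(axis=0)
    # Pooled per-component variance of the transformed data.
    ss_x = ((cx - cx.mean(axis=0)) ** 2).sum(axis=0)
    ss_y = ((cy - cy.mean(axis=0)) ** 2).sum(axis=0)
    pooled_var = (ss_x + ss_y) / (n1 + n2)
    return (n1 * n2 / (n1 + n2)) * np.max(diff ** 2 / pooled_var)
```

Because each row is centred by its own mean log, the transform does not change when a composition is rescaled, which is why it can be applied to proportions or to unnormalized positive counts alike. Zero entries would need to be handled (for example with a pseudo-count) before calling `clr`; that choice is left out of this sketch.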