Graph-Aligned Random Partition Model (GARP)

Giovanni Rebaudo, Peter Müller
DOI: https://doi.org/10.1080/01621459.2024.2353943
IF: 4.369
2024-06-01
Journal of the American Statistical Association
Abstract: Bayesian nonparametric mixtures and random partition models are powerful tools for probabilistic clustering. However, standard independent mixture models can be restrictive in some applications, such as inference on cell lineage, because the clusters are biologically related. The increasing availability of large genomic data requires new statistical tools to perform model-based clustering and infer the relationships between homogeneous subgroups of units. Motivated by single-cell RNA data, we develop a novel dependent mixture model to jointly perform cluster analysis and align the clusters on a graph. Our flexible graph-aligned random partition model (GARP) exploits Gibbs-type priors as building blocks, allowing us to derive analytical results for the probability mass function (pmf) of the graph-aligned random partition. From the pmf we derive a generalization of the Chinese restaurant process and a related efficient and neat MCMC algorithm to implement Bayesian inference. We illustrate posterior inference under the GARP using single-cell RNA-seq data from mouse stem cells. We further investigate, by means of simulation studies, the performance of the model in recovering the underlying clustering structure as well as the underlying graph. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.
statistics & probability