Ordered and size-biased frequencies in GEM and Gibbs models for species sampling

Jim Pitman, Yuri Yakubovich
DOI: https://doi.org/10.48550/arXiv.1704.04732
2017-08-26
Abstract: We describe the distribution of frequencies ordered by sample values in a random sample of size $n$ from the two-parameter GEM$(\alpha,\theta)$ random discrete distribution on the positive integers. These frequencies are a $(\text{size}-\alpha)$-biased random permutation of the sample frequencies in either ranked order, or in the order of appearance of values in the sampling process. This generalizes a well-known identity in distribution due to Donnelly and Tavaré (1986) for $\alpha = 0$ to the case $0 \le \alpha < 1$. This description extends to sampling from Gibbs$(\alpha)$ frequencies obtained by suitable conditioning of the GEM$(\alpha,\theta)$ model, and yields a value-ordered version of the Chinese Restaurant construction of GEM$(\alpha,\theta)$ and Gibbs$(\alpha)$ frequencies in the more usual size-biased order of their appearance. The proofs are based on a general construction of a finite sample $(X_1,\dots,X_n)$ from any random frequencies in size-biased order from the associated exchangeable random partition $\Pi_\infty$ of $\mathbb{N}$ which they generate.
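To make the three orderings in the abstract concrete, the following is a minimal sketch, not the authors' construction. It assumes the standard stick-breaking representation of GEM$(\alpha,\theta)$, in which $P_j = W_j \prod_{i<j}(1-W_i)$ with independent $W_j \sim \mathrm{Beta}(1-\alpha,\ \theta + j\alpha)$, draws $n$ values from one realization of these frequencies, and lists the sample frequencies in value order, order of appearance, and ranked order. All function names and the numerical cutoff `1e-12` are illustrative assumptions; the sketch does not verify the distributional identity stated in the abstract.

```python
import random
from collections import Counter


def sample_gem_values(n, alpha, theta, rng):
    """Draw n i.i.d. values from one realization of GEM(alpha, theta).

    Sticks P_1, P_2, ... are generated lazily by stick-breaking:
    W_j ~ Beta(1 - alpha, theta + j * alpha), P_j = W_j * prod_{i<j}(1 - W_i).
    """
    if not (0.0 <= alpha < 1.0 and theta > -alpha):
        raise ValueError("need 0 <= alpha < 1 and theta > -alpha")
    sticks = []      # P_1, P_2, ... generated so far
    remaining = 1.0  # mass not yet assigned to any stick
    values = []
    for _ in range(n):
        u = rng.random()
        cum = 0.0
        j = 0
        while True:
            if j == len(sticks):
                w = rng.betavariate(1.0 - alpha, theta + (j + 1) * alpha)
                sticks.append(remaining * w)
                remaining *= 1.0 - w
            cum += sticks[j]
            # The 1e-12 cutoff is an assumed guard against floating-point stalls.
            if u < cum or remaining < 1e-12:
                values.append(j + 1)  # values live on the positive integers
                break
            j += 1
    return values


def value_ordered_frequencies(values):
    """Frequencies listed by increasing sample value."""
    counts = Counter(values)
    return [counts[v] for v in sorted(counts)]


def appearance_ordered_frequencies(values):
    """Frequencies listed in order of first appearance in the sample."""
    counts = Counter(values)  # keeps first-insertion order
    return list(counts.values())


def ranked_frequencies(values):
    """Frequencies listed in decreasing order."""
    return sorted(Counter(values).values(), reverse=True)


rng = random.Random(2017)
sample = sample_gem_values(20, alpha=0.5, theta=1.0, rng=rng)
print("sample:             ", sample)
print("value order:        ", value_ordered_frequencies(sample))
print("order of appearance:", appearance_ordered_frequencies(sample))
print("ranked order:       ", ranked_frequencies(sample))
```

The three lists are always rearrangements of one another; the abstract's claim concerns the joint law of the rearrangement, which one could explore empirically by repeating the simulation many times.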
Probability