The Critical Beta-splitting Random Tree II: Overview and Open Problems
David J. Aldous, Svante Janson
2024-07-06
Abstract: In the critical beta-splitting model of a random $n$-leaf rooted tree, clades are recursively (from the root) split into sub-clades, and a clade of $m$ leaves is split into sub-clades containing $i$ and $m-i$ leaves with probabilities $\propto 1/(i(m-i))$. Study of structure theory and explicit quantitative aspects of the model is an active research topic, and this article provides an extensive overview of what is currently known. For many results there are different proofs, probabilistic or analytic, so the model provides a testbed for a ``compare and contrast'' discussion of techniques. We give some proofs that are not currently available elsewhere, and also give heuristics for some proven results and for some open problems. Our discussion is centered around three ``foundational'' results.
(i) There is a canonical embedding into a continuous-time model, that is, a random tree $\mbox{CTCS}(n)$ on $n$ leaves with real-valued edge lengths, and this model turns out to be more convenient to study. The family $(\mbox{CTCS}(n), n \ge 2)$ is consistent under a ``delete random leaf and prune'' operation. That leads to an explicit inductive construction of $(\mbox{CTCS}(n), n \ge 2)$ as $n$ increases, and then to a limit structure $\mbox{CTCS}(\infty)$ formalized via exchangeable partitions, in some ways analogous to the Brownian continuum random tree.
(ii) There is a CLT for leaf heights, and the analytic proof can be extended to provide surprisingly precise analysis of other height-related aspects.
(iii) There is an explicit description of the limit {\em fringe distribution} relative to a random leaf, whose graphical representation is essentially the format of the cladogram representation of biological phylogenies.
Many open problems remain.
Probability, Combinatorics, Populations and Evolution
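
As an illustration of the splitting rule stated in the abstract, the following is a minimal Python sketch of the discrete model only; it does not attempt the continuous-time embedding $\mbox{CTCS}(n)$. Since $1/(i(m-i)) = \frac{1}{m}\left(\frac{1}{i} + \frac{1}{m-i}\right)$, the weights sum to $2h_{m-1}/m$, where $h_{m-1}$ is the harmonic number, so the normalized split probability is $\frac{m}{2h_{m-1}} \cdot \frac{1}{i(m-i)}$. The function names (`split_probabilities`, `sample_split`, `sample_tree`, `count_leaves`) and the nested-tuple tree representation are assumptions made for this sketch, not notation from the article.

```python
import random
from fractions import Fraction


def split_probabilities(m):
    """Exact probabilities that a clade of m leaves splits into (i, m - i),
    for i = 1, ..., m - 1, with weights proportional to 1 / (i * (m - i))."""
    if m < 2:
        raise ValueError("a clade needs at least 2 leaves to split")
    weights = [Fraction(1, i * (m - i)) for i in range(1, m)]
    total = sum(weights)  # equals 2 * h_{m-1} / m
    return [w / total for w in weights]


def sample_split(m, rng):
    """Sample the size i of the first sub-clade of a clade with m leaves."""
    weights = [1.0 / (i * (m - i)) for i in range(1, m)]
    return rng.choices(range(1, m), weights=weights)[0]


def sample_tree(m, rng):
    """Recursively split a clade of m leaves.

    Illustrative representation: a leaf is None, an internal node is a
    pair (left_subtree, right_subtree). The recursion depth can reach
    m - 1, so this sketch is meant for moderate m only.
    """
    if m == 1:
        return None
    i = sample_split(m, rng)
    return (sample_tree(i, rng), sample_tree(m - i, rng))


def count_leaves(tree):
    """Count the leaves of a tree built by sample_tree."""
    if tree is None:
        return 1
    return count_leaves(tree[0]) + count_leaves(tree[1])


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs agree
    print(split_probabilities(4))
    print(count_leaves(sample_tree(50, rng)))
```

For $m = 4$ the weights are $1/3, 1/4, 1/3$, which normalize to $4/11, 3/11, 4/11$; this agrees with the closed form above since $h_3 = 11/6$.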