The distributions under two species-tree models of the total number of ancestral configurations for matching gene trees and species trees

Filippo Disanto, Michael Fuchs, Chun-Yen Huang, Ariel R. Paningbatan, Noah A. Rosenberg
DOI: https://doi.org/10.1016/j.aam.2023.102594
IF: 1.271
2024-01-01
Advances in Applied Mathematics
Abstract: Given a gene-tree labeled topology $G$ and a species tree $S$, the ancestral configurations at an internal node $k$ of $S$ represent the combinatorially different sets of gene lineages that can be present at $k$ when all possible realizations of $G$ in $S$ are considered. Ancestral configurations have been introduced as a data structure for evaluating the conditional probability of a gene-tree labeled topology given a species tree, and their enumeration assists in describing the complexity of this computation. In the case that the gene-tree labeled topology $G = t$ matches that of the species tree $S$, by techniques of analytic combinatorics, we study distributional properties of the total number of ancestral configurations measured across the different nodes of a random labeled topology $t$ selected under the uniform and the Yule probability models. Under both of these probabilistic scenarios, we show that the total number $T_n$ of ancestral configurations of a random labeled topology of $n$ taxa asymptotically follows a lognormal distribution. Over uniformly distributed labeled topologies, the asymptotic growth of the mean and the variance of $T_n$ are found to satisfy $E_U[T_n] \sim 2.449 \cdot 1.333^n$ and $V_U[T_n] \sim 5.050 \cdot 1.822^n$, respectively. Under the Yule model, which assigns higher probabilities to more balanced labeled topologies, we obtain the mean $E_Y[T_n] \sim 1.425^n$ and the variance $V_Y[T_n] \sim 2.045^n$.
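As an illustration of the quantity $T_n$, the following minimal Python sketch counts ancestral configurations on a small tree in the matching case. It is not taken from the paper. It assumes that a leaf contributes 0 configurations and that an internal node with children $l$ and $r$ has $c(k) = (c(l) + 1)(c(r) + 1)$ configurations, since each child branch delivers either the child's own single lineage or one of the child's configurations. The total is then the sum of $c(k)$ over internal nodes. The nested-tuple tree encoding and the function name `count_configurations` are hypothetical choices made for this example.

```python
def count_configurations(tree):
    """Return (c_root, total) for a rooted binary tree in the matching case.

    A tree is either a leaf label (any non-tuple value) or a pair
    (left, right) of subtrees. This encoding is an assumption of the sketch.

    c_root: number of ancestral configurations at the root of `tree`
    total:  sum of the configuration counts over all internal nodes
    """
    if not isinstance(tree, tuple):
        # A leaf has no internal node, so it contributes no configurations.
        return 0, 0
    left, right = tree
    c_left, total_left = count_configurations(left)
    c_right, total_right = count_configurations(right)
    # Each side supplies either its single root lineage (the "+ 1")
    # or one of its own configurations; the two sides combine freely.
    c_root = (c_left + 1) * (c_right + 1)
    return c_root, c_root + total_left + total_right


if __name__ == "__main__":
    caterpillar = ((("a", "b"), "c"), "d")
    balanced = (("a", "b"), ("c", "d"))
    print(count_configurations(caterpillar))
    print(count_configurations(balanced))
```

For the 3-taxon tree `(("a", "b"), "c")`, the root has two configurations under this recursion: the three leaf lineages together, or the lineage ancestral to `a` and `b` together with `c`.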
mathematics, applied