Accelerated Bayesian inference of population size history from recombining sequence data

Jonathan Terhorst
DOI: https://doi.org/10.1101/2024.03.25.586640
2024-03-27
Abstract: I present PHLASH, a new Bayesian method for inferring population history from whole genome sequence data. PHLASH is Population History Learning by Averaging Sampled Histories: it works by drawing random, low-dimensional projections of the coalescent intensity function from the posterior distribution of a PSMC-like model, and averaging them together to form an accurate and adaptive size history estimator. On simulated data, PHLASH tends to be faster and have lower error than several competing methods including SMC++, MSMC2, and FITCOAL. Moreover, it provides a full posterior distribution over population size history, leading to automatic uncertainty quantification of the point estimates, as well as to new Bayesian testing procedures for detecting population structure and ancient bottlenecks. On the technical side, the key advance is a novel algorithm for computing the score function (gradient of the log-likelihood) of a coalescent hidden Markov model: when there are M hidden states, the algorithm requires O(M^2) time and O(1) memory per decoded position, the same cost as evaluating the log-likelihood itself using the naïve forward algorithm. This algorithm is combined with a hand-tuned implementation that fully leverages the power of modern GPU hardware, and the entire method has been released as an easy-to-use Python software package.
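For readers unfamiliar with the cost baseline mentioned in the abstract, the following is a minimal sketch of the standard scaled forward algorithm for a generic discrete hidden Markov model. It is not the paper's score-function algorithm or its GPU implementation; it only illustrates the baseline cost of evaluating the log-likelihood. Each position costs one matrix-vector product over the hidden states, and only the current forward vector is kept, so no per-position storage is needed. The function name `forward_loglik` and its arguments are hypothetical, chosen for illustration.

```python
import math


def forward_loglik(init, trans, emit, obs):
    """Log-likelihood of `obs` under a discrete HMM (scaled forward algorithm).

    Hypothetical illustration, not code from the paper.
    init[i]     : P(first hidden state = i)
    trans[i][j] : P(next state = j | current state = i)
    emit[i][y]  : P(observation = y | hidden state = i)
    """
    M = len(init)
    # Only the current forward vector is stored, so working memory does not
    # grow with the length of the observation sequence.
    alpha = [init[i] * emit[i][obs[0]] for i in range(M)]
    scale = sum(alpha)
    alpha = [a / scale for a in alpha]
    loglik = math.log(scale)
    for y in obs[1:]:
        # One matrix-vector product: M * M multiply-adds per position.
        alpha = [
            emit[j][y] * sum(alpha[i] * trans[i][j] for i in range(M))
            for j in range(M)
        ]
        # Rescale to avoid underflow on long sequences; the log of each
        # scaling constant accumulates into the log-likelihood.
        scale = sum(alpha)
        alpha = [a / scale for a in alpha]
        loglik += math.log(scale)
    return loglik
```

A call such as `forward_loglik(init, trans, emit, obs)` returns the log-probability of the whole observation sequence. The abstract's claim is that the gradient of this quantity can be obtained at the same per-position cost.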
Evolutionary Biology