The effect of gene flow on coalescent-based species-tree inference

Colby Long, Laura Kubatko
DOI: https://doi.org/10.48550/arXiv.1710.03806
2017-10-11
Abstract: Most current methods for inferring species-level phylogenies under the coalescent model assume that no gene flow occurs following speciation. While some studies have examined the impact of gene flow on estimation accuracy for certain methods, limited analytical work has been undertaken to directly assess the potential effect of gene flow across a species phylogeny. In this paper, we consider a three-taxon isolation-with-migration model that allows gene flow between sister taxa for a brief period following speciation, as well as variation in the effective population sizes across the tree. We derive the probabilities of each of the three gene tree topologies under this model, and show that for certain choices of the gene flow and effective population size parameters, anomalous gene trees (i.e., gene trees that are discordant with the species tree but that have higher probability than the gene tree concordant with the species tree) exist. We characterize the region of parameter space producing anomalous trees, and show that the probability of the gene tree that is concordant with the species tree can be arbitrarily small. We then show that the SVDQuartets method is theoretically valid under the model of gene flow between sister taxa. We study its performance on simulated data and compare it to two other commonly used methods for species tree inference, ASTRAL and MP-EST. The simulations show that ASTRAL and MP-EST can be statistically inconsistent when gene flow is present, while SVDQuartets performs well, though large sample sizes may be required for certain parameter choices.
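For context on the anomalous gene trees described in the abstract, the following minimal sketch computes the three gene tree topology probabilities in the standard baseline case with no gene flow. It is not the isolation-with-migration model derived in the paper. It assumes a species tree ((A,B),C), one sampled lineage per species, and an internal branch of length `t` in coalescent units. The function name and topology labels are hypothetical and chosen only for illustration.

```python
import math


def three_taxon_topology_probs(t):
    """Gene tree topology probabilities for species tree ((A,B),C).

    Baseline multispecies coalescent with NO gene flow (an assumption of
    this sketch, not the model studied in the paper).
    t: length of the internal branch in coalescent units (must be >= 0).
    """
    if t < 0:
        raise ValueError("branch length t must be non-negative")
    # Probability that lineages A and B fail to coalesce on the internal branch.
    p_no_coalescence = math.exp(-t)
    # If they fail, all three lineages enter the root population and each
    # of the three topologies is equally likely.
    discordant = p_no_coalescence / 3.0
    concordant = 1.0 - 2.0 * discordant
    return {
        "((A,B),C)": concordant,   # matches the species tree
        "((A,C),B)": discordant,
        "((B,C),A)": discordant,
    }


probs = three_taxon_topology_probs(1.0)
```

In this baseline the concordant topology is never less probable than either discordant one, since 1 - (2/3)e^{-t} >= (1/3)e^{-t} for every t >= 0. This is why anomalous gene trees on three taxa require additional ingredients such as the gene flow considered in the abstract.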
Populations and Evolution