Leveraging shared ancestral variation to detect local introgression

Lesly Lopez Fang, David Peede, Diego Ortega-Del Vecchyo, Emily Jane McTavish, Emilia Huerta-Sanchez
DOI: https://doi.org/10.1371/journal.pgen.1010155
IF: 4.5
2024-01-09
PLoS Genetics
Abstract: Introgression is a common evolutionary phenomenon that results in shared genetic material across non-sister taxa. Existing statistical methods such as Patterson's D statistic can detect introgression by measuring an excess of shared derived alleles between populations. The D statistic is effective at detecting genome-wide patterns of introgression but can give spurious inferences of introgression when applied to local regions. We propose a new statistic, D+, that leverages both shared ancestral and derived alleles to infer local introgressed regions. Incorporating both shared derived and ancestral alleles increases the number of informative sites per region, improving our ability to identify local introgression. We use a coalescent framework to derive the expected value of this statistic as a function of different demographic parameters under an instantaneous admixture model and use coalescent simulations to compute the power and precision of D+. While the power of D and D+ is comparable, D+ has better precision than D. We apply D+ to empirical data from the 1000 Genomes Project and Heliconius butterflies to infer local targets of introgression in humans and in butterflies.
Characterizing how pervasive introgression is across the tree of life is an outstanding question in evolutionary biology. To address this question, we need to detect and quantify introgression to investigate how natural selection has acted on introgressed genetic variation. The D statistic is a widely used method to detect introgression at the genome level, but this method cannot accurately detect introgression locally in the genome. To improve its performance at the local level, we incorporate ancestral variation shared between the donor and recipient populations. We show, theoretically and with simulations, that ancestral alleles reintroduced into the recipient population also contain information for detecting introgression. Using both shared derived and ancestral variation, we define a new statistic, D+, that can be used to detect the location of introgressed regions in a genome with a history of introgression.
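The abstract does not state the formulas, so the following is only a minimal sketch of how the two statistics might be computed from site-pattern counts in a single genomic window. It assumes the usual four-taxon setup (P1, P2, P3, outgroup) with one sampled haplotype per population, and it assumes that D+ adds the BAAA and ABAA patterns (sites where P3 shares the ancestral allele with P2 or P1, respectively) to the ABBA and BABA counts used by D. The function names and the 0/1 allele encoding are hypothetical choices made for illustration.

```python
from collections import Counter
from math import nan

# Site patterns are ordered (P1, P2, P3, outgroup).
# Assumed encoding: 0 = ancestral allele (A), 1 = derived allele (B).
PATTERNS = {
    (0, 1, 1, 0): "ABBA",  # P2 and P3 share the derived allele
    (1, 0, 1, 0): "BABA",  # P1 and P3 share the derived allele
    (1, 0, 0, 0): "BAAA",  # P2 and P3 share the ancestral allele
    (0, 1, 0, 0): "ABAA",  # P1 and P3 share the ancestral allele
}


def count_patterns(sites):
    """Count the four informative site patterns in a window of sites."""
    counts = Counter({name: 0 for name in PATTERNS.values()})
    for site in sites:
        name = PATTERNS.get(tuple(site))
        if name is not None:  # ignore uninformative sites
            counts[name] += 1
    return counts


def d_statistic(counts):
    """Patterson's D: uses shared derived alleles only."""
    denom = counts["ABBA"] + counts["BABA"]
    if denom == 0:
        return nan  # no informative sites in this window
    return (counts["ABBA"] - counts["BABA"]) / denom


def d_plus(counts):
    """Assumed form of D+: shared derived plus shared ancestral alleles."""
    num = (counts["ABBA"] - counts["BABA"]) + (counts["BAAA"] - counts["ABAA"])
    denom = counts["ABBA"] + counts["BABA"] + counts["BAAA"] + counts["ABAA"]
    if denom == 0:
        return nan
    return num / denom


if __name__ == "__main__":
    # Toy window: each tuple is one biallelic site (P1, P2, P3, outgroup).
    window = [
        (0, 1, 1, 0), (0, 1, 1, 0), (1, 0, 1, 0),
        (1, 0, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0),
        (1, 1, 1, 0),  # fixed derived in the ingroup: uninformative
    ]
    counts = count_patterns(window)
    print(dict(counts), d_statistic(counts), d_plus(counts))
```

In this toy window D is computed from three informative sites while D+ draws on six, which illustrates the abstract's point that including shared ancestral alleles increases the number of informative sites per region. With population samples, the counts would typically be replaced by allele-frequency-weighted pattern probabilities.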
genetics & heredity