Haplotype Allele Frequency (HAF) Score: Predicting Carriers of Ongoing Selective Sweeps Without Knowledge of the Adaptive Allele

Roy Ronen, Glenn Tesler, Ali Akbari, Shay Zakov, Noah A. Rosenberg, Vineet Bafna
DOI: https://doi.org/10.1007/978-3-319-16706-0_28
2015-01-01
Abstract: Methods for detecting the genomic signatures of natural selection are heavily studied, and have been successful in identifying many selective sweeps. For the vast majority of these sweeps the adaptive allele remains unknown, making it difficult to distinguish carriers of the sweep from non-carriers. Because carriers of ongoing selective sweeps are likely to contain a future most recent common ancestor, identifying them may prove useful in predicting the evolutionary trajectory, for example in contexts involving drug-resistant pathogen strains or cancer subclones. The main contribution of this paper is the development and analysis of a new statistic, the Haplotype Allele Frequency (HAF) score, assigned to individual haplotypes in a sample. The HAF score naturally captures many of the properties shared by haplotypes carrying an adaptive allele. We provide a theoretical model for the behavior of the HAF score under different evolutionary scenarios, and validate the interpretation of the statistic with simulated data. We develop an algorithm (PreCIOSS: Predicting Carriers of Ongoing Selective Sweeps) to identify carriers of the adaptive allele in selective sweeps, and we demonstrate its power on simulations of both hard and soft selective sweeps, as well as on data from well-known sweeps in human populations.
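The abstract does not state the formula for the HAF score, so the following is only a minimal sketch of how a per-haplotype score of this kind could be computed. It assumes the simplest form of the definition (often called 1-HAF): a haplotype's score is the sum, over the sites where it carries the derived allele, of the number of haplotypes in the sample carrying that derived allele. The function name `haf_scores` and the toy matrix are hypothetical and used only for illustration. This is not the PreCIOSS algorithm.

```python
from typing import List


def haf_scores(haplotypes: List[List[int]]) -> List[int]:
    """Compute an assumed 1-HAF score for each haplotype.

    haplotypes: binary matrix, rows = haplotypes, columns = sites,
                1 = derived allele, 0 = ancestral allele.
    """
    if not haplotypes:
        return []
    n_sites = len(haplotypes[0])
    # Number of haplotypes in the sample carrying the derived allele at each site.
    counts = [sum(h[j] for h in haplotypes) for j in range(n_sites)]
    # A haplotype's score sums the counts of the derived alleles it carries.
    return [
        sum(c for allele, c in zip(h, counts) if allele == 1)
        for h in haplotypes
    ]


# Toy sample: three haplotypes share a common derived allele (site 0),
# one haplotype carries only a singleton (site 3).
sample = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
]
print(haf_scores(sample))  # [5, 5, 4, 1]
```

In this toy sample, the haplotypes carrying high-frequency derived alleles receive the highest scores, which is the intuition behind using the score to separate likely sweep carriers from non-carriers.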