FSTest: an efficient tool for cross-population fixation index estimation on variant call format files

Vahedi, Seyed Milad
DOI: https://doi.org/10.1007/s12041-023-01459-1
2024-01-05
Journal of Genetics
Abstract: Fixation index (F_ST) statistics provide critical insights into the evolutionary processes that shape the structure of genetic variation within and among populations. F_ST statistics have been widely applied in population and evolutionary genetics to identify genomic regions targeted by selection. The FSTest 1.3 software was developed to estimate four F_ST statistics (Hudson, Weir and Cockerham, Nei, and Wright) from high-throughput genotyping or sequencing data. Here, we introduce FSTest 1.3 and compare its performance with two widely used software packages, VCFtools 0.1.16 and PLINK 2.0. Chromosome 1 variant data from 1000 Genomes Phase III for South Asian (n = 211) and African (n = 274) populations were used as an example case. The different F_ST estimates were calculated for each single-nucleotide polymorphism (SNP) in a pairwise comparison of the South Asian and African populations, and the results of FSTest 1.3 were confirmed against VCFtools 0.1.16 and PLINK 2.0. Two sliding window approaches, one based on a fixed number of SNPs and the other on a fixed number of base pairs (bp), were run using FSTest 1.3 and VCFtools 0.1.16. Our results showed that regions with low-coverage genotypic data could lead to an overestimation of F_ST in sliding window analysis based on a fixed number of bp. FSTest 1.3 can mitigate this problem by averaging F_ST over consecutive SNPs along the chromosome. FSTest 1.3 allows direct analysis of VCF files with a small amount of code and can calculate F_ST estimates for more than a million SNPs in a few minutes on a desktop computer. FSTest 1.3 is freely available at https://github.com/similab/FSTest.
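To make the per-SNP and fixed-SNP-count window calculations concrete, the following is a minimal Python sketch. It is not FSTest's code: it uses the commonly cited form of Hudson's estimator (as given by Bhatia et al. 2013), and the function names, the allele frequencies, and the choice of a plain mean that skips undefined values are assumptions made for illustration. Only the sample sizes (211 and 274 diploid individuals, so 422 and 548 chromosomes) come from the abstract.

```python
import math


def hudson_fst(p1, n1, p2, n2):
    """Hudson's F_ST for one biallelic SNP.

    p1, p2: allele frequencies in populations 1 and 2.
    n1, n2: number of sampled chromosomes (2 x individuals for diploids).
    Returns NaN when the SNP is monomorphic for the same allele in both populations.
    """
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num / den if den > 0 else float("nan")


def windowed_mean_fst(fst_values, window, step):
    """Mean F_ST over windows holding a fixed number of consecutive SNPs.

    Every window contains the same number of SNPs, so sparsely covered
    regions are not summarised from only one or two variants.
    """
    means = []
    for start in range(0, len(fst_values) - window + 1, step):
        chunk = [f for f in fst_values[start:start + window] if not math.isnan(f)]
        means.append(sum(chunk) / len(chunk) if chunk else float("nan"))
    return means


# Illustrative (made-up) allele frequencies for four SNPs.
N_SAS, N_AFR = 2 * 211, 2 * 274
freqs = [(1.0, 0.0), (1.0, 0.0), (1.0, 0.0), (0.5, 0.5)]
per_snp = [hudson_fst(p1, N_SAS, p2, N_AFR) for p1, p2 in freqs]
windows = windowed_mean_fst(per_snp, window=3, step=1)
```

A window defined by a fixed number of bp can contain very few SNPs where coverage is low, so a single high-F_ST variant dominates its average; fixing the SNP count per window avoids that. Averaging the numerators and denominators separately (a ratio of averages) is a common alternative to the plain mean used here.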
genetics & heredity