Optimizing Strategy of Genotype Imputation from Low-Coverage Whole-Genome Sequencing in Scallops

Yujue Wang, Ruixing Yao, Liang Zhao, Qianqian Zhang, Moli Li, Xiangfu Kong, Pingping Liu, Shanhuan Huang, Chen Hu, Zhenmin Bao, Xiaoli Hu
DOI: https://doi.org/10.1016/j.aquaculture.2024.741492
IF: 5.135
2024-01-01
Aquaculture
Abstract: Advancements in sequencing technologies have facilitated the use of low-coverage whole-genome sequencing (lcWGS) for detecting millions of single nucleotide polymorphisms (SNPs) in large populations at low cost. The effectiveness of this approach relies largely on genotype imputation following lcWGS. While several imputation methods have been applied in humans and other terrestrial animals, their performance has yet to be fully assessed in aquaculture organisms, especially mollusc species, which contribute 70% of mariculture production in China and are characterized by high genetic polymorphism. This study aimed to optimize reference panel construction and evaluate imputation strategies for SNP detection in two scallop species, Chlamys farreri and Patinopecten yessoensis, both of which are important bivalves cultured on a large scale. High-coverage whole-genome sequencing (hcWGS) data were used to construct haplotype reference panels, and three imputation methods were applied: Beagle5.4, GLIMPSE2, and QUILT. The results indicated that QUILT outperformed the other methods in scallop imputation, achieving imputation quality scores (IQS) exceeding 0.930 in C. farreri and 0.946 in P. yessoensis. QUILT also yielded the highest number of imputed loci in both species. The panel construction strategy was further optimized by adjusting the genotype quality (GQ) and minor allele frequency (MAF) filters; GQ > 20 and MAF > 0.01 were found to be optimal. The effects of sequencing depth and sample size on imputation quality were also assessed: imputation quality improved with increasing sequencing depth and plateaued at 0.5×, whereas the impact of sample size varied with the imputation method. Moreover, a negative correlation was found between imputation quality and the ratio of nonsynonymous to synonymous substitution rates (Ka/Ks) across chromosomal segments, highlighting the influence of selection pressure in the genome on imputation efficacy. This study provides an optimized, comprehensive framework for SNP genotyping using lcWGS in scallops, with implications for aquaculture breeding and genetic research.
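The abstract reports IQS values without defining the metric. As a rough illustration only, the sketch below computes IQS in its commonly used form, a chance-corrected concordance analogous to Cohen's kappa: IQS = (Po - Pc) / (1 - Pc), where Po is the observed agreement between true and imputed genotypes and Pc is the agreement expected by chance. It is an assumption that the paper uses exactly this formulation; the function name `imputation_quality_score` and the use of hard-called genotypes coded 0/1/2 are hypothetical choices made for this example.

```python
import math
from collections import Counter


def imputation_quality_score(true_genotypes, imputed_genotypes):
    """Chance-corrected concordance (kappa form) for one SNP.

    Assumes hard-called genotypes (e.g. 0/1/2 alternate-allele counts);
    this is an illustrative sketch, not the paper's pipeline.
    """
    if len(true_genotypes) != len(imputed_genotypes) or not true_genotypes:
        raise ValueError("genotype lists must be non-empty and of equal length")

    n = len(true_genotypes)

    # Po: fraction of samples whose imputed genotype matches the true one.
    observed = sum(t == i for t, i in zip(true_genotypes, imputed_genotypes)) / n

    # Pc: agreement expected by chance from the marginal genotype counts.
    true_counts = Counter(true_genotypes)
    imputed_counts = Counter(imputed_genotypes)
    chance = sum(c * imputed_counts[g] for g, c in true_counts.items()) / n**2

    if chance == 1.0:
        # Undefined when both call sets are the same single genotype.
        return math.nan
    return (observed - chance) / (1.0 - chance)
```

Unlike raw concordance, this score does not reward an imputation method for always guessing the major genotype at a low-MAF site, which is one reason it is often preferred when rare variants are retained.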