Incorporating genomic annotation into single-step genomic prediction with imputed whole-genome sequence data
Teng Jin-yan, Ye Shoo-pan, Gao Ning, Chen Zi-tao, Diao Shu-qi, Li Xiu-jin, Yuan Xiao-long, Zhang Hao, Li Jia-qi, Zhang Xi-quan, Zhang Zhe
DOI: https://doi.org/10.1016/S2095-3119(21)63813-3
IF: 4.8
2022-01-01
Journal of Integrative Agriculture
Abstract: Single-step genomic best linear unbiased prediction (ssGBLUP) is now intensively investigated and widely used in livestock breeding because it combines information from both genotyped and ungenotyped individuals in a single model. With the increasing accessibility of whole-genome sequence (WGS) data at the population level, more attention is being paid to the use of WGS data in ssGBLUP. The predictive ability of ssGBLUP using WGS data might be improved by incorporating biological knowledge from public databases. Thus, we extended ssGBLUP by incorporating genomic annotation information into the model, and evaluated the extended models using a yellow-feathered chicken population as an example. The chicken population consisted of 1 338 birds with 23 traits, of which 895 birds had imputed WGS data comprising 5 127 612 single nucleotide polymorphisms (SNPs). Considering different combinations of annotation information and models, we evaluated the original ssGBLUP, the haplotype-based ssGHBLUP, and four extended ssGBLUP models incorporating genomic annotation. Based on the genomic annotation (GRCg6a) of chickens, 3 155 524 and 94 837 SNPs were mapped to genic and exonic regions, respectively. Extended ssGBLUP using genic/exonic SNPs outperformed the other models in predictive ability for 15 out of 23 traits, with advantages ranging from 2.5 to 6.1% over the original ssGBLUP. In addition, to further enhance the performance of genomic prediction with imputed WGS data, we investigated how the genotyping strategy for the reference population affects ssGBLUP in the chicken population. Of the two strategies compared for selecting individuals to genotype in the reference population, even selection by family (SBF) performed slightly better than random selection in most situations.
Overall, we extended genomic prediction models so that they can comprehensively utilize WGS data and genomic annotation information within the framework of ssGBLUP, and showed that properly handling the genomic annotation information and WGS data increases the predictive ability of ssGBLUP. Moreover, when using WGS data, a genotyping strategy that maximizes the expected genetic relationship between the reference and candidate populations could further improve the predictive ability of ssGBLUP. The results from this study shed light on the comprehensive use of genomic annotation information in WGS-based single-step genomic prediction.
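
The two genotyping strategies compared in the abstract (even selection by family versus random selection) can be illustrated with a minimal Python sketch. The abstract does not describe the exact selection procedure, so the round-robin scheme below is only an assumption about what "even selection by family" could look like; the function names `select_evenly_by_family` and `select_randomly` and the `family_of` mapping are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def select_evenly_by_family(family_of, n_select, seed=0):
    """Pick up to n_select individuals, spread as evenly as possible over families.

    family_of: dict mapping individual ID -> family ID (hypothetical input format).
    Assumption: "even" is implemented as a round-robin over families.
    """
    rng = random.Random(seed)
    members = defaultdict(list)
    for individual, family in family_of.items():
        members[family].append(individual)
    # Shuffle within each family so the choice inside a family is random.
    for family in members:
        rng.shuffle(members[family])
    families = sorted(members)
    selected = []
    # Each pass takes at most one remaining individual from every family.
    while len(selected) < n_select and any(members[f] for f in families):
        for family in families:
            if members[family] and len(selected) < n_select:
                selected.append(members[family].pop())
    return selected


def select_randomly(family_of, n_select, seed=0):
    """Baseline: pick n_select individuals at random, ignoring family structure."""
    rng = random.Random(seed)
    return rng.sample(sorted(family_of), n_select)


if __name__ == "__main__":
    # Toy data: 12 birds in 3 families of 4 (made-up IDs, not from the study).
    toy = {f"bird{i}": f"fam{i % 3}" for i in range(12)}
    print(select_evenly_by_family(toy, 6))
    print(select_randomly(toy, 6))
```

With equally sized families, the round-robin version returns the same number of individuals from each family, whereas the random baseline may over- or under-sample some families by chance.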