Impact of genotype‐calling methodologies on genome‐wide association and genomic prediction in polyploids
Joyce N. Njuguna, Lindsay V. Clark, Alexander E. Lipka, Kossonou G. Anzoua, Larisa Bagmet, Pavel Chebukin, Maria S. Dwiyanti, Elena Dzyubenko, Nicolay Dzyubenko, Bimal Kumar Ghimire, Xiaoli Jin, Douglas A. Johnson, Jens Bonderup Kjeldsen, Hironori Nagano, Ivone Bem Oliveira, Junhua Peng, Karen Koefoed Petersen, Andrey Sabitov, Eun Soo Seong, Toshihiko Yamada, Ji Hye Yoo, Chang Yeon Yu, Hua Zhao, Patricio Munoz, Stephen P. Long, Erik J. Sacks, Ivone de Bem Oliveira
DOI: https://doi.org/10.1002/tpg2.20401
2023-11-01
The Plant Genome
Abstract: Discovery and analysis of genetic variants underlying agriculturally important traits are key to molecular breeding of crops. Reduced representation approaches have provided cost‐efficient genotyping using next‐generation sequencing. However, accurate genotype calling from next‐generation sequencing data is challenging, particularly in polyploid species, because of their genome complexity. Recently developed Bayesian statistical methods, implemented in the software packages polyRAD, EBG, and updog, incorporate error rates and population parameters to accurately estimate allelic dosage at any ploidy. We used empirical and simulated data to evaluate the three Bayesian algorithms and demonstrated their impact on the power of genome‐wide association study (GWAS) analysis and the accuracy of genomic prediction. We further incorporated uncertainty in allelic dosage estimation by testing continuous genotype calls and comparing their performance to that of discrete genotypes in GWAS and genomic prediction. We applied the genotype‐calling methods to data from two autotetraploid species, Miscanthus sacchariflorus and Vaccinium corymbosum, and then performed GWAS and genomic prediction. In the empirical study, the tested Bayesian genotype‐calling algorithms differed in their downstream effects on GWAS and genomic prediction, with some showing advantages over others. In subsequent simulation studies, we observed that, at low read depth, polyRAD was advantageous in both GWAS power and control of false positives. Additionally, we found that continuous genotypes increased the accuracy of genomic prediction by reducing genotyping error, particularly at low sequencing depth. Our results indicate that by using the Bayesian algorithm implemented in polyRAD and continuous genotypes, we can accurately and cost‐efficiently implement GWAS and genomic prediction in polyploid crops.
genetics & heredity,plant sciences
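
To make the abstract's distinction between discrete and continuous genotype calls concrete, the following is a minimal sketch for a single marker in a single autotetraploid individual. It is not the algorithm of polyRAD, EBG, or updog. It assumes a deliberately simple model: a binomial read-count likelihood with a fixed sequencing error rate, and a Hardy-Weinberg (binomial) prior on allelic dosage. All function names, parameter names, and default values are hypothetical and chosen only for illustration.

```python
from math import comb


def dosage_posterior(alt_reads, total_reads, ploidy=4, alt_freq=0.5, error_rate=0.01):
    """Posterior probability of each allelic dosage 0..ploidy.

    Assumed model (illustrative only, not any package's actual method):
    prior is Binomial(ploidy, alt_freq); likelihood is Binomial(total_reads, p_alt),
    where p_alt mixes the true allele fraction with a symmetric error rate.
    """
    unnormalized = []
    for dosage in range(ploidy + 1):
        prior = comb(ploidy, dosage) * alt_freq**dosage * (1 - alt_freq) ** (ploidy - dosage)
        frac = dosage / ploidy
        # Probability that a single read shows the alternative allele.
        p_alt = frac * (1 - error_rate) + (1 - frac) * error_rate
        likelihood = (
            comb(total_reads, alt_reads)
            * p_alt**alt_reads
            * (1 - p_alt) ** (total_reads - alt_reads)
        )
        unnormalized.append(prior * likelihood)
    total = sum(unnormalized)
    return [value / total for value in unnormalized]


def discrete_call(posterior):
    """Discrete genotype: the single most probable dosage (posterior mode)."""
    return max(range(len(posterior)), key=posterior.__getitem__)


def continuous_call(posterior):
    """Continuous genotype: the posterior mean dosage, which retains uncertainty."""
    return sum(dosage * prob for dosage, prob in enumerate(posterior))


if __name__ == "__main__":
    # Low depth (3 reads) versus higher depth (60 reads) at the same allele ratio.
    for alt, total in [(1, 3), (20, 60)]:
        post = dosage_posterior(alt, total)
        print(alt, total, discrete_call(post), round(continuous_call(post), 3))
```

Under these assumptions, the discrete call collapses the posterior to one integer dosage, whereas the continuous call is a weighted average over all dosages. At low read depth the posterior is spread over several dosages, so the two calls can differ noticeably; at high depth the posterior concentrates and they converge.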