Imputed Genotypes Versus Sequenced Genotypes for the Association Analysis of Rare Variants

I. V. Zorkoltseva, T. I. Axenovich, Y. A. Tsepilov
DOI: https://doi.org/10.1134/s1022795424701126
2024-12-07
Russian Journal of Genetics
Abstract: Exome-sequenced genotypes provide the most informative material for the analysis of rare genetic variants. However, their widespread use is currently limited by the relatively small number of sequenced samples compared to imputed samples and by the lack of free access to personal genotypes. The latter drawback is not critical for imputed data, which combine genotypes collected on microarray platforms with missing genotypes reconstructed using reference haplotype panels: the results of genome-wide association studies (GWAS) of imputed genotypes are freely available for thousands of traits and millions of genetic variants. These data can be used for gene-based association analysis, which is the primary tool for studying rare variants. However, imputed genotypes have disadvantages compared to sequenced genotypes: both their number and their quality are lower. We aimed to test how these disadvantages affect the results of rare variant analysis. We considered 188 236 participants in the UK Biobank project who had both imputed and sequenced genotypes. The results of the single-variant association analysis showed a high quality of imputation. Inflation factors for 47 traits were around 1, and p-values were very close to those obtained for sequenced genotypes (r² = 0.994). We then performed gene-based association analysis using imputed and sequenced genotypes. The number of association signals identified using imputed data was approximately half that identified using sequenced data. It is expected that if the sample of imputed genotypes is twice as large as the sample of sequenced genotypes, the power of the imputed data analysis should be equivalent to that of the sequenced data analysis for protein-coding variants.
genetics & heredity