Benchmarking Imputed Low Coverage Genomes in a Human Population Genetics Context

Gludhug Ariyo Purnomo, Joao Carlos Teixeira, Herawati Sudoyo, Bastien Llamas, Raymond Tobler
DOI: https://doi.org/10.1101/2024.06.02.597067
2024-06-03
Abstract: Ongoing advances in population genomic methodologies have recently made it possible to study millions of loci across hundreds of genomes at a relatively low cost, by leveraging a combination of low-coverage shotgun sequencing and innovative genotype imputation methods. This approach has the potential to provide economical access to genotype information similar to that provided by the most widely used low-cost genotyping approach, i.e. SNP panels, while avoiding potential issues related to loci being ascertained in distantly related populations. Nonetheless, adoption of imputation methods has been constrained by the lack of suitable reference panels of phased genomes, as performance degrades when panel individuals are distantly related to the target populations. However, recent advances in imputation algorithms now allow genetic information from the target population to be used in the imputation process, potentially mitigating the lack of a suitable reference panel. Here we assess the performance of the recently released GLIMPSE imputation software on a set of 250 low-coverage genomes (~3x) from populations from Island Southeast Asia and Near Oceania that are poorly represented in publicly available datasets, comparing the use of imputed genotypes against other common genotype calling methods for a range of standard population genomic analyses. We find that imputation performance and inference both greatly improved when genetic information from the 250 target individuals was leveraged, with comparable results to pseudo-haploid calls that trade off improved precision with reduced accuracy. Our study shows that imputed genotypes are a cost-effective and robust basis for population genomic studies of groups, especially those that are poorly represented in publicly available data.
Evolutionary Biology