Genotype imputation for Han Chinese population using Haplotype Reference Consortium as reference

Yuan Lin, Lu Liu, Sen Yang, Yun Li, Dongxin Lin, Xuejun Zhang, Xianyong Yin
DOI: https://doi.org/10.1007/s00439-018-1894-z
2018-01-01
Human Genetics
Abstract: Genotype imputation is now routinely performed in genomic analysis. Reference panel size, that is, the number of haplotypes in the reference panel, is well established as a major driver of imputation accuracy. For that reason, huge efforts have been made worldwide to provide large reference panels, with the Haplotype Reference Consortium (HRC) currently being the largest available in the public domain. The imputation performance of HRC, whose samples are predominantly European, has been evaluated mainly in Europeans. We conducted whole-genome genotype imputation on two independent genome-wide genotyping datasets, one with 1000 European samples and the other with 1000 Han Chinese samples. We compared the results obtained using HRC with those obtained using the Phase III 1000 Genomes Project (1000G) reference panel. For the European dataset, using HRC improved imputation quality, especially for rare variants with minor allele frequency (MAF) < 0.1%. However, 1000G demonstrated better performance in the Han Chinese dataset, in both imputation quality and number of well-imputed variants. We validated the performance of the 1000G reference panel in a second, independent cohort of Han Chinese (N = 2402). Our study showcases the limitations of HRC for Han Chinese populations, strongly suggesting the necessity of building population-specific reference panels.