Predicting HLA Alleles from High-Resolution SNP Data in Three Southeast Asian Populations.

Nisha Esakimuthu Pillai, Yukinori Okada, Woei-Yuh Saw, Rick Twee-Hee Ong, Xu Wang, Erwin Tantoso, Wenting Xu, Trevor A. Peterson, Thomas Bielawny, Mohammad Ali, Koon-Yong Tay, Wan-Ting Poh, Linda Wei-Lin Tan, Seok-Hwee Koo, Wei-Yen Lim, Richie Soong, Markus Wenk, Soumya Raychaudhuri, Peter Little, Francis A. Plummer, Edmund J. D. Lee, Kee-Seng Chia, Ma Luo, Paul I. W. De Bakker, Yik-Ying Teo
DOI: https://doi.org/10.1093/hmg/ddu149
2014-01-01
Abstract: The major histocompatibility complex (MHC) containing the classical human leukocyte antigen (HLA) Class I and Class II genes is among the most polymorphic and diverse regions in the human genome. Despite the clinical importance of identifying the HLA types, very few databases jointly characterize densely genotyped single nucleotide polymorphisms (SNPs) and HLA alleles in the same samples. To date, the HapMap presents the only public resource that provides a SNP reference panel for predicting HLA alleles, constructed with four collections of individuals of north-western European, northern Han Chinese, cosmopolitan Japanese and Yoruba Nigerian ancestry. Owing to complex patterns of linkage disequilibrium in this region, it is unclear whether the HapMap reference panels can be appropriately utilized for other populations. Here, we describe a public resource for the Singapore Genome Variation Project with (i) dense genotyping across ∼9000 SNPs in the MHC and (ii) four-digit HLA typing for eight Class I and Class II loci, in 96 southern Han Chinese, 89 Southeast Asian Malays and 83 Tamil Indians. This resource provides population estimates of the frequencies of HLA alleles at these eight loci in the three population groups, particularly for HLA-DPA1 and HLA-DPB1, which were not assayed in HapMap. Comparing population-specific reference panels with a cosmopolitan panel created from all four HapMap populations, we demonstrate that more accurate imputation is obtained with population-specific panels than with the cosmopolitan panel, especially for the Malays and Indians but even when imputing between northern and southern Han Chinese. As with SNP imputation, common HLA alleles were imputed with greater accuracy than low-frequency variants.