Detect and Adjust for Population Stratification in Population-Based Association Study Using Genomic Control Markers: an Application of Affymetrix Genechip® Human Mapping 10K Array

Ke Hao,Cheng Li,Carsten Rosenow,Wing H Wong
DOI: https://doi.org/10.1038/sj.ejhg.5201273
2004-01-01
European Journal of Human Genetics
Abstract:Population-based association design is often compromised by false or nonreplicable findings, partially due to population stratification. Genomic control (GC) approaches were proposed to detect and adjust for this confounder. To date, the performance of this strategy has not been extensively evaluated on real data. More than 10 000 single-nucleotide polymorphisms (SNPs) were genotyped on subjects from four populations (including an Asian, an African-American and two Caucasian populations) using GeneChip ® Mapping 10 K array. On these data, we tested the performance of two GC approaches in different scenarios including various numbers of GC markers and different degrees of population stratification. In the scenario of substantial population stratification, both GC approaches are sensitive using only 20–50 random SNPs, and the mixed subjects can be separated into homogeneous subgroups. In the scenario of moderate stratification, both GC approaches have poor sensitivities. However, the bias in association test can still be corrected even when no statistical significant population stratification is detected. We conducted extensive benchmark analyses on GC approaches using SNPs over the whole human genome. We found GC method can cluster subjects to homogeneous subgroups if there is a substantial difference in genetic background. The inflation factor, estimated by GC markers, can effectively adjust for the confounding effect of population stratification regardless of its extent. We also suggest that as low as 50 random SNPs with heterozygosity >40% should be sufficient as genomic controls.
What problem does this paper attempt to address?