Consequence of adjustments for demographic or clinical covariates and a recommended solution in genome-wide association studies

Xiaoru Sun, Hongkai Li, Yuanyuan Yu, Zhongshang Yuan, Chuandi Jin, Lei Hou, Xinhui Liu, Qing Wang, Fuzhong Xue
DOI: https://doi.org/10.1101/2021.12.07.471675
2021-01-01
bioRxiv
Abstract: Genome-wide association studies (GWASs) are fundamentally designed to detect disease-causing genes. To reduce spurious associations or improve statistical power, about 80% of GWASs arbitrarily adjust for demographic or clinical covariates. However, no consensus has been reached on adjustment strategies in GWASs. Given that the original aim of a GWAS is to identify the causal association between a specific causal single-nucleotide polymorphism (SNP) and a disease trait, we summarized all complex relationships among the target SNP, covariate and disease trait into 15 causal diagrams according to the various roles of the covariate. Following each causal diagram, we conducted a series of theoretical justifications and statistical simulations. Our results demonstrate that it is inadvisable to adjust for any demographic or clinical covariates. We illustrate our point by applying GWASs for body mass index (BMI) and breast cancer, with and without adjustment for age and smoking status. Genetic effects and P values might vary across different strategies. Instead, adjustment for SNPs (G′) is strongly recommended when G′ are in linkage disequilibrium with the target SNP and correlated with the disease trait conditional on the target SNP. Specifically, adjustment for such G′ can block all the confounding paths between the target SNP and disease trait, and avoid over-adjusting for colliders or intermediaries.
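The over-adjustment concern for colliders can be illustrated with a small simulation. The sketch below is not the authors' code and does not reproduce any of their 15 causal diagrams exactly; it assumes a single hypothetical diagram in which the target SNP and the trait both influence a covariate, the SNP has no true effect on the trait, and all effect sizes, the allele frequency and the sample size are arbitrary choices made for illustration.

```python
import numpy as np


def ols_coef(y, *columns):
    """Return OLS coefficients (intercept first) of y on the given columns."""
    X = np.column_stack([np.ones_like(y), *columns])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


rng = np.random.default_rng(0)
n = 100_000  # assumed sample size

# Target SNP coded as 0/1/2 copies of the effect allele (assumed frequency 0.3).
g = rng.binomial(2, 0.3, size=n).astype(float)

# Trait simulated independently of the SNP: the true genetic effect is zero.
y = rng.normal(size=n)

# Covariate that is a collider: caused by both the SNP and the trait
# (coefficients 0.5 and 0.5 are illustrative assumptions).
c = 0.5 * g + 0.5 * y + rng.normal(size=n)

beta_unadj = ols_coef(y, g)[1]   # SNP effect without covariate adjustment
beta_adj = ols_coef(y, g, c)[1]  # SNP effect after adjusting for the collider

print(f"unadjusted SNP effect: {beta_unadj:.3f}")
print(f"collider-adjusted SNP effect: {beta_adj:.3f}")
```

Under these assumptions the unadjusted estimate stays close to the true value of zero, while conditioning on the covariate opens the path SNP → covariate ← trait and pushes the estimate away from zero, which is a spurious association of the kind the abstract warns about.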