The Coexistence of Copy Number Variations (CNVs) and Single Nucleotide Polymorphisms (SNPs) at a Locus Can Result in Distorted Calculations of the Significance in Associating SNPs to Disease

Jiaqi Liu, Yangzhong Zhou, Sen Liu, Xiaofei Song, Xin-Zhuang Yang, Yanhui Fan, Weisheng Chen, Zeynep Coban Akdemir, Zihui Yan, Yuzhi Zuo, Renqian Du, Zhenlei Liu, Bo Yuan, Sen Zhao, Gang Liu, Yixin Chen, Yanxue Zhao, Mao Lin, Qiankun Zhu, Yuchen Niu, Pengfei Liu, Shiro Ikegawa, You-Qiang Song, Jennifer E. Posey, Guixing Qiu, DISCO (Deciphering disorders Involving Scoliosis and COmorbidities) Study, Feng Zhang, Zhihong Wu, James R. Lupski, Nan Wu
DOI: https://doi.org/10.1007/s00439-018-1910-3
2018-01-01
Human Genetics
Abstract: With the recent advance in genome-wide association studies (GWAS), disease-associated single nucleotide polymorphisms (SNPs) and copy number variants (CNVs) have been extensively reported. Accordingly, the issue of incorrect identification of recombination events that can induce the distortion of multi-allelic or hemizygous variants has received more attention. However, the potential calculation bias, or distorted significance, of a detected association in a GWAS due to the coexistence of CNVs and SNPs in the same genomic region may remain under-recognized. Here we performed an association study within a congenital scoliosis (CS) cohort whose genetic etiology was recently elucidated as a compound inheritance model, including mostly one rare variant deletion CNV null allele and one common variant non-coding hypomorphic haplotype of the TBX6 gene. We demonstrated that the existence of a deletion in TBX6 led to an overestimation of the contribution of the SNPs on the hypomorphic allele. Furthermore, we generalized a model to explain the calculation bias, or distorted significance calculation for an association study, that can be ‘induced’ by CNVs at a locus. Meanwhile, the overlap between the disease-associated SNPs from published GWAS and common CNVs (overlap 10%) and pathogenic/likely pathogenic CNVs (overlap 99.69%) was significantly higher than the random distribution (p < 1 × 10⁻⁶ and p = 0.034, respectively), indicating that such co-existence of CNV and SNV alleles might generally influence data interpretation and potential outcomes of a GWAS. We also verified and assessed the influence of colocalizing CNVs on the detection sensitivity of disease-associated SNP variant alleles in another adolescent idiopathic scoliosis (AIS) genome-wide association study.
We proposed that detecting co-existent CNVs when evaluating the association signals between SNPs and disease traits could improve genetic model analyses and better integrate GWAS with robust Mendelian principles.
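The overestimation described in the abstract can be illustrated with a toy allele count. The sketch below is not the authors' model. It assumes a hypothetical cohort in which each sample is a pair of (called genotype, copy number at the locus), and it assumes that a genotype caller reports a hemizygous risk allele on the non-deleted chromosome as homozygous. The function name, the allele labels, and all counts are invented for illustration.

```python
from fractions import Fraction


def risk_allele_frequency(samples, risk="T", cnv_aware=False):
    """Return the risk-allele frequency in a list of samples.

    Each sample is (called_genotype, copy_number). With cnv_aware=False,
    every sample is treated as diploid, as a naive analysis would do.
    With cnv_aware=True, a sample with copy number 1 contributes only
    one allele, because its second "allele" is really a deletion.
    """
    risk_count = 0
    total_alleles = 0
    for called_genotype, copy_number in samples:
        if cnv_aware and copy_number == 1:
            alleles = called_genotype[:1]  # hemizygous: one real allele
        else:
            alleles = called_genotype  # counted as two alleles
        risk_count += alleles.count(risk)
        total_alleles += len(alleles)
    return Fraction(risk_count, total_alleles)


# Hypothetical case cohort (assumed numbers, not data from the paper):
# 4 deletion carriers whose single T allele is called as "TT",
# 2 heterozygous non-carriers, and 4 non-carriers without the risk allele.
cases = [("TT", 1)] * 4 + [("CT", 2)] * 2 + [("CC", 2)] * 4

naive = risk_allele_frequency(cases)
aware = risk_allele_frequency(cases, cnv_aware=True)
print("naive:", naive, "CNV-aware:", aware)
```

In this toy cohort the naive count double-counts the risk allele in every deletion carrier, so the apparent risk-allele frequency in cases is higher than the CNV-aware one. Any association statistic computed from the naive counts would be inflated accordingly.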