A cross-disorder dosage sensitivity map of the human genome

Ryan L. Collins, Joseph T. Glessner, Eleonora Porcu, Lisa-Marie Niestroj, Jacob Ulirsch, Georgios Kellaris, Daniel P. Howrigan, Selin Everett, Kiana Mohajeri, Xander Nuttle, Chelsea Lowther, Jack Fu, Philip M. Boone, Farid Ullah, Kaitlin E. Samocha, Konrad Karczewski, Diane Lucente, James F. Gusella, Hilary Finucane, Ludmilla Matyakhina, Swaroop Aradhya, Jeanne Meck, Dennis Lal, Benjamin M. Neale, Jennelle C. Hodge, Alexandre Reymond, Zoltan Kutalik, Nicholas Katsanis, Erica E. Davis, Hakon Hakonarson, Shamil Sunyaev, Harrison Brand, Michael E. Talkowski
DOI: https://doi.org/10.1101/2021.01.26.21250098
2021-01-28
Abstract: Rare deletions and duplications of genomic segments, collectively known as rare copy number variants (rCNVs), contribute to a broad spectrum of human diseases. To date, most disease-association studies of rCNVs have focused on recognized genomic disorders or on the impact of haploinsufficiency caused by deletions. By comparison, our understanding of duplications in disease remains rudimentary as very few individual genes are known to be triplosensitive (i.e., duplication intolerant). In this study, we meta-analyzed rCNVs from 753,994 individuals across 30 primarily neurological disease phenotypes to create a genome-wide catalog of rCNV association statistics across disorders. We discovered 114 rCNV-disease associations at 52 distinct loci surpassing genome-wide significance (P=3.72×10⁻⁶), 42% of which involve duplications. Using Bayesian fine-mapping methods, we further prioritized 38 novel triplosensitive disease genes (e.g., GMEB2 in brain abnormalities), including three known haploinsufficient genes that we now reveal as bidirectionally dosage sensitive (e.g., ANKRD11 in growth abnormalities). By integrating our results with prior literature, we found that disease-associated rCNV segments were enriched for genes constrained against damaging coding variation and identified likely dominant driver genes for about one-third (32%) of rCNV segments based on de novo mutations from exome sequencing studies of developmental disorders. However, while the presence of constrained driver genes was a common feature of many pathogenic large rCNVs across disorders, most of the rCNVs showing genome-wide significant association were incompletely penetrant (mean odds ratio=11.6) and we also identified two examples of noncoding disease-associated rCNVs (e.g., intronic CADM2 deletions in behavioral disorders).
Finally, we developed a statistical model to predict dosage sensitivity for all genes, which defined 3,006 haploinsufficient and 295 triplosensitive genes where the effect sizes of rCNVs were comparable to deletions of genes constrained against truncating mutations. These dosage sensitivity scores classified disease genes across molecular mechanisms, prioritized pathogenic de novo rCNVs in children with autism, and revealed features that distinguished haploinsufficient and triplosensitive genes, such as insulation from other genes and local cis-regulatory complexity. Collectively, the cross-disorder rCNV maps and metrics derived in this study provide the most comprehensive assessment of dosage sensitive genomic segments and genes in disease to date and set the foundation for future studies of dosage sensitivity throughout the human genome.