A high-quality reference panel reveals the complexity and distribution of structural genome changes in a human population

Jayne Y. Hehir-Kwa, Tobias Marschall, Wigard P. Kloosterman, Laurent C. Francioli, Jasmijn A. Baaijens, Louis J. Dijkstra, Abdel Abdellaoui, Vyacheslav Koval, Djie Tjwan Thung, René Wardenaar, Bradley P. Coe, Patrick Deelen, Joep de Ligt, Eric-Wubbo Lameijer, Freerk van Dijk, Fereydoun Hormozdiari, Evan E. Eichler, Paul I. W. de Bakker, Morris A. Swertz, Cisca Wijmenga, Gert-Jan B. van Ommen, Eline Slagboom, Dorret I. Boomsma, Alexander Schoenhuth, Kai Ye, Victor Guryev
DOI: https://doi.org/10.1101/036897
2016-01-01
bioRxiv
Abstract: Structural variation (SV) represents a major source of differences between individual human genomes and has been linked to disease phenotypes. However, current studies on SVs have failed to provide a global view of the full spectrum of SVs and to integrate them into reference panels of genetic variation. Here, we analyzed 769 individuals from 250 Dutch families, whole-genome sequenced at an average coverage of 14.5x, and provide a haplotype-resolved map of 1.9 million genome variants across 9 different variant classes, including novel forms of complex indels and retrotransposition-mediated insertions of mobile elements and processed RNAs. A large proportion of the structural variants (36%) were discovered in the size range of 21 to 100 bp, a size range that remains underreported in many studies. Furthermore, we detected 4 megabases of novel sequence, extending the human pangenome with 11 new active transcripts. Finally, we show 191 known, trait-associated SNPs to be in strong linkage disequilibrium with a structural variant and demonstrate that our panel facilitates accurate imputation of SVs into unrelated individuals, which is essential for future genome-wide association studies.