Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program

Daniel Taliun, Daniel N. Harris, Michael D. Kessler, Jedidiah Carlson, Zachary A. Szpiech, Raul Torres, Sarah A. Gagliano Taliun, André Corvelo, Stephanie M. Gogarten, Hyun Min Kang, Achilleas N. Pitsillides, Jonathon LeFaive, Seung-been Lee, Xiaowen Tian, Brian L. Browning, Sayantan Das, Anne-Katrin Emde, Wayne E. Clarke, Douglas P. Loesch, Amol C. Shetty, Thomas W. Blackwell, Albert V. Smith, Quenna Wong, Xiaoming Liu, Matthew P. Conomos, Dean M. Bobo, François Aguet, Christine Albert, Alvaro Alonso, Kristin G. Ardlie, Dan E. Arking, Stella Aslibekyan, Paul L. Auer, John Barnard, R. Graham Barr, Lucas Barwick, Lewis C. Becker, Rebecca L. Beer, Emelia J. Benjamin, Lawrence F. Bielak, John Blangero, Michael Boehnke, Donald W. Bowden, Jennifer A. Brody, Esteban G. Burchard, Brian E. Cade, James F. Casella, Brandon Chalazan, Daniel I. Chasman, Yii-Der Ida Chen, Michael H. Cho, Seung Hoan Choi, Mina K. Chung, Clary B. Clish, Adolfo Correa, Joanne E. Curran, Brian Custer, Dawood Darbar, Michelle Daya, Mariza de Andrade, Dawn L. DeMeo, Susan K. Dutcher, Patrick T. Ellinor, Leslie S. Emery, Celeste Eng, Diane Fatkin, Tasha Fingerlin, Lukas Forer, Myriam Fornage, Nora Franceschini, Christian Fuchsberger, Stephanie M. Fullerton, Soren Germer, Mark T. Gladwin, Daniel J. Gottlieb, Xiuqing Guo, Michael E. Hall, Jiang He, Nancy L. Heard-Costa, Susan R. Heckbert, Marguerite R. Irvin, Jill M. Johnsen, Andrew D. Johnson, Robert Kaplan, Sharon L. R. Kardia, Tanika Kelly, Shannon Kelly, Eimear E. Kenny, Douglas P. Kiel, Robert Klemmer, Barbara A. Konkle, Charles Kooperberg, Anna Köttgen, Leslie A. Lange, Jessica Lasky-Su, Daniel Levy, Xihong Lin, Keng-Han Lin, Chunyu Liu, Ruth J. F. Loos, Lori Garman, Robert Gerszten, Steven A. Lubitz, Kathryn L. Lunetta, Angel C. Y. Mak, Ani Manichaikul, Alisa K. Manning, Rasika A. Mathias, David D. McManus, Stephen T. McGarvey, James B. Meigs, Deborah A. Meyers, Julie L. Mikulla, Mollie A. Minear, Braxton D. Mitchell, Sanghamitra Mohanty, May E. Montasser, Courtney Montgomery, Alanna C. Morrison, Joanne M. Murabito, Andrea Natale, Pradeep Natarajan, Sarah C. Nelson, Kari E. North, Jeffrey R. O’Connell, Nicholette D. Palmer, Nathan Pankratz, Gina M. Peloso, Patricia A. Peyser, Jacob Pleiness, Wendy S. Post, Bruce M. Psaty, D. C. Rao, Susan Redline, Alexander P. Reiner, Dan Roden, Jerome I. Rotter, Ingo Ruczinski, Chloé Sarnowski, Sebastian Schoenherr, David A. Schwartz, Jeong-Sun Seo, Sudha Seshadri, Vivien A. Sheehan, Wayne H. Sheu, M. Benjamin Shoemaker, Nicholas L. Smith, Jennifer A. Smith, Nona Sotoodehnia, Adrienne M. Stilp, Weihong Tang, Kent D. Taylor, Marilyn Telen, Timothy A. Thornton, Russell P. Tracy, David J. Van Den Berg, Ramachandran S. Vasan, Karine A. Viaud-Martinez, Scott Vrieze, Daniel E. Weeks, Bruce S. Weir, Scott T. Weiss, Lu-Chen Weng, Cristen J. Willer, Yingze Zhang, Xutong Zhao, Donna K. Arnett, Allison E. Ashley-Koch, Kathleen C. Barnes, Eric Boerwinkle, Stacey Gabriel, Richard Gibbs, Kenneth M. Rice, Stephen S. Rich, Edwin K. Silverman, Pankaj Qasba, Weiniu Gan, George J. Papanicolaou, Deborah A. Nickerson, Sharon R. Browning, Michael C. Zody, Sebastian Zöllner, James G. Wilson, L. Adrienne Cupples, Cathy C. Laurie, Cashell E. Jaquish, Ryan D. Hernandez, Timothy D. O’Connor, Gonçalo R. Abecasis
DOI: https://doi.org/10.1038/s41586-021-03205-y
IF: 64.8
2021-02-10
Nature
Abstract: The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes) [1]. In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.
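To give a feel for the frequency figures quoted in the abstract, the following minimal sketch converts a frequency threshold into an approximate number of allele copies. It assumes diploid autosomal sites (two chromosomes per sample) and uses the 53,831-sample count from the abstract. The function names and category labels are hypothetical and are not taken from the TOPMed pipeline. A singleton is approximated here as an allele count of 1, which is a simplification of "present in only one individual".

```python
def allele_count_at_frequency(n_samples: int, frequency: float) -> float:
    """Approximate number of allele copies at a given frequency.

    Assumes diploid autosomes, i.e. 2 * n_samples chromosomes in total.
    """
    return 2 * n_samples * frequency


def classify_variant(allele_count: int, n_samples: int) -> str:
    """Label a variant by allele count (illustrative categories only).

    Singletons are reported separately here, although they also fall
    under the "frequency below 1%" group described in the abstract.
    """
    if allele_count < 1:
        raise ValueError("allele_count must be at least 1")
    if allele_count == 1:
        return "singleton"
    frequency = allele_count / (2 * n_samples)
    return "rare" if frequency < 0.01 else "common"


N_SAMPLES = 53_831  # sample count stated in the abstract

# Allele copies corresponding to the 0.01% and 1% thresholds.
print(round(allele_count_at_frequency(N_SAMPLES, 0.0001), 1))
print(round(allele_count_at_frequency(N_SAMPLES, 0.01), 1))
print(classify_variant(10, N_SAMPLES))
```

Under these assumptions, a frequency of 0.01% corresponds to roughly 11 allele copies among the 107,662 chromosomes in 53,831 samples, which suggests why a reference panel of this size can support imputation at such low frequencies.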
multidisciplinary sciences