Single-Molecule Sequencing and Chromatin Conformation Capture Enable De Novo Reference Assembly of the Domestic Goat Genome

Derek M Bickhart, Benjamin D Rosen, Sergey Koren, Brian L Sayre, Alex R Hastie, Saki Chan, Joyce Lee, Ernest T Lam, Ivan Liachko, Shawn T Sullivan, Joshua N Burton, Heather J Huson, John C Nystrom, Christy M Kelley, Jana L Hutchison, Yang Zhou, Jiajie Sun, Alessandra Crisà, F Abel Ponce de León, John C Schwartz, John A Hammond, Geoffrey C Waldbieser, Steven G Schroeder, George E Liu, Maitreya J Dunham, Jay Shendure, Tad S Sonstegard, Adam M Phillippy, Curtis P Van Tassell, Timothy P L Smith
DOI: https://doi.org/10.1038/ng.3802
IF: 30.8
2017-01-01
Nature Genetics
Abstract: The decrease in sequencing cost and the increased sophistication of assembly algorithms for short-read platforms have resulted in a sharp increase in the number of species with genome assemblies. However, these assemblies are highly fragmented, with many gaps, ambiguities, and errors, impeding downstream applications. We demonstrate the current state of the art for de novo assembly using the domestic goat (Capra hircus), based on long reads for contig formation, short reads for consensus validation, and scaffolding by optical and chromatin interaction mapping. These combined technologies produced what is, to our knowledge, the most continuous de novo mammalian assembly to date, with chromosome-length scaffolds and only 649 gaps. Our assembly represents a ∼400-fold improvement in continuity due to properly assembled gaps, compared to the previously published C. hircus assembly, and better resolves repetitive structures longer than 1 kb, representing the largest repeat family and immune gene complex yet produced for an individual of a ruminant species.