Single-molecule sequencing and optical mapping yields an improved genome of woodland strawberry (Fragaria vesca) with chromosome-scale contiguity

Patrick P Edger, Robert VanBuren, Marivi Colle, Thomas J Poorten, Ching Man Wai, Chad E Niederhuth, Elizabeth I Alger, Shujun Ou, Charlotte B Acharya, Jie Wang, Pete Callow, Michael R McKain, Jinghua Shi, Chad Collier, Zhiyong Xiong, Jeffrey P Mower, Janet P Slovin, Timo Hytönen, Ning Jiang, Kevin L Childs, Steven J Knapp
DOI: https://doi.org/10.1093/gigascience/gix124
IF: 7.658
2017-12-13
GigaScience
Abstract:
Background: Although draft genomes are available for most agronomically important plant species, the majority are incomplete, highly fragmented, and often riddled with assembly and scaffolding errors. These assembly issues hinder advances in tool development for functional genomics and systems biology.
Findings: Here we utilized a robust, cost-effective approach to produce high-quality reference genomes. We report a near-complete genome of diploid woodland strawberry (Fragaria vesca) using single-molecule real-time sequencing from Pacific Biosciences (PacBio). This assembly has a contig N50 length of ∼7.9 million base pairs (Mb), representing a ∼300-fold improvement of the previous version. The vast majority (>99.8%) of the assembly was anchored to 7 pseudomolecules using 2 sets of optical maps from Bionano Genomics. We obtained ∼24.96 Mb of sequence not present in the previous version of the F. vesca genome and produced an improved annotation that includes 1496 new genes. Comparative syntenic analyses uncovered numerous, large-scale scaffolding errors present in each chromosome in the previously published version of the F. vesca genome.
Conclusions: Our results highlight the need to improve existing short-read based reference genomes. Furthermore, we demonstrate how genome quality impacts commonly used analyses for addressing both fundamental and applied biological questions.
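The contig N50 quoted in the abstract is a standard contiguity statistic: the contig length L such that contigs of length L or longer together cover at least half of the total assembly length. The sketch below illustrates that general definition only; it is not the authors' pipeline, and the function name `n50` and the toy contig lengths are assumptions introduced for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length of the contig at which the running total,
    taken over contigs sorted from longest to shortest, first
    reaches at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in arbitrary units, not real data).
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_contigs)
```

For the toy list, the total is 54 and the three longest contigs (10, 9, 8) sum to 27, so the function returns 8. A fold improvement between two assemblies, such as the ∼300-fold figure above, is the ratio of their N50 values.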
multidisciplinary sciences