Chromosome-level genome assembly of Solanum pimpinellifolium

Hongyu Han, Xiuhong Li, Tianze Li, Qian Chen, Jiuhai Zhao, Huawei Zhai, Lei Deng, Xianwen Meng, Chuanyou Li
DOI: https://doi.org/10.1038/s41597-024-03442-6
2024-06-05
Scientific Data
Abstract: Solanum pimpinellifolium, the closest wild relative of the domesticated tomato, has high potential for use in breeding programs aimed at developing multi-pathogen resistance and quality improvement. We generated a chromosome-level genome assembly of S. pimpinellifolium LA1589, with a size of 833 Mb and a contig N50 of 31 Mb. We anchored 98.80% of the contigs into 12 pseudo-chromosomes, and identified 74.47% of the sequences as repetitive sequences. The genome evaluation revealed BUSCO and LAI scores of 98.3% and 14.49, respectively, indicating the high quality of this assembly. A total of 41,449 protein-coding genes were predicted in the genome, of which 89.17% were functionally annotated. This high-quality genome assembly serves as a valuable resource for accelerating the biological discovery and molecular breeding of this important horticultural crop.
multidisciplinary sciences
What problem does this paper attempt to address?
This paper aims to accelerate biological discovery and molecular breeding in the wild tomato species *Solanum pimpinellifolium* by generating a high-quality chromosome-level genome assembly of LA1589. As the closest wild relative of the cultivated tomato, *S. pimpinellifolium* has high potential for multi-pathogen resistance and quality improvement, so its genome information is particularly important for breeding programs. However, the previously released draft genome of LA1589 was of low quality and could not fully reveal sequence variations and their impacts on important traits. To address this, the researchers assembled a high-quality chromosome-level genome using short-read sequencing, PacBio sequencing, Hi-C scaffolding, and Bionano optical mapping. The new assembly has a total length of 833 Mb, a contig N50 of 31 Mb, a BUSCO completeness of 98.3%, and an LAI score of 14.49, all of which indicate a high-quality assembly. It provides a valuable resource for future research on genetic changes during tomato domestication and for promoting genome-scale breeding.
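
For readers unfamiliar with the contig N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed: contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the assembly size. The function name `contig_n50` and the toy lengths are illustrative assumptions; they are not taken from the paper or its pipeline.

```python
def contig_n50(lengths):
    """Return the N50 of a list of contig lengths.

    N50 is the length of the contig at which the cumulative sum,
    taken from the longest contig downward, first reaches at least
    half of the total assembly length.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy contig lengths in Mb (hypothetical values, not the LA1589 contigs).
toy_lengths = [40, 31, 20, 10, 5, 4]
print(contig_n50(toy_lengths))
```

In this toy set the total is 110 Mb, and the two longest contigs (40 + 31 = 71 Mb) already cover more than half of it, so the N50 is the length of the second contig. A larger N50 therefore means that fewer, longer contigs account for most of the assembly.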