Long‐read Sequencing and De Novo Assembly of the Luffa cylindrica (L.) Roem. Genome

Tao Zhang, Xuyan Ren, Zhao Zhang, Yao Ming, Zhe Yang, Jianbin Hu, Shengli Li, Yong Wang, Shouru Sun, Kaile Sun, Fengzhi Piao, Zhiqiang Sun
DOI: https://doi.org/10.1111/1755-0998.13129
IF: 7.7
2019-01-01
Molecular Ecology Resources
Abstract: Sponge gourd (Luffa cylindrica (L.) Roem.), or luffa, is a diploid herbaceous plant with 26 chromosomes (2n = 26) that belongs to the family Cucurbitaceae. To address the limited knowledge of the genome of Luffa species, the chromosome-level genome of L. cylindrica was assembled and analysed using PacBio long reads and Hi-C data. We combined Hi-C data with a draft genome assembly to generate chromosome-length scaffolds. Thirteen scaffolds corresponding to the 13 chromosomes were assembled from 1,156 contigs to a final size of 669 Mb, with a contig N50 of 5 Mb and a scaffold N50 of 53 Mb. After removing redundant sequences, 416.31 Mb of repeat sequences (62.18% of the genome) were detected. Subsequently, 31,661 protein-coding genes with an average of 5.69 exons per gene were identified in the L. cylindrica genome using de novo methods, transcriptome data and homologue-based approaches. In addition, 27,552 protein-coding genes (87.02%) were annotated in five databases. According to the phylogenetic analysis, L. cylindrica is closely related to Cucurbita and Cucumis species and diverged from their common ancestor 28.6-67.1 million years ago. Genome collinearity analysis of Cucurbita moschata, Cucumis sativus and L. cylindrica demonstrated a high degree of conserved gene order among the three species. This near-complete genome will provide a high-quality genomic resource for breeding and help reveal genetic variation in L. cylindrica.
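The contig and scaffold N50 values quoted in the abstract are standard assembly contiguity statistics: the N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The following is a minimal Python sketch of that definition, not the authors' pipeline; the function names `n50` and `percent` and the toy contig lengths are illustrative assumptions. The gene counts in the last line are the ones reported in the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    Sequences are sorted from longest to shortest and summed until the
    running total reaches at least half of the total assembly length;
    the length of the sequence that crosses that threshold is the N50.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


def percent(part, whole):
    """Return part / whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Toy contig lengths (hypothetical, not taken from the paper).
toy_contigs = [2, 3, 4, 5, 6]
print(n50(toy_contigs))  # 5

# Share of annotated genes, using the counts reported in the abstract.
print(percent(27552, 31661))  # 87.02
```

In the toy example the total length is 20, and the two longest contigs (6 and 5) together reach 11, which is at least half, so the N50 is 5. Applied to the real per-contig and per-scaffold lengths of an assembly, the same function would yield the contig N50 and scaffold N50 respectively.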