Pacific Biosciences long reads-based genome sequencing data from a widespread bee fungal parasite, Nosema ceranae

Huazhi Chen, Wende Zhang, Yu Du, Xiaoxue Fan, Jie Wang, Haibin Jiang, Yuanchan Fan, Zhiwei Zhu, Cuiling Xiong, Yanzhen Zheng, Dafu Chen, Rui Guo
DOI: https://doi.org/10.1101/2020.04.05.026849
2020-01-01
Abstract: Nosema ceranae is a widespread fungal parasite that infects both adult honeybees and honeybee larvae, leading to microsporidiosis, which seriously affects bee health and the apicultural industry. In this article, genome sequencing of clean spores of N. ceranae was conducted using third-generation Pacific Biosciences (PacBio) single molecule real time (SMRT) sequencing technology. In total, 152671 subreads were obtained after quality control of raw reads from PacBio SMRT sequencing, with an N50 and average length of 14422 bp and 11310 bp, respectively. Additionally, the length distribution of subreads ranged from 10000 bp to more than 50000 bp. Nineteen scaffolds with a total length of 7354221 bp were assembled, and the N50, N90 and maximum scaffold length were 728543 bp, 198795 bp and 1917792 bp, respectively. The GC content was 25.97%. Furthermore, by integrating genes predicted with de novo and homology-based methods, 3112 genes were finally obtained, with a total length of 2730179 bp and a mean length of 877.31 bp. In addition, the total length and mean length of exons were 2657637 bp and 854 bp, respectively, and the total length and mean length of introns were 72542 bp and 23.31 bp, respectively. The genome sequencing data documented here will give deep insights into the molecular biology of N. ceranae, facilitate exploration of genes and pathways associated with toxin factors and infection-related factors, and benefit research on comparative genomics and phylogenetic diversity of Nosema species.
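For readers unfamiliar with the N50 and N90 statistics quoted in the abstract, the following is a minimal Python sketch of how such values are commonly computed, together with a check that the reported mean lengths are consistent with the reported totals. The function name `nx` and the toy scaffold lengths are assumptions made for illustration; the abstract does not list individual scaffold lengths, and this is not necessarily the tool the authors used.

```python
def nx(lengths, fraction=0.5):
    """Return the Nx statistic: the length L such that sequences of
    length >= L together cover at least `fraction` of the total bases."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    threshold = fraction * sum(lengths)
    running = 0
    # Walk from the longest sequence down until the threshold is reached.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length

# Hypothetical toy scaffold lengths (NOT the N. ceranae assembly).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = nx(toy_scaffolds, 0.5)
toy_n90 = nx(toy_scaffolds, 0.9)

# Totals as reported in the abstract.
GENE_COUNT = 3112
GENE_TOTAL_BP = 2730179
EXON_TOTAL_BP = 2657637
INTRON_TOTAL_BP = 72542

# Mean lengths, taken as total length divided by the number of genes.
mean_gene_bp = round(GENE_TOTAL_BP / GENE_COUNT, 2)
mean_exon_bp = round(EXON_TOTAL_BP / GENE_COUNT, 2)
mean_intron_bp = round(INTRON_TOTAL_BP / GENE_COUNT, 2)

print(toy_n50, toy_n90)
print(mean_gene_bp, mean_exon_bp, mean_intron_bp)
```

Under this reading, the reported exon and intron means are per-gene averages (total length divided by 3112 genes), and the exon and intron totals sum to the reported total gene length.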