The Landscapes of Full-Length Transcripts and Splice Isoforms as Well as Transposons Exonization in the Lepidopteran Model System, <i>Bombyx mori</i>

Zongrui Dai, Jianyu Ren, Xiaoling Tong, Hai Hu, Kunpeng Lu, Fangyin Dai, Min-Jin Han
DOI: https://doi.org/10.3389/fgene.2021.704162
IF: 3.7
2021-01-01
Frontiers in Genetics
Abstract: The domesticated silkworm, Bombyx mori, is an important model system for the order Lepidoptera. A chromosome-level genome of Bombyx mori based on third-generation sequencing has been released. However, its transcripts were assembled mainly from short second-generation sequencing reads and expressed sequence tags, which cannot accurately capture the transcript profile. Here, we used PacBio Iso-Seq technology to investigate the transcripts from 45 developmental stages of Bombyx mori. We obtained 25,970 non-redundant high-quality consensus isoforms capturing ~60% of previously reported RNAs, including 15,431 (~47%) novel transcripts, and identified 7,253 long non-coding RNAs (lncRNAs), a large proportion of which (~56%) are novel. In addition, we found that transposable element (TE) exonization accounts for 11,671 (~45%) transcripts, including 5,980 protein-coding transcripts (~32%) and 5,691 lncRNAs (~79%). Overall, our results expand the set of known silkworm transcripts and have general implications for understanding the interaction between TEs and their host genes. This transcript resource will promote functional studies of genes, lncRNAs, and TEs in the silkworm.