Full-length transcript sequencing accelerates the transcriptome research of <i>Gymnocypris namensis</i>, an iconic fish of the Tibetan Plateau
Hui Luo, Haiping Liu, Jie Zhang, Bingjie Hu, Chaowei Zhou, Mengbin Xiang, Yuejing Yang, Mingrui Zhou, Tingsen Jing, Zhe Li, Xinghua Zhou, Guangjun Lv, Wenping He, Benhe Zeng, Shijun Xiao, Qinglu Li, Hua Ye
DOI: https://doi.org/10.1038/s41598-020-66582-w
IF: 4.6
2020-01-01
Scientific Reports
Abstract: Gymnocypris namensis, the only commercial fish in Namtso Lake, Tibet, China, is rated as a near-threatened species in the Red List of China's Vertebrates. As one of the highest-altitude schizothoracine fishes in China, G. namensis is strongly adapted to the harsh plateau environment. Although it is an indigenous economic fish of high research value, its biological characteristics, genetic diversity, and plateau adaptability remain unclear. Here, we used Pacific Biosciences single-molecule real-time (SMRT) long-read sequencing to generate full-length transcripts of G. namensis. Sequence clustering and error correction with Illumina short reads yielded 319,044 polished isoforms. After removing redundant sequences, 125,396 non-redundant isoforms were obtained. Of these transcripts, 103,286 were annotated against public databases. Natural selection has acted on 42 genes in G. namensis, which were enriched in the functions of mismatch repair and glutathione metabolism. In total, 89,736 open reading frames, 95,947 microsatellites, and 21,360 long non-coding RNAs were identified across all transcripts. This is the first transcriptome study of G. namensis using PacBio Iso-Seq. The full-length transcript isoforms obtained here might accelerate transcriptome research on G. namensis and provide a basis for further studies.