RNA-Seq improves annotation of protein-coding genes in the cucumber genome

Zhen Li, Zhonghua Zhang, Pengcheng Yan, Sanwen Huang, Zhangjun Fei, Kui Lin
DOI: https://doi.org/10.1186/1471-2164-12-540
IF: 4.547
2011-01-01
BMC Genomics
Abstract: As more genomes are sequenced, genome annotation becomes increasingly important in bridging the gap between sequence and biology. Gene prediction, which is at the center of genome annotation, usually integrates various resources to compute consensus gene structures. However, many newly sequenced genomes have limited resources for gene prediction. To create high-quality gene models for the cucumber genome (Cucumis sativus var. sativus), we extended the EVidenceModeler gene prediction pipeline to incorporate massively parallel complementary DNA sequencing (RNA-Seq) reads from 10 cucumber tissues. We applied the new pipeline to the reassembled cucumber genome and compared our predicted protein-coding gene sets with a published set.