Abstract

Background: Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro and to transform, and it has a large genome; as a result, little genomic information is available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generating large expression datasets for functional genomic analysis, which is especially suitable for non-model species with unsequenced genomes.

Results: Using high-throughput Illumina RNA-seq, the transcriptome from poly(A)+ RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximately 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes (788 contig clusters and 126,306 singletons), with an average length of 355 bp and an N50 of 506 bp. This number of unigenes was 10-fold higher than the number of existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (UniProt, NR and COGs at NCBI, Pfam, InterPro and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and with natural product pathways that are important to tea quality, such as the flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximately 20-fold increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated, and their expression patterns in different organs of the tea plant were analyzed by RT-PCR and quantitative real-time PCR (qRT-PCR).

Conclusions: An extensive transcriptome dataset has been obtained from deep sequencing of the tea plant. The coverage of the transcriptome is comprehensive enough to discover all known genes of several major metabolic pathways. This transcriptome dataset can serve as an important public information platform for gene expression, genomics, and functional genomic studies in C. sinensis.

Construction of cDNA library of Camellia sinensis cv. Ziyang 1 and primary analysis of expressed sequence tags (ESTs)

Construction of Tender Shoots cDNA Library and Preliminary Analysis of Expressed Sequence Tags Sequencing of Tea Plant

Generation and analysis of expressed sequence tags from the tender shoots cDNA library of tea plant (Camellia sinensis)

Development and Preliminary Application of cDNA Microarray of Tea Plant (Camellia sinensis)

Sequencing of cDNA Clones and Analysis of the Expressed Sequence Tags (ESTs) Properties of Young Tea Plant (Camellia sinensis) Shoots

Construction of EST Data Base and Establishment of cDNA Array

Expressed sequence tags from organ-specific cDNA libraries of tea (Camellia sinensis) and polymorphisms and transferability of EST-SSRs across Camellia species

Construction and Characterization of a Bacterial Artificial Chromosome Library for Camellia sinensis

Deep sequencing of the Camellia sinensis transcriptome revealed candidate genes for major metabolic pathways of tea-specific compounds

Recent progress in the molecular biology of tea (Camellia sinensis) based on the expressed sequence tag strategy: A review

Progress in Functional Gene Cloning of Camellia sinensis

Floral transcriptome sequencing for SSR marker development and linkage map construction in the tea plant (Camellia sinensis)

Differential gene expression in tea (Camellia sinensis L.) calli with different morphologies and catechin contents

Comparative transcriptome database for Camellia sinensis reveals genes important for flavonoid synthesis in tea plants

Transcriptome Characterization for Camellia Sect. Oleifera Based on the 592 499 ESTs

Development of a Genome‐wide 200K SNP Array and Its Application for High‐density Genetic Mapping and Origin Analysis of Camellia sinensis

Identification and characterization of 74 novel polymorphic EST-SSR markers in the tea plant, Camellia sinensis (Theaceae)

Complete chloroplast genome sequence of Camellia sinensis: genome structure, adaptive evolution, and phylogenetic relationships

Construction of a cDNA Library of Vitis pseudoreticulata Native to China Inoculated with Uncinula necator and the Analysis of Potential Defence-related Expressed Sequence Tags (ESTs)

The Complete Chloroplast Genome Sequence of Camellia sinensis var. sinensis Cultivar Tieguanyin (Theaceae)