Bioinformatics analysis of alternative polyadenylation in green alga Chlamydomonas reinhardtii using transcriptome sequences from three different sequencing platforms.

Zhixin Zhao,Xiaohui Wu,Praveen Kumar Raj Kumar,Min Dong,Guoli Ji,Qingshun Quinn Li,Chun Liang
DOI: https://doi.org/10.1534/g3.114.010249
2014-01-01
Abstract:Messenger RNA 3-end formation is an essential posttranscriptional processing step for most eukaryotic genes. Different from plants and animals where AAUAAA and its variants routinely are found as the main poly(A) signal, Chlamydomonas reinhardtii uses UGUAA as the major poly(A) signal. The advance of sequencing technology provides an enormous amount of sequencing data for us to explore the variations of poly(A) signals, alternative polyadenylation (APA), and its relationship with splicing in this algal species. Through genome-wide analysis of poly(A) sites in C. reinhardtii, we identified a large number of poly(A) sites: 21,041 from Sanger expressed sequence tags, 88,184 from 454, and 195,266 from Illumina sequence reads. In comparison with previous collections, more new poly(A) sites are found in coding sequences and intron and intergenic regions by deep-sequencing. Interestingly, G-rich signals are particularly abundant in intron and intergenic regions. The prevalence of different poly(A) signals between coding sequences and a 3-untranslated region implies potentially different polyadenylation mechanisms. Our data suggest that the APA occurs in about 68% of C. reinhardtii genes. Using Gene Ontolgy analysis, we found most of the APA genes are involved in RNA regulation and metabolic process, protein synthesis, hydrolase, and ligase activities. Moreover, intronic poly(A) sites are more abundant in constitutively spliced introns than retained introns, suggesting an interplay between polyadenylation and splicing. Our results support that APA, as in higher eukaryotes, may play significant roles in increasing transcriptome diversity and gene expression regulation in this algal species. Our datasets also provide useful information for accurate annotation of transcript ends in C. reinhardtii.
What problem does this paper attempt to address?