Abstract:

BACKGROUND: One of the essential processing events during pre-mRNA maturation is the post-transcriptional addition of a polyadenine [poly(A)] tail. The 3'-end poly(A) tract protects mRNA from unregulated degradation, and indicates the integrity of mRNA through recognition by mRNA export and translation machinery. The position of a poly(A) site is predetermined by signals in the pre-mRNA sequence that are recognized by a complex of polyadenylation factors. These signals are generally tri-part sequence patterns around the cleavage site that serves as the future poly(A) site. In plants, there is little sequence conservation among these signal elements, which makes it difficult to develop an accurate algorithm to predict the poly(A) site of a given gene. We attempted to solve this problem.

RESULTS: Based on our current working model and the profile of nucleotide sequence distribution of the poly(A) signals and around poly(A) sites in Arabidopsis, we have devised a Generalized Hidden Markov Model-based algorithm to predict potential poly(A) sites. The high specificity and sensitivity of the algorithm were demonstrated by testing several datasets, and at the best combinations, both reach 97%. The accuracy of the program, called poly(A) site sleuth or PASS, has been demonstrated by the prediction of many validated poly(A) sites. PASS also predicted the changes of poly(A) site efficiency in poly(A) signal mutants that were constructed and characterized by traditional genetic experiments. The efficacy of PASS was demonstrated by predicting poly(A) sites within long genomic sequences.

CONCLUSION: Based on the features of plant poly(A) signals, a computational model was built to effectively predict the poly(A) sites in Arabidopsis genes. The algorithm will be useful in gene annotation because a poly(A) site signifies the end of the transcript. This algorithm can also be used to predict alternative poly(A) sites in known genes, and will be useful in the design of transgenes for crop genetic engineering by predicting and eliminating undesirable poly(A) sites.

Chapter 2 Prediction of Plant mRNA Polyadenylation Sites

Prediction of plant mRNA polyadenylation sites.

Predictive Modeling of Plant Messenger RNA Polyadenylation Sites

A Classification-Based Prediction Model of Messenger RNA Polyadenylation Sites

Implementation of a Classification-Based Prediction Model for Plant mRNA Poly(A) Sites

Computational Analysis of Plant Polyadenylation Signals.

PASPA: a Web Server for mRNA Poly(A) Site Predictions in Plants and Algae

Characterization and Prediction of mRNA Alternative Polyadenylation Sites in Rice Genes.

Comprehensive recognition of messenger RNA polyadenylation patterns in plants

Prediction of Plant Poly(A) Sites Based on GHMM-RWT

Genome-wide Identification and Predictive Modeling of Polyadenylation Sites in Eukaryotes

Discovering putative peptides encoded from non-coding RNAs in ribosome profiling data of Arabidopsis thaliana.

A Research on Prediction of Polyadenylation Sites Based on Neural Network

Genome Level Analysis of Rice mRNA 3′-End Processing Signals and Alternative Polyadenylation

Messenger RNA Polyadenylation Site Recognition in Green Alga Chlamydomonas reinhardtii

Identification of Plant Messenger RNA Polyadenylation Sites Using Length-Variable Second Order Markov Model

Identification of Secondary Structures Surrounding Plant Polyadenylation Sites

In Silico Prediction of mRNA Poly(A) Sites in Chlamydomonas reinhardtii

High throughput characterizations of poly(A) site choice in plants.

Alternative polyadenylation and gene expression regulation in plants.

Compilation of mRNA polyadenylation signals in Arabidopsis revealed a new signal element and potential secondary structures.