Prediction of Exons using Normalized Probability Parameters derived from Statistical Analysis of Coding Sequences

Ponraj Ponraj, A. Nivetha, Sherlyn Jemimah, A. Mugilan
DOI: https://doi.org/10.1109/ComPE49325.2020.9200154
2020-07-01
Abstract: Exon prediction is a major goal of computational gene prediction in eukaryotes and is often undertaken using neural networks and generalized Hidden Markov Models (HMMs). Examples include GENSCAN, TWINSCAN, N-SCAN, and AUGUSTUS. Some methods rely on aligning the sequences with either an informant genome (ROSETTA) or EST and cDNA sequences (N-SCAN_EST) to provide better predictions. The existing methods try to account for both short-range and long-range base-base interactions, which affects the accuracy of the prediction parameters. We propose a new approach, NPrP, which uses statistical analysis of coding and non-coding sequences to generate normalized probability parameters (NPrP). By modelling DNA interactions using deviation parameters, we can calculate a score at each position and use the scores to predict exon regions. Our initial results show a sensitivity of 0.71 and a specificity of 0.71.
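The abstract does not spell out how the NPrP parameters or the per-position scores are computed, so the following is only a minimal sketch of the general idea (score each position from coding versus non-coding statistics, then threshold the scores into candidate exon regions). It assumes, purely for illustration, smoothed dinucleotide log-odds as a stand-in for the paper's normalized probability parameters; the function names, the window size, the threshold, and the nucleotide-level definitions of sensitivity and specificity are all hypothetical choices, not the authors' method.

```python
import math
from collections import Counter

BASES = "ACGT"

def dinucleotide_freqs(seqs, pseudocount=1.0):
    """Relative dinucleotide frequencies with additive smoothing."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - 1):
            pair = s[i:i + 2]
            if all(b in BASES for b in pair):
                counts[pair] += 1
    total = sum(counts.values()) + pseudocount * 16
    return {a + b: (counts[a + b] + pseudocount) / total
            for a in BASES for b in BASES}

def log_odds_table(coding, noncoding):
    """Assumed stand-in for NPrP: log(coding freq / non-coding freq)."""
    fc = dinucleotide_freqs(coding)
    fn = dinucleotide_freqs(noncoding)
    return {k: math.log(fc[k] / fn[k]) for k in fc}

def position_scores(seq, table, window=9):
    """Mean log-odds of the dinucleotides in a window around each position."""
    seq = seq.upper()
    pair_scores = [table.get(seq[i:i + 2], 0.0) for i in range(len(seq) - 1)]
    half = window // 2
    scores = []
    for i in range(len(seq)):
        lo, hi = max(0, i - half), min(len(pair_scores), i + half)
        chunk = pair_scores[lo:hi]
        scores.append(sum(chunk) / len(chunk) if chunk else 0.0)
    return scores

def call_regions(scores, threshold=0.0, min_len=3):
    """Half-open (start, end) runs of positions scoring above the threshold."""
    regions, start = [], None
    for i, s in enumerate(scores):
        if s > threshold and start is None:
            start = i
        elif s <= threshold and start is not None:
            if i - start >= min_len:
                regions.append((start, i))
            start = None
    if start is not None and len(scores) - start >= min_len:
        regions.append((start, len(scores)))
    return regions

def sensitivity_specificity(predicted, actual):
    """Nucleotide-level TP/(TP+FN) and TN/(TN+FP) from boolean labels."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    tn = sum((not p) and (not a) for p, a in zip(predicted, actual))
    fp = sum(p and (not a) for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec
```

As a usage illustration, one might build the table with `log_odds_table(coding_seqs, noncoding_seqs)`, pass a query sequence through `position_scores`, and hand the result to `call_regions`. Note that some gene-prediction evaluations define specificity as TP/(TP+FP) instead; the abstract does not say which definition underlies the reported 0.71 values.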
Biology, Computer Science