Application of Hidden Markov Model in the Recognition of Splicing Sites

夏慧煜,周晴,李衍达
DOI: https://doi.org/10.3321/j.issn:1000-0054.2002.09.022
2002-01-01
Abstract: Recognizing splicing sites is an important step in gene recognition. Because current gene recognition algorithms rely mainly on global features of coding regions rather than on information specific to the splicing sites, they usually cannot locate the splicing sites accurately. Since neighboring bases in the conserved sequences around splicing sites are correlated, a first-order Markov chain was used to model this correlation. Based on this model, a specialized hidden Markov model (HMM) method for recognizing splicing sites was built. Experimental results show that the HMM description of the conserved sequences around splicing sites fits the real data well, and that the method captures the statistical characteristics of both the marginal and the conditional distributions (transition probabilities) of the conserved sequences. When the method is applied to recognize both true and false splicing sites, the recognition rates exceed 90%.
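The abstract does not give the model's details, so the following is only a minimal sketch of the general idea: a position-specific first-order Markov chain whose marginal (first-base) and conditional (transition) probabilities are estimated from aligned windows around candidate sites, then used to score a window by log-odds. The function names, the pseudocount, and the toy sequences are all illustrative assumptions, not the authors' data or implementation.

```python
import math

ALPHABET = "ACGT"

def train_markov_chain(seqs, pseudocount=1.0):
    """Estimate a position-specific first-order Markov chain from aligned,
    equal-length sequences. Returns (init, trans), where init[b] is the
    marginal probability of base b at position 0 and trans[i][a][b] is
    P(base b at position i+1 | base a at position i)."""
    length = len(seqs[0])
    # Marginal distribution of the first base, smoothed with pseudocounts.
    init_counts = {b: pseudocount for b in ALPHABET}
    for s in seqs:
        init_counts[s[0]] += 1
    total = sum(init_counts.values())
    init = {b: c / total for b, c in init_counts.items()}
    # Conditional distributions between each pair of adjacent positions.
    trans = []
    for i in range(length - 1):
        counts = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
        for s in seqs:
            counts[s[i]][s[i + 1]] += 1
        trans.append({
            a: {b: c / sum(row.values()) for b, c in row.items()}
            for a, row in counts.items()
        })
    return init, trans

def log_likelihood(seq, model):
    """Log-probability of seq under the chain: marginal term plus transitions."""
    init, trans = model
    ll = math.log(init[seq[0]])
    for i in range(len(seq) - 1):
        ll += math.log(trans[i][seq[i]][seq[i + 1]])
    return ll

def log_odds(seq, true_model, false_model):
    """Positive values favor the true-site model, negative the false-site model."""
    return log_likelihood(seq, true_model) - log_likelihood(seq, false_model)

# Toy, made-up training windows (hypothetical; not taken from the paper).
true_sites = ["AGGTAAGT", "AGGTGAGT", "CGGTAAGT", "AGGTAAGA"]
false_sites = ["ACGTTCCA", "TTGTCCAG", "GAGTCTTA", "CCGTATCG"]

true_model = train_markov_chain(true_sites)
false_model = train_markov_chain(false_sites)

for window in ("AGGTAAGT", "TTGTCCAG"):
    print(window, round(log_odds(window, true_model, false_model), 3))
```

A window would be classified as a true site when its log-odds score exceeds a chosen threshold (zero is a natural default). The separate true-site and false-site models are one plausible way to read "recognize both the true and false splicing sites"; the paper itself may structure the decision differently.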