PREDICTING SPLICE JUNCTION SITE IN DNA SEQUENCES WITH BAYESIAN NETWORK

李骜,王涛,冯焕清,王明会
DOI: https://doi.org/10.3321/j.issn:1000-6737.2003.04.017
2003-01-01
ACTA BIOPHYSICA SINICA
Abstract: Two new models for predicting splice junctions in eukaryotic DNA sequences were developed using Bayesian networks, one for donor sites and the other for acceptor sites. The topology and the upstream and downstream nodes of each model were optimized with regard to the biological characteristics of acceptor and donor sites. Both models were trained with a maximum likelihood (ML) algorithm for Bayesian network learning; test DNA sequence data were then fed into the models, and 10-fold cross-validation was used to evaluate prediction performance. The experimental results show that, on average, acceptor site detection achieved a sensitivity of 92.5% and a specificity of 94.0%, and donor site detection achieved a sensitivity of 92.3% and a specificity of 93.5%. These results show that the models outperform, in some respects, models based on the independent matrix and the conditional probability matrix, as well as the hidden Markov model, for splice junction site detection. The optimized Bayesian network models are therefore powerful tools for splice junction detection in eukaryotic genes.
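The training and evaluation pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the optimized network topology, so the sketch assumes the simplest possible structure, a chain in which each position depends only on the previous one. The function names, the pseudocount, the log-odds threshold, and the definition of specificity as TN / (TN + FP) are all assumptions made for illustration; some gene-finding literature defines specificity as TP / (TP + FP) instead.

```python
import math

ALPHABET = "ACGT"

def train_chain_bn(seqs, pseudo=1.0):
    """Estimate P(x_0) and P(x_i | x_{i-1}) from counts over aligned windows.

    Pure maximum likelihood corresponds to pseudo=0; the small pseudocount
    is an assumption added here so that unseen transitions avoid log(0).
    """
    n = len(seqs[0])
    first = {a: pseudo for a in ALPHABET}
    trans = [{p: {a: pseudo for a in ALPHABET} for p in ALPHABET}
             for _ in range(n - 1)]
    for s in seqs:
        first[s[0]] += 1
        for i in range(1, n):
            trans[i - 1][s[i - 1]][s[i]] += 1
    total = sum(first.values())
    first = {a: c / total for a, c in first.items()}
    for table in trans:
        for p in ALPHABET:
            row_total = sum(table[p].values())
            for a in ALPHABET:
                table[p][a] /= row_total
    return first, trans

def log_prob(model, s):
    """Log-probability of window s under a chain-structured model."""
    first, trans = model
    lp = math.log(first[s[0]])
    for i in range(1, len(s)):
        lp += math.log(trans[i - 1][s[i - 1]][s[i]])
    return lp

def classify(site_model, background_model, s, threshold=0.0):
    """Call a site when the log-odds score exceeds the (assumed) threshold."""
    return log_prob(site_model, s) - log_prob(background_model, s) > threshold

def sensitivity_specificity(predicted, actual):
    """Return (TP / (TP + FN), TN / (TN + FP)) for boolean label lists."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    return tp / (tp + fn), tn / (tn + fp)

def k_fold_indices(n, k=10):
    """Split indices 0..n-1 into k interleaved test folds."""
    return [list(range(i, n, k)) for i in range(k)]

# Toy, made-up windows for illustration only (not data from the paper).
toy_sites = ["AGGTAAGT", "AGGTGAGT", "CGGTAAGT", "AGGTAAGA"]
toy_background = ["ACCTTACA", "TTCATCCA", "CACTTGCA", "GTCACTAC"]
site_model = train_chain_bn(toy_sites)
background_model = train_chain_bn(toy_background)
```

In a 10-fold evaluation, each fold returned by `k_fold_indices` would serve once as the test set while the models are trained on the remaining folds, and the per-fold sensitivity and specificity would then be averaged.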