Operon Prediction Based on SVM

Guo-qing Zhang, Zhi-wei Cao, Qing-ming Luo, Yu-dong Cai, Yi-xue Li
DOI: https://doi.org/10.1016/j.compbiolchem.2006.03.002
IF: 3.737
2006-01-01
Computational Biology and Chemistry
Abstract: The operon is a specific functional organization of genes found in bacterial genomes. Most genes within operons share common features. Here, the support vector machine (SVM) approach is used to predict operons at the genomic level. Four features were chosen as SVM input vectors: the intergenic distances, the number of common pathways, the number of conserved gene pairs, and the mutual information of phylogenetic profiles. The analysis reveals that these common properties are indeed characteristic of genes within operons and differ from those of non-operonic genes. Jackknife testing indicates that these input feature vectors, used with an RBF-kernel SVM, achieve high accuracy. To validate the method, Escherichia coli K12 and Bacillus subtilis were taken as benchmark genomes of known operon structure. The prediction results for both show that the SVM can detect operon genes in target genomes efficiently and offers a satisfactory balance between sensitivity and specificity.
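The abstract does not say how the mutual information of phylogenetic profiles is computed or how the RBF kernel is parameterized, so the following is only a minimal sketch under stated assumptions: profiles are taken to be binary presence/absence vectors across reference genomes, mutual information is the standard Shannon quantity in bits, and the kernel is the usual Gaussian form exp(-gamma * ||x - y||^2). The function names and the gamma value are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(profile_a, profile_b):
    """Shannon mutual information (bits) between two equal-length profiles.

    Assumption: each profile is a sequence of 0/1 flags marking whether a
    gene has a homolog in each reference genome.
    """
    if len(profile_a) != len(profile_b) or not profile_a:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(profile_a)
    count_a = Counter(profile_a)
    count_b = Counter(profile_b)
    count_ab = Counter(zip(profile_a, profile_b))
    mi = 0.0
    # Sum p(a,b) * log2(p(a,b) / (p(a) * p(b))) over observed pairs only,
    # so no zero-probability term ever reaches the logarithm.
    for (a, b), c in count_ab.items():
        p_ab = c / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel between two feature vectors.

    Assumption: gamma=0.5 is an illustrative default, not a tuned value.
    """
    squared_distance = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * squared_distance)


if __name__ == "__main__":
    gene1 = [1, 1, 0, 0]
    gene2 = [1, 1, 0, 0]  # same presence/absence pattern as gene1
    gene3 = [1, 0, 1, 0]  # statistically independent of gene1
    print(mutual_information(gene1, gene2))
    print(mutual_information(gene1, gene3))
    print(rbf_kernel([0.0, 1.0], [0.0, 1.0]))
```

A gene pair whose profiles co-vary across genomes scores higher than a pair with unrelated profiles, which is the kind of signal such a feature would contribute to the classifier alongside the other three inputs.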