Support Vector Machine Based on Localized Multiple Kernel Learning in Pre-MicroRNA Classification

Hengyue Shi, Weifeng Wang, Peng Wu, Dong Wang
DOI: https://doi.org/10.1109/icecce49384.2020.9179258
2020-01-01
Abstract: MicroRNAs are short, conserved nucleotide sequences that play an important role in gene transcriptional regulation. Their expression varies with the cell environment and number, and many genomic fragments can fold into pseudo-microRNA hairpin structures, so detecting true microRNAs in a genome is a challenging task. Most methods classify candidates using features derived from sequence and secondary structure. However, the samples in a microRNA dataset may be drawn from multiple distributions. Existing methods focus on proposing new microRNA features rather than on the distribution characteristics of the dataset itself, which can make the classification results too subjective, lead the model to over-learn redundant features, and reduce classification performance. The Localized Multiple Kernel Learning method, by contrast, can fully exploit the local distribution of the data. This paper aims to establish a general cross-species microRNA classification model using a support vector machine based on Localized Multiple Kernel Learning, which is more versatile and achieves higher classification accuracy. A microRNA prediction model was built on a microRNA dataset using the proposed algorithm, and it achieved higher accuracy than other existing prediction methods. The experiments show that the idea of constructing a general microRNA prediction model is meaningful, and the model provides a reference for further validation of candidate microRNAs.
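
The abstract does not give the model's equations, so the following is only a minimal sketch of how a locally combined kernel is commonly built in Localized Multiple Kernel Learning, not the authors' implementation. It assumes a softmax gating function that assigns each sample its own kernel weights, and it combines base kernels as k(a, b) = sum over m of eta_m(a) * k_m(a, b) * eta_m(b). The function names, the choice of a linear and an RBF base kernel, and the gating parameters V and v0 are all hypothetical and chosen for illustration.

```python
import math


def softmax_gate(x, V, v0):
    """Per-sample kernel weights: eta_m(x) = softmax over m of (<v_m, x> + v0_m).

    V and v0 are hypothetical gating parameters; in a trained model they
    would be learned jointly with the SVM.
    """
    scores = [sum(vi * xi for vi, xi in zip(v, x)) + b for v, b in zip(V, v0)]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear_kernel(a, b):
    """Plain dot product between two feature vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def rbf_kernel(a, b, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative default."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def localized_kernel(a, b, kernels, V, v0):
    """Locally combined kernel: sum_m eta_m(a) * k_m(a, b) * eta_m(b)."""
    eta_a = softmax_gate(a, V, v0)
    eta_b = softmax_gate(b, V, v0)
    return sum(ea * k(a, b) * eb for ea, k, eb in zip(eta_a, kernels, eta_b))
```

Because the weights depend on the sample, different regions of the feature space can favor different base kernels, which is the sense in which such a model can adapt to data drawn from multiple distributions. A Gram matrix computed with localized_kernel could then be passed to any SVM solver that accepts a precomputed kernel.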