Kernel Techniques in Support Vector Machines for Classification of Biological Data

Hao Jiang, Wai-Ki Ching, Zeyu Zheng
DOI: https://doi.org/10.5815/ijitcs.2011.02.01
2011-01-01
International Journal of Information Technology and Computer Science
Abstract: In this paper, we consider the problem of protein classification, which is an important and active topic in bioinformatics. We propose a novel kernel based on the K-Spectrum Kernel that incorporates physico-chemical and biological properties of amino acids, as well as motif information, for the protein classification problem. A similarity matrix is constructed from an AAindex2 substitution matrix, which measures the distance between amino acid pairs. This is combined with the motif content, which places importance on the protein sequences, to construct the new kernel. We adopt the Eigen-matrix translation technique to improve the classification accuracy. Experimental results indicate that the string-based kernel, in conjunction with an SVM classifier, performs significantly better than the traditional spectrum kernel method. Furthermore, numerical examples support the use of the Eigen-matrix translation technique as a general strategy.
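For orientation, the following is a minimal sketch of the plain k-spectrum kernel that serves as the baseline in the abstract. It is not the kernel proposed in the paper: it omits the AAindex2-based similarity matrix, the motif weighting, and the Eigen-matrix translation step. The function names (`kmer_counts`, `spectrum_kernel`) and the toy sequences are illustrative assumptions, not taken from the paper.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every contiguous length-k substring (k-mer) of seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(x, y, k=3):
    """Plain k-spectrum kernel: inner product of the k-mer count vectors.

    Only k-mers present in both sequences contribute to the sum.
    """
    cx, cy = kmer_counts(x, k), kmer_counts(y, k)
    # Iterate over the smaller table; a Counter returns 0 for missing keys.
    if len(cx) > len(cy):
        cx, cy = cy, cx
    return sum(count * cy[kmer] for kmer, count in cx.items())


if __name__ == "__main__":
    # Hypothetical toy sequences, used only to exercise the function.
    print(spectrum_kernel("ABAB", "BABA", k=2))
```

A Gram matrix built from such a function can be passed to an SVM implementation that accepts precomputed kernels, which is one common way to pair a string kernel with an SVM classifier.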