Classification Method Based on SVM for Human Gene Sequences

LIU Jian-li, LIU Chun-nian
2008-01-01
Abstract: To determine whether a given DNA sequence is an intergenic region or a gene region, training features are extracted from DNA sequences using a linguistics-based method, and the gene and intergenic regions of chromosome 22 are classified with the Support Vector Machine (SVM) technique. The prediction accuracy of the classifiers can exceed 85% without using any biological domain knowledge. By comparison, although the Binary Logistic Regression (BLR) technique also achieves relatively high classification accuracy, the training time of SVM compares very favorably with that of BLR.
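The abstract does not specify how the linguistics-based features are built. As an illustration only, the sketch below assumes one common reading: treat overlapping k-mers as "words" and use their relative frequencies as the feature vector. The function names, the choice of k, and the handling of ambiguous bases are hypothetical and not taken from the paper.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def kmer_vocabulary(k):
    """All 4**k DNA words of length k, in lexicographic order."""
    return ["".join(p) for p in product(ALPHABET, repeat=k)]


def kmer_frequencies(sequence, k=3):
    """Relative frequency of every k-mer in `sequence` (assumed feature scheme).

    Overlapping windows are counted. Windows containing characters
    outside A/C/G/T (for example N) are skipped.
    """
    seq = sequence.upper()
    vocab = kmer_vocabulary(k)
    valid = set(vocab)
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if seq[i:i + k] in valid
    )
    total = sum(counts.values())
    if total == 0:
        # Sequence too short or fully ambiguous: return an all-zero vector.
        return [0.0] * len(vocab)
    return [counts[word] / total for word in vocab]


# One fixed-length vector per sequence; k=2 gives 16 features.
features = kmer_frequencies("ATGCGATGNA", k=2)
```

Vectors of this kind, one per labeled gene or intergenic sequence, could then be passed to any SVM or logistic regression implementation for training and comparison.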