SGL-SVM: A novel method for tumor classification via support vector machine with sparse group Lasso

Yanhao Huo, Lihui Xin, Chuanze Kang, Minghui Wang, Qin Ma, Bin Yu
DOI: https://doi.org/10.1016/j.jtbi.2019.110098
IF: 2.405
2020-01-01
Journal of Theoretical Biology
Abstract: As the study of gene expression data deepens, the role of tumor classification in clinical medicine has become more apparent, as has the sparsity of gene expression data within and between groups. This paper therefore studies tumor classification based on the sparsity characteristics of genes. We propose a new tumor classification method, SGL-SVM, which combines the sparse group Lasso (least absolute shrinkage and selection operator) with a support vector machine. First, feature genes are preselected from the normalized tumor datasets using the Kruskal-Wallis rank sum test. Second, a sparse group Lasso performs further selection. Finally, a support vector machine serves as the classifier. We validate the proposed method on microarray and NGS datasets. It is first tested by 10-fold cross-validation on three two-class and five multi-class microarray datasets and compared with three other classifiers. SGL-SVM is then applied to the BRCA and GBM datasets and tested by 5-fold cross-validation. These experiments yield satisfactory accuracy, which is compared with that of other proposed methods. The experimental results show that the proposed method achieves higher classification accuracy and selects fewer feature genes, and that it can be widely applied to the classification of high-dimensional, small-sample tumor datasets. The source code and all datasets are available at https://github.com/QUST-AIBBDRC/SGL-SVM/. (C) 2019 Elsevier Ltd. All rights reserved.
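To illustrate how a sparse group Lasso can produce sparsity both within and between groups, here is a minimal sketch of the proximal step for the standard sparse group Lasso penalty, lam_l1 * ||b||_1 + lam_group * sum_g sqrt(p_g) * ||b_g||_2. The abstract does not state the exact formulation used in the paper, so this penalty, the function names (`soft_threshold`, `sgl_prox`), the gene grouping, and the parameter values are all assumptions for illustration, not the authors' implementation.

```python
import math


def soft_threshold(x, t):
    """Shrink x toward zero by t (proximal operator of t * |x|)."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def sgl_prox(beta, groups, lam_l1, lam_group):
    """Proximal operator of the assumed sparse group Lasso penalty.

    beta   : list of coefficients, one per gene (hypothetical values)
    groups : list of index lists that partition beta into gene groups
    """
    out = [0.0] * len(beta)
    for idx in groups:
        # Step 1: element-wise soft-thresholding zeroes weak genes
        # inside a group (within-group sparsity).
        shrunk = [soft_threshold(beta[i], lam_l1) for i in idx]
        norm = math.sqrt(sum(v * v for v in shrunk))
        # Step 2: group-wise shrinkage zeroes the whole group when its
        # remaining norm is below the group threshold (between-group sparsity).
        thresh = lam_group * math.sqrt(len(idx))
        scale = max(1.0 - thresh / norm, 0.0) if norm > 0 else 0.0
        for i, v in zip(idx, shrunk):
            out[i] = scale * v
    return out


# Hypothetical example: two groups of two genes each.
beta = [3.0, -0.5, 0.2, 0.1]
groups = [[0, 1], [2, 3]]
result = sgl_prox(beta, groups, lam_l1=0.5, lam_group=0.5)
```

In this toy input, the second gene is removed by the element-wise step while its group survives, and the second group is removed entirely, so only the first coefficient stays nonzero. In a pipeline like the one the abstract describes, the genes with nonzero coefficients would be the ones passed on to the classifier.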