Classifying osteosarcoma patients using machine learning approaches

Zhi Li, S. Soroushmehr, Yi-Yang Hua, Min Mao, Yunping Qiu, K. Najarian
DOI: https://doi.org/10.1109/EMBC.2017.8036768
2017-07-01
Abstract: Metabolomic data analysis presents a unique opportunity to advance our understanding of osteosarcoma, a common bone malignancy for which genomic and proteomic studies have enjoyed limited success. One of the major goals of metabolomic studies is to classify osteosarcoma in its early stages, which is required for metastasectomy treatment. In this paper, we subject metabolomic data on osteosarcoma patients, collected by the SJTU team, to three classification methods: logistic regression, support vector machine (SVM) and random forest (RF). Their performance is evaluated and compared using receiver operating characteristic (ROC) curves. All three classifiers successfully distinguish between healthy control and tumor cases, with random forest outperforming the other two in cross-validation on the training set (the accuracy rates for logistic regression, support vector machine and random forest are 88%, 90% and 97%, respectively). Random forest achieved an overall accuracy of 95% and an area under the ROC curve (AUC) of 0.99 on the test set.
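The two evaluation measures named in the abstract, accuracy and the area under the ROC curve, can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`roc_curve_points`, `auc`, `accuracy`), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the numbers it produces have no relation to the paper's results. Any of the three classifiers could supply the scores, for example as a predicted probability of the tumor class.

```python
def roc_curve_points(labels, scores):
    """Return ROC points (false positive rate, true positive rate).

    labels: 1 for tumor, 0 for healthy control (an assumed encoding).
    scores: classifier outputs, higher meaning "more likely tumor".
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # Sweep the decision threshold from the highest score downwards.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Samples with tied scores cross the threshold together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples classified correctly at a fixed threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Made-up toy data, NOT taken from the paper.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]

toy_auc = auc(roc_curve_points(labels, scores))
toy_accuracy = accuracy(labels, scores)
print(f"AUC = {toy_auc:.4f}, accuracy = {toy_accuracy:.3f}")
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two are usually reported together.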
Medicine