Imbalanced learning for clinical survival group prediction of brain tumor patients

Mu Zhou, Lawrence O. Hall, Dmitry B. Goldgof, Robert J. Gillies, Robert A. Gatenby
DOI: https://doi.org/10.1117/12.2075606
2015-01-01
Abstract: Accurate computer-aided prediction of survival time for brain tumor patients requires a thorough understanding of clinical data, since such data provide useful prior knowledge for learning models. However, to simplify the learning process, traditional settings often assume datasets with equally distributed classes, which does not reflect the distributions typically encountered in practice. In this paper, we investigate the problem of mining knowledge from an imbalanced dataset (i.e., one with a skewed class distribution) to predict survival time. In particular, we propose an algorithmic framework to predict survival groups of brain tumor patients using multi-modality MRI data. The imbalanced distribution and the classifier design are considered jointly: 1) we used the Synthetic Minority Over-sampling Technique (SMOTE) to compensate for the imbalanced distribution; 2) we adopted a predictive linear regression model to learn a pair of class-specific dictionaries, which were derived from the rebalanced data. We tested the proposed framework on a dataset of 42 patients with Glioblastoma Multiforme (GBM) tumors whose scans were obtained from The Cancer Genome Atlas (TCGA). Experimental results showed that the proposed method achieved 95.24% accuracy.
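The over-sampling step named in the abstract can be illustrated with a minimal NumPy sketch of the general SMOTE idea: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. This is not the authors' implementation. The function name `smote`, the neighbour count `k=5`, the four-dimensional random features, and the 30/12 class split are all assumptions chosen for illustration; the abstract reports only the total of 42 patients.

```python
import numpy as np


def smote(minority, n_synthetic, k=5, seed=0):
    """Hypothetical helper: create n_synthetic points by interpolating
    between minority samples and their k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    k = min(k, n - 1)  # cannot have more neighbours than other samples

    # Pairwise Euclidean distances within the minority class.
    diff = minority[:, None, :] - minority[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Pick a random base sample and one of its k neighbours per new point.
    base = rng.integers(0, n, size=n_synthetic)
    pick = rng.integers(0, k, size=n_synthetic)
    nbr = neighbours[base, pick]

    # Interpolate at a random position along the base-to-neighbour segment.
    gap = rng.random((n_synthetic, 1))
    return minority[base] + gap * (minority[nbr] - minority[base])


# Toy data only: the 30/12 split and the features are assumed, not from the paper.
rng = np.random.default_rng(42)
X_major = rng.normal(0.0, 1.0, size=(30, 4))
X_minor = rng.normal(2.0, 1.0, size=(12, 4))

X_new = smote(X_minor, n_synthetic=len(X_major) - len(X_minor))
X_minor_balanced = np.vstack([X_minor, X_new])
print(X_major.shape, X_minor_balanced.shape)
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region already occupied by the minority class. In a setting like the one described, over-sampling would normally be applied to the training portion only, so that synthetic samples do not leak into evaluation.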