Predicting Breast Cancer Survivability Using Random Forest and Multivariate Adaptive Regression Splines

Dengju Yao, Jing Yang, Xiaojuan Zhan
DOI: https://doi.org/10.1109/emeit.2011.6023012
2011-01-01
Abstract: In this paper, we propose a hybrid of the random forest and multivariate adaptive regression splines (MARS) algorithms for building a breast cancer survivability prediction model. We use random forest to perform a preliminary screening of the variables and to obtain their importance ranks. A new dataset is then extracted from the initial WDBC dataset according to the top-k important predictors and is input into the MARS procedure, which builds an interpretable model for predicting breast cancer survivability. The capability of this combined method is evaluated using basic performance measurements (e.g., accuracy, sensitivity, and specificity) along with 10-fold cross-validation. Experimental results show that the proposed method provides higher accuracy and a relatively simple model.
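The evaluation protocol named in the abstract (accuracy, sensitivity, and specificity under 10-fold cross-validation) can be illustrated with a minimal sketch in plain Python. This is not the authors' code. The function names `basic_metrics` and `k_fold_indices` are hypothetical, label 1 is assumed to be the positive class, and the folds are contiguous and unshuffled, whereas a real experiment would likely shuffle or stratify the samples first.

```python
from typing import Dict, List, Sequence, Tuple


def basic_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity for binary labels.

    Assumption: label 1 is the positive class and label 0 the negative class.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn
    return {
        # Fraction of all samples classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # True positive rate: positives correctly identified.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # True negative rate: negatives correctly identified.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds.

    Returns a list of (train_indices, test_indices) pairs. Fold sizes
    differ by at most one sample. No shuffling is performed.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds
```

In a full experiment, each `(train, test)` pair would be used to fit the model on the training indices and score it on the held-out indices with `basic_metrics`, and the ten per-fold results would then be averaged.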