Weighted Neighborhood Classifier for the Classification of Imbalanced Tumor Dataset

Shu-Lin Wang, Xueling Li, Jun-Feng Xia, Xiao-Ping Zhang
DOI: https://doi.org/10.1142/S0218126610006232
2011-01-01
Journal of Circuits Systems and Computers
Abstract: Machine learning is widely applied to molecular tumor classification based on gene expression profiles, but the sample imbalance problem is often overlooked. This paper proposes a subclass-weighted neighborhood classifier to address the imbalanced sample set problem, and a novel neighborhood rough set model that selects informative genes to improve classification performance. Experiments on three publicly available tumor datasets demonstrate that the proposed method is clearly effective on imbalanced datasets with an obscure boundary between two subtypes and for informative gene selection, and that it can achieve higher cross-validation accuracy with far fewer tumor-related genes.
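To make the idea of a subclass-weighted neighborhood vote concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not specify the distance metric, the neighborhood size, or the weighting formula, so the Euclidean distance, the fixed `radius`, the inverse-class-size weights, the nearest-sample fallback, and the function name `weighted_neighborhood_predict` are all assumptions made for illustration.

```python
import math
from collections import Counter


def weighted_neighborhood_predict(train_X, train_y, x, radius):
    """Predict the label of x by a class-weighted vote over its neighborhood.

    Assumptions (not taken from the paper): Euclidean distance, a fixed
    neighborhood radius, and vote weights equal to 1 / class size.
    """
    class_sizes = Counter(train_y)
    votes = {}
    distances = []
    for xi, yi in zip(train_X, train_y):
        d = math.dist(xi, x)
        distances.append((d, yi))
        if d <= radius:
            # Each neighbor votes with a weight inversely proportional to
            # the size of its class, so minority samples count for more.
            votes[yi] = votes.get(yi, 0.0) + 1.0 / class_sizes[yi]
    if not votes:
        # Empty neighborhood: fall back to the single nearest sample.
        return min(distances, key=lambda t: t[0])[1]
    return max(votes, key=votes.get)


# Toy one-gene dataset: six samples of subtype "A", two of subtype "B".
train_X = [(0.0,), (0.1,), (0.2,), (0.3,), (0.9,), (1.0,), (1.2,), (1.5,)]
train_y = ["A", "A", "A", "A", "A", "A", "B", "B"]

# The neighborhood of 1.05 holds two "A" samples and one "B" sample.
# An unweighted vote would pick "A"; the weighted vote compares
# 2 * (1/6) for "A" against 1 * (1/2) for "B".
print(weighted_neighborhood_predict(train_X, train_y, (1.05,), 0.2))
```

The toy query sits near the boundary between the two subtypes, which is where a plain majority vote tends to favor the larger class. Other weighting schemes would fit the same structure by changing only the vote increment.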