Influence of missing values replacement on disease classification analysis based on gene expression profiles

Dong Wang, Zheng Guo, Xia Li, Yingli Lv, Jing Zhu, Chenguang Wang
DOI: https://doi.org/10.3321/j.issn:1002-0470.2006.05.012
2006-01-01
Abstract: In this article, two missing value treatments (replacement with zeros and KNN estimation) were combined with three kinds of classifiers, support vector machine (SVM), K-nearest neighbor (KNN) and decision tree (DT), to evaluate their effect on classification across four data sets. The results showed that when the missing value rate was less than 5%, enough genes for classification remained and quite high classification accuracy could still be achieved.
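The two treatments named in the abstract can be illustrated with a minimal sketch. The abstract does not give the details of the KNN estimation, so everything below is an assumption made for illustration: rows are genes, columns are samples, missing entries are marked with `None`, distance is a root-mean-square difference over the columns both genes have observed, and the function names `zero_fill` and `knn_impute` are hypothetical.

```python
import math


def zero_fill(matrix):
    """Replace every missing entry (None) with 0.0; returns a new matrix."""
    return [[0.0 if v is None else v for v in row] for row in matrix]


def knn_impute(matrix, k=2):
    """Fill each missing entry with the mean of that column taken over the
    k nearest rows (genes) that have the column observed.

    Assumption: distance is the RMS difference over columns observed in
    both rows. This is an illustrative choice, not the paper's stated method.
    """
    filled = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for r, other in enumerate(matrix):
                if r == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = candidates[:k]
            # Fall back to zero when no neighbor has this column observed.
            filled[i][j] = (sum(v for _, v in nearest) / len(nearest)
                            if nearest else 0.0)
    return filled


# Toy expression matrix (made-up numbers): 4 genes x 3 samples.
genes = [
    [1.0, 2.0, None],
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 5.0],
    [9.0, 9.0, 9.0],
]

zero_filled = zero_fill(genes)
knn_filled = knn_impute(genes, k=2)
```

In this toy matrix the first gene's two nearest neighbors are the second and third genes, so the KNN estimate for its missing sample is the mean of their values in that column, whereas zero replacement ignores the neighboring profiles entirely. Either filled matrix could then be passed to a classifier such as SVM, KNN or a decision tree.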