FECS: A Cluster Based Feature Selection Method for Software Fault Prediction with Noises

Wangshu Liu, Shulong Liu, Qing Gu, Xiang Chen, Daoxu Chen
DOI: https://doi.org/10.1109/COMPSAC.2015.66
2015-01-01
Abstract: Noise is inevitable when mining software archives for software fault prediction. Although some researchers have investigated the noise tolerance of existing feature selection methods, few studies have proposed new feature selection methods designed to tolerate noise. To address this gap, we propose a novel method, FECS (FEature Clustering with Selection strategies). The method has two phases: a feature clustering phase, and a feature selection phase that uses three different heuristic search strategies. In our empirical studies, we choose real-world software projects, such as Eclipse and NASA, and inject class-level and feature-level noise simultaneously to simulate noisy datasets. Using classical feature selection methods as baselines, we confirm the effectiveness of FECS, and we provide guidelines for using FECS based on an analysis of the effects of varying either the percentage of selected features or the noise rate.
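To make the two-phase idea concrete, the following is a minimal Python sketch of a generic cluster-then-select pipeline, together with class-level noise injection by label flipping. It is not the FECS algorithm itself: the abstract does not name the clustering procedure or the three heuristic search strategies, so the greedy correlation-based clustering, the "most label-correlated feature per cluster" rule, the 0.8 threshold, and all function names here are illustrative assumptions. Feature-level noise is not covered.

```python
import math
import random


def inject_class_noise(labels, noise_rate, seed=0):
    """Flip a fraction `noise_rate` of binary (0/1) labels. Assumed noise model."""
    rng = random.Random(seed)
    n_flip = int(round(noise_rate * len(labels)))
    flip_idx = set(rng.sample(range(len(labels)), n_flip))
    return [1 - y if i in flip_idx else y for i, y in enumerate(labels)]


def pearson(a, b):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def cluster_features(columns, threshold=0.8):
    """Phase 1 (assumed): greedy clustering of feature columns.

    A feature joins the first cluster whose first member it correlates
    with at |r| >= threshold; otherwise it starts a new cluster.
    """
    clusters = []
    for j, col in enumerate(columns):
        for cluster in clusters:
            if abs(pearson(col, columns[cluster[0]])) >= threshold:
                cluster.append(j)
                break
        else:
            clusters.append([j])
    return clusters


def select_features(columns, labels, threshold=0.8):
    """Phase 2 (assumed): keep the most label-correlated feature per cluster."""
    selected = []
    for cluster in cluster_features(columns, threshold):
        best = max(cluster, key=lambda j: abs(pearson(columns[j], labels)))
        selected.append(best)
    return selected


if __name__ == "__main__":
    # Toy data: features 0 and 1 are nearly redundant, feature 2 is unrelated.
    columns = [[1, 2, 3, 4], [2, 4, 6, 9], [4, 1, 3, 2]]
    labels = [0, 0, 1, 1]
    noisy_labels = inject_class_noise(labels, noise_rate=0.25, seed=1)
    print(cluster_features(columns))
    print(select_features(columns, noisy_labels))
```

Clustering redundant features first and then picking a representative from each cluster keeps a single noisy feature from dominating the selected subset, which is one plausible reading of why a cluster-based design would help under noise.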