The Role of Attribute Ranker Using Classification for Software Defect-Prone Data-sets Model: an Empirical Comparative Study

Maaz Rasheed Malik, Liu Yining, Salahuddin Shaikh
DOI: https://doi.org/10.1109/syscon47679.2020.9275860
2020-01-01
Abstract: Feature selection is closely tied to reducing the size of a data set. Its goal is to identify the significant features in a data set and to discard the remaining features as unimportant or redundant. Because feature selection lowers the dimensionality of the data, data mining algorithms can run faster and more effectively on the reduced data. In this paper we investigate feature extraction by Principal Component Analysis (PCA) combined with the attribute ranker search technique. In practice, PCA with the attribute ranker search strategy is used not only to save storage space or to improve the accuracy and computational efficiency of the classification algorithm; it can also improve predictive performance by reducing the curse of dimensionality, particularly when working with software data sets labeled as defect-prone or non-defective. We used 10 data sets from the NASA repository, each with a binary class (defective or non-defective). We also used 6 classifiers for a comparative analysis between the objective model and the real data sets. We present a comparative analysis of the PCA ranker search method against the no-PCA method, and we also compare the classifiers' efficiency with one another. Bagging and Multilayer Perceptron were the most efficient and useful classifiers across all attribute ranking search methods. In the comparison between classifiers, however, Naïve Bayes and Multilayer Perceptron showed a clear increase in the percentage of correctly classified instances for software fault prediction overall.
A further comparison between the PCA ranker search method and the no-PCA method shows that the attribute ranker search method improves accuracy and efficiency on software fault prediction data sets relative to the no-PCA method.
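The PCA-with-ranker step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pca_rank`, the 0.95 variance threshold, and the synthetic data are all assumptions made for illustration, and the abstract does not state which tool or settings were used. The sketch standardizes the attributes, ranks the principal components by eigenvalue, and keeps the leading components that together cover the chosen share of variance.

```python
import numpy as np

def pca_rank(X, variance_covered=0.95):
    """Rank principal components by eigenvalue and keep the leading ones.

    variance_covered is an assumed threshold, not a value from the paper.
    """
    X = np.asarray(X, dtype=float)
    # Standardize each attribute so the decomposition uses the correlation matrix.
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # constant attributes carry no variance
    Z = (X - X.mean(axis=0)) / std
    corr = np.atleast_2d(np.cov(Z, rowvar=False))
    # eigh returns eigenvalues in ascending order; reverse to rank descending.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = eigvals / eigvals.sum()
    # Smallest number of ranked components whose cumulative share reaches the threshold.
    k = int(np.searchsorted(np.cumsum(ratio), variance_covered)) + 1
    k = min(k, len(eigvals))
    return Z @ eigvecs[:, :k], ratio[:k]

# Synthetic stand-in for software metrics (hypothetical data, not a NASA data set):
# attribute b is almost a copy of attribute a, so one component is redundant.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = 2.0 * a + 0.01 * rng.normal(size=200)
c = rng.normal(size=200)
X = np.column_stack([a, b, c])

reduced, ratio = pca_rank(X)
print(reduced.shape, ratio)
```

The reduced matrix would then be passed to each classifier in place of the original attributes, and the percentage of correctly classified instances compared against the same classifier trained without PCA.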