Software-defect prediction within and across projects based on improved self-organizing data mining

Qing Zhang, Junhua Ren
DOI: https://doi.org/10.1007/s11227-021-04113-8
IF: 3.3
2021-10-11
The Journal of Supercomputing
Abstract: This paper proposes a new method for software-defect prediction based on self-organizing data mining, which can establish a causal relationship between software metrics and defects. Defect-prediction models are built for both intra-project and cross-project scenarios. For intra-project prediction, a self-organizing data mining model is built, with a smooth data-preprocessing step added to address data imbalance. For cross-project prediction, a self-organizing data mining model is likewise built; the difference between the source and target projects is reduced by selecting the source-project instances that have a larger correlation coefficient with the target project, and the defect-prediction model is then built on the selected instances. The method targets both classification and ranking prediction and is tested on public defect datasets. In the classification-prediction experiments, it is evaluated with the precision, F-measure, and AUC indicators. In the ranking-prediction experiments, its AAE and ARE values are optimized. The results indicate that the algorithm is an efficient and feasible method for software-defect prediction.
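The cross-project instance-selection step can be illustrated with a minimal sketch. The abstract does not say how the correlation coefficient is computed, so the code below assumes a Pearson correlation between each source instance's metric vector and the per-metric mean of the target project. The function names `pearson` and `select_source_instances` and the parameter `k` are hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for a constant vector; treat it as 0.
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_source_instances(source, target, k):
    """Return indices of the k source rows most correlated with the target.

    Assumption (not from the paper): the target project is summarized by
    the per-metric mean of its instances, and each source instance is
    scored by its Pearson correlation with that summary vector.
    """
    n_metrics = len(target[0])
    profile = [
        sum(row[j] for row in target) / len(target) for j in range(n_metrics)
    ]
    scored = [(pearson(row, profile), i) for i, row in enumerate(source)]
    # Highest correlation first; keep the k best-matching instances.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scored[:k]]
```

A defect-prediction model would then be trained only on the selected source rows. In practice the number of instances kept, or a correlation threshold used in its place, would have to be tuned per dataset.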