How Well Do Change Sequences Predict Defects? Sequence Learning from Software Changes.

Ming Wen,Rongxin Wu,Shing-Chi Cheung
DOI: https://doi.org/10.1109/tse.2018.2876256
IF: 7.4
2020-01-01
IEEE Transactions on Software Engineering
Abstract:Software defect prediction, which aims to identify defective modules, can assist developers in finding bugs and prioritizing limited quality assurance resources. Various features to build defect prediction models have been proposed and evaluated. Among them, process metrics are one important category. Yet, existing process metrics are mainly encoded manually from change histories and ignore the sequential information arising from the changes during software evolution. Are the change sequences derived from such information useful to characterize buggy program modules? How can we leverage such sequences to build good defect prediction models? Unlike traditional process metrics used for existing defect prediction models, change sequences are mostly vectors of variable length. This makes it difficult to apply such sequences directly in prediction models that are driven by conventional classifiers. To resolve this challenge, we utilize Recurrent Neural Network (RNN), which is a deep learning technique, to encode features from sequence data automatically. In this paper, we propose a novel approach called Fences, which extracts six types of change sequences covering different aspects of software changes via fine-grained change analysis. It approaches defects prediction by mapping it to a sequence labeling problem solvable by RNN. Our evaluations on 10 open source projects show that Fences can predict defects with high performance. In particular, our approach achieves an average F-measure of 0.657, which improves the prediction models built on traditional metrics significantly. The improvements vary from 31.6 to 46.8 percent on average. In terms of AUC, Fences achieves an average value of 0.892, and the improvements over baselines vary from 4.2 to 16.1 percent. Fences also outperforms the state-of-the-art technique which learns semantic features automatically from static code via deep learning.
What problem does this paper attempt to address?