Code Churn: A Neglected Metric in Effort-Aware Just-in-Time Defect Prediction

Jinping Liu,Yuming Zhou,Yibiao Yang,Hongmin Lu,Baowen Xu
DOI: https://doi.org/10.1109/esem.2017.8
2017-01-01
Abstract:Background: An increasing research effort has devoted to just-in-time (JIT) defect prediction. A recent study by Yang et al. at FSE'16 leveraged individual change metrics to build unsupervised JIT defect prediction model. They found that many unsupervised models performed similarly to or better than the state-of-the-art supervised models in effort-aware JIT defect prediction. Goal: In Yang et al.'s study, code churn (i.e. the change size of a code change) was neglected when building unsupervised defect prediction models. In this study, we aim to investigate the effectiveness of code churn based unsupervised defect prediction model in effort-aware JIT defect prediction. Methods: Consistent with Yang et al.'s work, we first use code churn to build a code churn based unsupervised model (CCUM). Then, we evaluate the prediction performance of CCUM against the state-of-the-art supervised and unsupervised models under the following three prediction settings: cross-validation, time-wise cross-validation, and cross-project prediction. Results: In our experiment, we compare CCUM against the state-of-the-art supervised and unsupervised JIT defect prediction models. Based on six open-source projects, our experimental results show that CCUM performs better than all the prior supervised and unsupervised models. Conclusions: The result suggests that future JIT defect prediction studies should use CCUM as a baseline model for comparison when a novel model is proposed.
What problem does this paper attempt to address?