Better Predictors for Issue Lifetime

Mitch Rees-Jones, Matthew Martin, Tim Menzies
DOI: https://doi.org/10.48550/arXiv.1702.07735
2017-02-24
Software Engineering
Abstract: Predicting issue lifetime can help software developers, managers, and stakeholders effectively prioritize work, allocate development resources, and better understand project timelines. Progress has been made on this prediction problem, but prior work has reported low precision and high false alarm rates. The latest results also use complex models, such as random forests, that are hard to read. We address both issues by using small, readable decision trees (under 20 lines long) and correlation feature selection to predict issue lifetime, achieving high precision and low false alarms (medians of 71% and 13% respectively). We also address the problem of high class imbalance within issue datasets: when local data fails to train a good model, we show that cross-project data can be used in place of the local data. In fact, cross-project data works so well that we argue it should be the default approach for learning predictors for issue lifetime.
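The abstract reports precision and false alarm figures without defining them. As a minimal sketch, assuming the standard confusion-matrix definitions (precision = TP / (TP + FP), false alarm rate = FP / (FP + TN)), the two measures could be computed as follows. The function name and the toy labels are hypothetical and are not taken from the paper.

```python
def precision_and_false_alarm(actual, predicted):
    """Return (precision, false_alarm_rate) for binary labels.

    Assumes True marks the target class, e.g. "issue closes within the
    time window". These are the standard definitions, assumed here
    rather than quoted from the paper.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # true positives
    fp = sum(1 for a, p in pairs if not a and p)      # false positives
    tn = sum(1 for a, p in pairs if not a and not p)  # true negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    false_alarm = fp / (fp + tn) if (fp + tn) else 0.0
    return precision, false_alarm


# Hypothetical toy labels, for illustration only (not data from the paper).
actual = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, True, False, False, False, False]

precision, false_alarm = precision_and_false_alarm(actual, predicted)
print(f"precision={precision:.2f}, false alarm={false_alarm:.2f}")
```

Under these definitions a good predictor has precision near 1 and a false alarm rate near 0, which is the direction of the medians quoted in the abstract.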