Cross-Project Issue Classification Based On Ensemble Modeling In A Social Coding World

Yarong Zeng, Yue Yu, Qiang Fan, Xunhui Zhang, Tao Wang, Gang Yin, Huaimin Wang
DOI: https://doi.org/10.1007/978-3-030-04212-7_24
2018-01-01
Abstract: The simplified, less formal contribution mechanisms of social coding are attracting more and more contributors to collaborative software development. To reduce the burden on project core teams, various automated and intelligent approaches based on machine learning and data mining have been proposed, but they are limited by a lack of training data. In this paper, we conduct an extensive empirical study of transferring and aggregating reusable models across projects in the context of issue classification, based on a large-scale dataset of 799 open source projects and more than 795,000 issues. We propose a novel cross-project approach that integrates multiple models learned from various source projects to classify issues in a target project. We evaluate our approach through comparative experiments against within-project classification and a typical cross-project method called Bellwether. The results show that our cross-project approach based on ensemble modeling achieves performance comparable to within-project classification and outperforms Bellwether.
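To make the idea of integrating models from several source projects concrete, here is a minimal sketch. The abstract does not say which base classifier or aggregation rule the paper uses, so everything here is an assumption for illustration: the toy keyword-count model, the majority vote, the function names, and the three tiny source projects are all hypothetical.

```python
from collections import Counter

def train_keyword_model(issues):
    """Toy per-project model (assumption): count token occurrences per label."""
    counts = {}
    for text, label in issues:
        for token in text.lower().split():
            counts.setdefault(token, Counter())[label] += 1
    return counts

def predict(model, text, default="other"):
    """Score labels by summed token counts; use `default` if no token is known."""
    scores = Counter()
    for token in text.lower().split():
        scores.update(model.get(token, {}))
    return scores.most_common(1)[0][0] if scores else default

def ensemble_predict(models, text):
    """Majority vote over the source-project models (assumed aggregation rule)."""
    votes = Counter(predict(m, text) for m in models)
    return votes.most_common(1)[0][0]

# Hypothetical labeled issues from three source projects.
source_projects = {
    "project_a": [("crash on startup", "bug"), ("add dark mode", "feature")],
    "project_b": [("crash when saving file", "bug"), ("support export to csv", "feature")],
    "project_c": [("error crash in parser", "bug"), ("add support for plugins", "feature")],
}

# One model per source project; the target project contributes no training data.
models = [train_keyword_model(issues) for issues in source_projects.values()]
label = ensemble_predict(models, "app crash on login")
print(label)
```

The point of the sketch is only the structure: each source project yields its own model, and an issue from the target project is labeled by combining their predictions, so the target needs no labeled issues of its own.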