DeepLabel: Automated Issue Classification for Issue Tracking Systems

Zhong Li, Minxue Pan, Yu Pei, Tian Zhang, Linzhang Wang, Xuandong Li
DOI: https://doi.org/10.1145/3545258.3545276
2022-01-01
Abstract: With the growth of Issue Tracking Systems, issue reports have become an important source of data for software maintenance and evaluation. Issue classification, one of the most important techniques for this purpose, aims to automatically distinguish bug-related issues from other issues using machine learning algorithms. However, existing issue classification approaches remain inadequate, either because they use the textual fields of issues incorrectly or because their feature representation methods are ineffective. In this paper, we propose a novel issue classification approach named DeepLabel. DeepLabel predicts issue types with an ensemble of field-specific models, each applied to a different textual field, so as to make maximum use of the information those fields contain. In addition, DeepLabel adopts Word2Vec combined with attention-based Bi-directional Long Short-Term Memory (ABLSTM) as the feature extractor for the field-specific models in order to effectively extract semantic information from the textual fields. We conduct an empirical study on a widely used issue dataset to evaluate the effectiveness of DeepLabel. The results demonstrate that DeepLabel significantly outperforms the state-of-the-art approaches, correctly identifying more bug issues (160.1 vs. 140.1) and more non-bug issues (345.7 vs. 325.4) on average than the best existing approach.
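The ensemble idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the field-specific predictions are combined, so the averaging (soft voting) rule below is an assumption, as are the field names `title` and `description`, the function name `ensemble_predict`, and the 0.5 threshold. Each probability stands in for the output of one field-specific model; the Word2Vec and ABLSTM feature extractor itself is not reproduced here.

```python
from statistics import fmean


def ensemble_predict(field_probs, threshold=0.5):
    """Combine per-field bug probabilities by simple averaging.

    field_probs maps a textual field name (hypothetical, e.g. "title")
    to the probability, from that field's model, that the issue is a bug.
    Averaging is an assumed combination rule, not the paper's stated one.
    """
    if not field_probs:
        raise ValueError("need at least one field-specific prediction")
    score = fmean(field_probs.values())
    label = "bug" if score >= threshold else "non-bug"
    return label, score


# Hypothetical outputs of two field-specific models for one issue.
label, score = ensemble_predict({"title": 0.9, "description": 0.6})
print(label, round(score, 2))
```

Averaging lets a confident prediction from one field offset a weak signal from another, which is one simple way to use every textual field rather than a single concatenated text.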