Abstract: Bug fixing is one of the most important activities in software development and maintenance. Bugs are reported, recorded, and managed in bug tracking systems such as Bugzilla. A bug report typically contains many fields, such as product, component, severity, priority, fixer, operating system (OS), and platform, which provide important information for the bug triaging and fixing process. Our previous study found that approximately 80% of bug reports have their fields reassigned and refined at least once, and that bugs with reassigned and refined fields take more time to fix than bugs without them. Automatically predicting which bug report fields will get reassigned and refined could therefore help developers save bug fixing time. Because a single bug report can have multiple field reassignments and refinements (e.g., its product, component, fixer, and other fields can all be reassigned and refined), in this paper we propose a multi-label learning algorithm to predict which bug report fields might be reassigned and refined. We note that bug reports with some types of reassignment and refinement (e.g., bugs whose severity field gets reassigned and refined) make up only a small proportion of the whole bug report collection, which indicates a class imbalance problem. We therefore propose imbalanced ML.KNN (Im-ML.KNN), which extends ML.KNN, one of the state-of-the-art multi-label learning algorithms, to achieve better performance. Im-ML.KNN is a composite model that combines 3 multi-label classifiers built using different types of features (i.e., meta, textual, and mixed features). We evaluate our solution on 4 large bug report datasets, from OpenOffice, NetBeans, Eclipse, and Mozilla, containing a total of 190,558 bug reports. We show that Im-ML.KNN can achieve an average F-measure score of 0.56-0.62. We also compare Im-ML.KNN with other state-of-the-art methods: the method proposed by Lamkanfi et al., ML.KNN, and HOMER-NB. The results show that Im-ML.KNN, on average, improves the average F-measure scores of Lamkanfi et al.'s method, ML.KNN, and HOMER-NB by 119.69%, 9.11%, and 161.08%, respectively.
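To make the idea of a composite multi-label predictor over meta, textual, and mixed features more concrete, here is a minimal Python sketch. It is not the authors' Im-ML.KNN and not the Bayesian ML.KNN procedure: it swaps in a plain nearest-neighbor label vote, averages the per-label scores of three feature views, and uses a lower threshold for a rare label as one crude, assumed way to react to class imbalance. The function names, the toy feature vectors, the two labels, and the threshold values are all hypothetical.

```python
from math import dist

def knn_label_scores(train_X, train_Y, x, k=3):
    """Fraction of the k nearest training reports that carry each label."""
    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], x))[:k]
    n_labels = len(train_Y[0])
    return [sum(train_Y[i][l] for i in nearest) / k for l in range(n_labels)]

def composite_predict(views, train_Y, query_views, k=3, thresholds=None):
    """Average per-label scores over all feature views, then threshold each label.

    Assumption: simple averaging and fixed thresholds stand in for whatever
    combination and imbalance handling the real model uses.
    """
    n_labels = len(train_Y[0])
    thresholds = thresholds or [0.5] * n_labels
    per_view = [knn_label_scores(X, train_Y, q, k)
                for X, q in zip(views, query_views)]
    avg = [sum(s[l] for s in per_view) / len(per_view) for l in range(n_labels)]
    return [int(avg[l] >= thresholds[l]) for l in range(n_labels)]

# Hypothetical toy data: six bug reports, three feature views each.
meta_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]      # e.g. encoded fields
text_X = [(0.1,), (0.2,), (0.0,), (0.9,), (1.0,), (0.8,)]      # e.g. a text score
mixed_X = [m + t for m, t in zip(meta_X, text_X)]               # both concatenated
views = [meta_X, text_X, mixed_X]

# Labels per report: [component reassigned, severity reassigned].
# The severity label is rare (one positive out of six).
Y = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 0], [0, 0]]

query = [(5, 5), (0.9,), (5, 5, 0.9)]
print(composite_predict(views, Y, query))                         # default thresholds
print(composite_predict(views, Y, query, thresholds=[0.5, 0.3]))  # lower bar for rare label
```

With the default threshold of 0.5 the rare severity label is never predicted for this query, because only one of its three nearest neighbors carries it; lowering that label's threshold lets the minority label surface. This only illustrates why class imbalance matters in a neighbor-based multi-label setting.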

Bjenet: a Fast and Accurate Software Bug Localization Method in Natural Language Semantic Space

Cross-language Bug Localization

Just-In-Time Defect Identification and Localization: A Two-Phase Framework

Exploiting Code Knowledge Graph for Bug Localization Via Bi-directional Attention

BLESER: Bug Localization Based on Enhanced Semantic Retrieval

Automated Configuration Bug Report Prediction Using Text Mining

Software bug localization based on optimized and ensembled deep learning models

Automated Bug Report Field Reassignment and Refinement Prediction

Augmenting Bug Localization with Part-of-Speech and Invocation

Pre-training Code Representation with Semantic Flow Graph for Effective Bug Localization

Learning Unified Features from Natural and Programming Languages for Locating Buggy Source Code

Control Flow Graph Embedding Based on Multi-Instance Decomposition for Bug Localization

Aligning Programming Language and Natural Language: Exploring Design Choices in Multi-Modal Transformer-Based Embedding for Bug Localization

Version History, Similar Report, and Structure: Putting Them Together for Improved Bug Localization

A deep multimodal model for bug localization

BugListener: Identifying and Synthesizing Bug Reports from Collaborative Live Chats

Watch out for Version Mismatching and Data Leakage! A Case Study of Their Influence in Bug Report Based Bug Localization Models

Towards A Novel Approach for Defect Localization Based on Part-of-Speech and Invocation

Bug Localization Via Supervised Topic Modeling

A Comparative Study of Transformer-based Neural Text Representation Techniques on Bug Triaging

BLAZE: Cross-Language and Cross-Project Bug Localization via Dynamic Chunking and Hard Example Learning