Abstract: In practice, some bugs have more impact than others and thus deserve more immediate attention. Due to tight schedules and limited human resources, developers may not have enough time to inspect all bugs, so they often concentrate on those that are highly impactful. In the literature, the term high-impact bugs refers to bugs that appear at unexpected times or locations and bring more unexpected effects (i.e., surprise bugs), or that break pre-existing functionality and destroy the user experience (i.e., breakage bugs). Unfortunately, identifying high-impact bugs among the thousands of bug reports in a bug tracking system is not an easy feat. An automated technique that identifies high-impact bug reports can therefore help developers become aware of them early, rectify them quickly, and minimize the damage they cause. Because only a small proportion of bugs are high-impact bugs, identifying high-impact bug reports is a difficult, class-imbalanced task. In this paper, we propose an approach to identify high-impact bug reports by leveraging imbalanced learning strategies. We investigate the effectiveness of various variants, each of which combines one particular imbalanced learning strategy with one particular classification algorithm. In particular, we choose four widely used strategies for dealing with imbalanced data and four state-of-the-art text classification algorithms, and conduct experiments on four datasets from four different open source projects. We mainly perform an analytical study on two types of high-impact bugs, i.e., surprise bugs and breakage bugs. The results show that different variants perform differently, and that the best performing variants, SMOTE (synthetic minority over-sampling technique) + KNN (K-nearest neighbours) for surprise bug identification and RUS (random under-sampling) + NB (naive Bayes) for breakage bug identification, achieve higher F1-scores than the two state-of-the-art approaches by Thung et al. and Garcia and Shihab.
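
To make the idea of an imbalanced learning strategy concrete, the following is a minimal sketch of random under-sampling (RUS) together with the F1-score used for comparison. It is an illustration under stated assumptions, not the paper's implementation: the function names `random_under_sample` and `f1_score`, the `seed` parameter, and the toy labels (1 for high-impact, 0 otherwise) are all hypothetical. In the approach described above, the balanced data would then be passed to a classifier such as naive Bayes.

```python
import random


def random_under_sample(reports, labels, seed=0):
    """Randomly drop items from larger classes until every class
    has as many items as the smallest class (illustrative RUS)."""
    rng = random.Random(seed)

    # Group the reports by their class label.
    by_class = {}
    for report, label in zip(reports, labels):
        by_class.setdefault(label, []).append(report)

    # Every class is reduced to the size of the rarest class.
    target = min(len(items) for items in by_class.values())

    sampled = []
    for label, items in by_class.items():
        for report in rng.sample(items, target):
            sampled.append((report, label))

    rng.shuffle(sampled)
    return [r for r, _ in sampled], [l for _, l in sampled]


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy data: 2 high-impact reports (label 1) among 10.
    reports = [f"report {i}" for i in range(10)]
    labels = [1, 1] + [0] * 8
    balanced_reports, balanced_labels = random_under_sample(reports, labels)
    print(len(balanced_reports), sorted(balanced_labels))
```

Under-sampling discards majority-class reports, whereas an over-sampling strategy such as SMOTE instead synthesizes additional minority-class examples; which of the two works better depends on the dataset and the classifier, as the results summarized above indicate.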

Classifying Crowdsourced Mobile Test Reports with Image Features: an Empirical Study

Clustering Crowdsourced Test Reports of Mobile Applications Using Image Understanding

STIFA: Crowdsourced Mobile Testing Report Selection Based on Text and Image Fusion Analysis

Source Cell-phone Identification Based on Multi-feature Fusion

Optimizing Prioritization of Crowdsourced Test Reports of Web Applications through Image-to-Text Conversion

Mobile crowdsourced test report prioritization based on text and image understanding

Prioritize Crowdsourced Test Reports via Deep Screenshot Understanding

Multi-objective Test Report Prioritization Using Image Understanding

Mobile App Crowdsourced Test Report Consistency Detection via Deep Image-and-Text Fusion Understanding

Semi-supervised Crowdsourced Test Report Clustering Via Screenshot-Text Binding Rules

Fuzzy Clustering of Crowdsourced Test Reports for Apps

A Systemic Framework for Crowdsourced Test Report Quality Assessment

Redefining Crowdsourced Test Report Prioritization: An Innovative Approach with Large Language Model

Automated quality assessment for crowdsourced test reports of mobile applications

Automated Quality Assessment for Crowdsourced Test Reports Based on Dependency Parsing

CTRAS: Crowdsourced Test Report Aggregation and Summarization

Crowdsourced Test Case Generation for Android Applications Via Static Program Analysis

Automatic test report augmentation to assist crowdsourced testing

SemCluster: a Semi-Supervised Clustering Tool for Crowdsourced Test Reports with Deep Image Understanding

Successes, Challenges, and Rethinking – an Industrial Investigation on Crowdsourced Mobile Application Testing

High-Impact Bug Report Identification with Imbalanced Learning Strategies