Foliage: Nourishing Evolving Software by Characterizing and Clustering Field Bugs

Zhanyao Lei, Yixiong Chen, Mingyuan Xia, Zhengwei Qi
DOI: https://doi.org/10.1145/3650212.3680363
2024-01-01
Abstract: Modern programs, characterized by their complex functionalities, high integration, and rapid iteration cycles, are prone to errors. This complexity poses challenges in program analysis and software testing, making it difficult to achieve comprehensive bug coverage during the development phase. As a result, many bugs are only discovered during the software’s production phase. Tracking and understanding these field bugs is essential but challenging: the uploaded field error reports are extensive, and trivial yet high-frequency bugs can overshadow important low-frequency bugs. Additionally, application codebases evolve rapidly, causing a single bug to produce varied exceptions and stack traces across different code releases. In this paper, we introduce Foliage, a bug tracking and clustering toolchain designed to trace and characterize field bugs in JavaScript applications, aiding developers in locating and fixing these bugs. To address the challenges of efficiently tracking and analyzing the dynamic and complex nature of software bugs, Foliage proposes an error message enhancement technique. Foliage also introduces the verbal-characteristic-based clustering technique, along with three evaluation metrics for bug clustering: V-measure, cardinality bias, and hit rate. The results show that Foliage’s verbal-characteristic-based bug clustering outperforms previous bug clustering approaches by an average of 31.1% across these three metrics. We present an empirical study of Foliage applied to a complex real-world application over a two-year production period, capturing over 250,000 error reports and clustering them into 132 unique bugs. Finally, we open-source a bug dataset consisting of real and labeled error reports, which can be used to benchmark bug clustering techniques.
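Of the three metrics named in the abstract, V-measure is a standard clustering score: the weighted harmonic mean of homogeneity (each cluster contains reports of only one bug) and completeness (all reports of one bug land in the same cluster). The sketch below shows how it could be computed for a set of labeled error reports. It is an illustration only, not Foliage's implementation: the function names, the sample labels, and the choice of `beta = 1.0` are assumptions, and the abstract does not define cardinality bias or hit rate, so they are not sketched here.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy H(labels), in nats."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """Conditional entropy H(target | given), in nats."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum((c / n) * log(c / given_counts[g]) for (g, _), c in joint.items())


def v_measure(true_bugs, clusters, beta=1.0):
    """V-measure of a clustering against ground-truth bug labels.

    beta = 1.0 weights homogeneity and completeness equally
    (an assumption; the abstract does not state the weighting used).
    """
    h_true = entropy(true_bugs)
    h_clusters = entropy(clusters)
    # By convention each score is 1.0 when the corresponding entropy is zero.
    homogeneity = 1.0 if h_true == 0 else 1.0 - conditional_entropy(true_bugs, clusters) / h_true
    completeness = 1.0 if h_clusters == 0 else 1.0 - conditional_entropy(clusters, true_bugs) / h_clusters
    if homogeneity + completeness == 0:
        return 0.0
    return (1 + beta) * homogeneity * completeness / (beta * homogeneity + completeness)


# Hypothetical example: six error reports, three real bugs.
true_bugs = ["bug-A", "bug-A", "bug-B", "bug-B", "bug-C", "bug-C"]
predicted = [0, 0, 1, 1, 1, 1]  # bug-B and bug-C were merged into one cluster
score = v_measure(true_bugs, predicted)
```

In this hypothetical case the clustering is complete (no bug is split across clusters) but not homogeneous (one cluster mixes two bugs), so the score falls strictly between 0 and 1.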