ACWRecommender: A Tool for Validating Actionable Warnings with Weak Supervision

Zhipeng Xue, Zhipeng Gao, Xing Hu, Shanping Li
DOI: https://doi.org/10.48550/arXiv.2309.09721
2023-09-18
Abstract: Static analysis tools have gained popularity among developers for finding potential bugs, but their widespread adoption is hindered by the accompanying high false alarm rates (up to 90%). To address this challenge, previous studies proposed the concept of actionable warnings and applied machine-learning methods to distinguish actionable warnings from false alarms. Despite these efforts, our preliminary study suggests that the current methods used to collect actionable warnings are unreliable, resulting in a large proportion of invalid actionable warnings. In this work, we mined 68,274 reversions from Top-500 GitHub C repositories to create a substantial actionable warning dataset and assigned weak labels to each warning's likelihood of being a real bug. To automatically identify actionable warnings and recommend those with a high probability of being real bugs (AWHB), we propose a two-stage framework called ACWRecommender. In the first stage, our tool uses a pre-trained model, i.e., UniXcoder, to identify actionable warnings among the large number of warnings reported by static analysis (SA) tools. In the second stage, we rerank valid actionable warnings to the top by using weakly supervised learning. Experimental results showed that our tool outperformed several baselines for actionable warning detection (in terms of F1-score) and performed better for AWHB recommendation (in terms of nDCG and MRR). Additionally, we performed an in-the-wild evaluation: we manually validated 24 warnings out of 2,197 reported warnings on 10 randomly selected projects, 22 of which were confirmed by developers as real bugs, demonstrating the practical usage of our tool.
Software Engineering
What problem does this paper attempt to address?
The paper aims to address the high false positive rate of warnings generated by static analysis tools and proposes a method to automatically identify and recommend actionable warnings with a high probability of being real bugs (AWHB). Specifically, the paper approaches the problem through the following points:

1. **Dataset Construction**: A large dataset was created by collecting data from the top 500 C projects on GitHub, including 538 actionable warnings and 30,590 false positives.
2. **Weak Supervision Labels**: Because many collected actionable warnings turn out to be invalid, the paper introduces a weak supervision mechanism that assigns a score to each actionable warning, indicating its likelihood of being a real defect, based on semantic matching rules and structural matching rules.
3. **Two-Stage Framework**:
   - **Coarse-Grained Detection Stage**: Utilizes the pre-trained model UniXcoder to identify which warnings are actionable.
   - **Fine-Grained Re-ranking Stage**: Further fine-tunes the model using weakly supervised learning so that actionable warnings with a high probability of being real defects (AWHB) are ranked first for developers.
4. **Experimental Evaluation**: The proposed method's effectiveness in identifying actionable warnings and recommending AWHBs was validated through quantitative analysis, and its practicality in reducing developers' workload when looking for real defects was demonstrated through real-world scenario evaluations.
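
The filter-then-rerank flow and the two ranking metrics named in the abstract (MRR and nDCG) can be illustrated with a minimal Python sketch. This is not the authors' implementation: `two_stage_recommend`, the `is_actionable` and `bug_score` callables, and the `0.5` threshold are hypothetical stand-ins for the UniXcoder-based classifier and the weakly supervised reranker, and the metrics use one common definition (binary or graded relevance with a log2 discount), which may differ in detail from the paper's evaluation setup.

```python
import math
from typing import Callable, Dict, List, Sequence

Warning = Dict[str, float]  # hypothetical warning record, for illustration only


def two_stage_recommend(
    warnings: Sequence[Warning],
    is_actionable: Callable[[Warning], float],  # stage 1 score (assumed in [0, 1])
    bug_score: Callable[[Warning], float],      # stage 2 score (higher = more likely a real bug)
    threshold: float = 0.5,                     # assumed cut-off, not taken from the paper
) -> List[Warning]:
    """Keep warnings classified as actionable, then sort them by bug likelihood."""
    kept = [w for w in warnings if is_actionable(w) >= threshold]
    return sorted(kept, key=bug_score, reverse=True)


def reciprocal_rank(relevance: Sequence[float]) -> float:
    """1 / rank of the first relevant item in a ranked list, or 0 if none is relevant."""
    for rank, rel in enumerate(relevance, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(ranked_lists: Sequence[Sequence[float]]) -> float:
    """Average reciprocal rank over several ranked lists."""
    if not ranked_lists:
        return 0.0
    return sum(reciprocal_rank(r) for r in ranked_lists) / len(ranked_lists)


def dcg(relevance: Sequence[float]) -> float:
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(relevance, start=1))


def ndcg(relevance: Sequence[float]) -> float:
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevance, reverse=True))
    return dcg(relevance) / ideal if ideal > 0 else 0.0
```

In this sketch, a ranked list is scored by passing the ground-truth relevance of each recommended warning, in recommended order, to `ndcg` or `reciprocal_rank`; a real bug appearing lower in the list lowers both scores.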