ViolationTracker: Building Precise Histories for Static Analysis Violations

Ping Yu, Yijian Wu, Xin Peng, Jiahan Peng, Jian Zhang, Peicheng Xie, Wenyun Zhao
DOI: https://doi.org/10.1109/icse48619.2023.00171
2023-01-01
Abstract: Automatic static analysis tools (ASATs) detect source code violations of static analysis rules and are commonly used to guard source code quality. The adoption of ASATs, however, is often hindered by problems such as a large number of false alarms, invalid rule priorities, and inappropriate rule configurations. Research has shown that tracking the history of violations is a promising way to address these problems, because the record of which violations get fixed may reflect the developers' subjective expectations of the violation detection results. Precisely identifying the revisions that induce or fix a violation is, however, challenging because violations are matched imprecisely between code revisions and merge commits in the maintenance history are ignored. In this paper, we propose ViolationTracker, an approach that precisely matches violation instances between adjacent revisions and builds the life cycle of each violation case by identifying when it is induced, fixed, deleted, and reopened. The approach employs code entity anchoring heuristics for violation matching and takes into account merge commits, which existing research has ignored. We evaluate ViolationTracker on a manually validated dataset that consists of 500 violation instances and 158 threads of 30 violation cases with detailed evolution history from open-source projects. ViolationTracker achieves over 93% precision and 98% recall on violation matching, outperforming the state-of-the-art approach, and 99.4% precision on rebuilding the histories of violation cases. We also show that ViolationTracker is useful for identifying actionable violations. A preliminary empirical study suggests that static analysis rules could be prioritized through further analysis of their actionable rates.
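To make the idea of matching violations between adjacent revisions concrete, the following is a minimal illustrative sketch in Python. It is not the authors' algorithm: the `Violation` fields, the anchor definition (rule, file, enclosing code entity), the closest-line tie-break, and the `classify_disappeared` helper are all assumptions introduced here. In particular, the abstract does not define the anchoring heuristics or how "fixed" and "deleted" are told apart, so the sketch simply assumes that a violation counts as deleted when its enclosing entity no longer exists in the child revision. Merge commits and reopening are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Violation:
    """Hypothetical record for one reported violation (fields are assumptions)."""
    rule: str    # static analysis rule id
    file: str    # file path
    entity: str  # enclosing code entity, e.g. a method signature
    line: int    # reported line, which may shift between revisions


def anchor(v):
    # Assumed anchor: ignore the line number so that pure line shifts still match.
    return (v.rule, v.file, v.entity)


def match_violations(parent, child):
    """Pair violations of two adjacent revisions by anchor.

    Returns (matched pairs, violations that disappeared, violations introduced).
    """
    remaining = {}
    for v in child:
        remaining.setdefault(anchor(v), []).append(v)

    matched, disappeared = [], []
    for p in parent:
        candidates = remaining.get(anchor(p), [])
        if candidates:
            # Among same-anchor candidates, prefer the closest line number.
            best = min(candidates, key=lambda c: abs(c.line - p.line))
            candidates.remove(best)
            matched.append((p, best))
        else:
            disappeared.append(p)

    introduced = [v for group in remaining.values() for v in group]
    return matched, disappeared, introduced


def classify_disappeared(v, child_entities):
    """Assumed rule: 'deleted' if the enclosing entity is gone, else 'fixed'."""
    return "fixed" if (v.file, v.entity) in child_entities else "deleted"


# Usage with made-up data: one violation shifts lines, one disappears, one is new.
parent = [
    Violation("UnusedVar", "A.java", "A.foo()", 10),
    Violation("NullDeref", "A.java", "A.bar()", 30),
]
child = [
    Violation("UnusedVar", "A.java", "A.foo()", 14),
    Violation("EmptyCatch", "B.java", "B.run()", 5),
]
matched, disappeared, introduced = match_violations(parent, child)
```

Anchoring on the enclosing entity rather than the raw line number is what lets a violation survive unrelated edits elsewhere in the file; a line-only comparison would report the shifted `UnusedVar` instance as one fix plus one new violation.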