A Machine Learning-Based Static Analysis Warning Prioritization.

Mingshuang Qing, Xiang Feng, Jun Luo, Wanmin Huang, Jingui Zhang, Ping Wang, Yong Fan, Xiuting Ge, Ya Pan
DOI: https://doi.org/10.1109/qrs-c55045.2021.00103
2021-01-01
Abstract: Static analysis tools (SATs) can automatically detect software defects by analyzing software source code. However, the warnings generated by SATs contain a large number of false positives, which hinders the usability of these tools. To address this problem, this paper proposes an approach to constructing a machine learning-based warning prioritization model. Our approach explores 91 warning features and obtains an optimal feature set of 25 features. Subsequently, a machine learning model is trained on the optimal feature set to produce a warning prioritization model. To evaluate the warning prioritization model, we conduct an experimental evaluation on a real, open-source, large-scale warning dataset. The experimental results show that our approach outperforms the random ranking algorithm. On average, compared with the random ranking algorithm, our approach reduces the number of false positives that developers must check in the warning list to find each true defect by 30.56% – 48.18%.
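The evaluation idea in the abstract (how many false positives a developer checks before reaching each true defect, under a learned ranking versus a random one) can be illustrated with a minimal sketch. This is one possible reading of the metric, not the authors' implementation: the toy warnings, their `score` values, the helper names, and the number of random trials are all hypothetical.

```python
import random
from statistics import mean


def false_positives_before_each_defect(ranked_labels):
    """Given labels in ranked order (True = true defect, False = false positive),
    return, for each true defect, the number of false positives ranked above it."""
    counts = []
    fp_seen = 0
    for is_defect in ranked_labels:
        if is_defect:
            counts.append(fp_seen)
        else:
            fp_seen += 1
    return counts


def rank_by_score(warnings):
    """Order warnings from highest to lowest model score (assumed to come
    from some trained classifier; the scores below are made up)."""
    return sorted(warnings, key=lambda w: w["score"], reverse=True)


def random_baseline(labels, trials=2000, seed=0):
    """Average false positives checked per defect over many random orderings."""
    rng = random.Random(seed)
    per_trial = []
    for _ in range(trials):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        per_trial.append(mean(false_positives_before_each_defect(shuffled)))
    return mean(per_trial)


# Hypothetical toy warning list: three true defects, three false positives.
warnings = [
    {"id": "w1", "score": 0.91, "is_defect": True},
    {"id": "w2", "score": 0.15, "is_defect": False},
    {"id": "w3", "score": 0.78, "is_defect": True},
    {"id": "w4", "score": 0.40, "is_defect": False},
    {"id": "w5", "score": 0.66, "is_defect": False},
    {"id": "w6", "score": 0.52, "is_defect": True},
]

ranked = rank_by_score(warnings)
model_counts = false_positives_before_each_defect([w["is_defect"] for w in ranked])
model_avg = mean(model_counts)
random_avg = random_baseline([w["is_defect"] for w in warnings])

print("false positives before each defect (model ranking):", model_counts)
print("average per defect, model ranking:", round(model_avg, 2))
print("average per defect, random ranking:", round(random_avg, 2))
```

In this toy example the scored ranking places most true defects ahead of the false positives, so the average number of false positives checked per defect is lower than under random ordering. The figures it prints are illustrative only and are unrelated to the percentages reported in the abstract.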