API Misuse Detection Via Probabilistic Graphical Model

Yunlong Ma,Wentong Tian,Xiang Gao,Hailong Sun,Li
DOI: https://doi.org/10.1145/3650212.3652112
2024-01-01
Abstract:API misuses can cause a range of issues in software development, including program crashes, bugs, and vulnerabilities. Different approaches have been developed to automatically detect API misuses by checking the program against usage rules extracted from extensive codebase or API documents. However, these mined rules may not be precise or complete, leading to high false positive/negative rates. In this paper, we propose a novel solution to this problem by representing the mined API usage rules as a probabilistic graphical model, where each rule's probability value represents its trustworthiness of being correct. Our approach automatically constructs probabilistic usage rules by mining codebase and documents, and aggregating knowledge from different sources. Here, the usage rules obtained from the codebase initialize the probabilistic model, while the knowledge from the documents serves as a supplement for adjusting and complementing the probabilities accordingly. We evaluate our approach on the MuBench benchmark. Experimental results show that our approach achieves 42.0% precision and 54.5% recall, significantly outperforming state-of-the-art approaches.
What problem does this paper attempt to address?