Context-based statement-level vulnerability localization
Thu-Trang Nguyen, Hieu Dinh Vo
DOI: https://doi.org/10.2139/ssrn.4459266
IF: 3.9
2024-01-22
Information and Software Technology
Abstract: Context: The number of attacks exploiting software vulnerabilities has increased dramatically, causing severe damage. Detecting vulnerabilities early and accurately is therefore essential to guarantee software quality and protect systems from malicious attacks. Multiple automated vulnerability detection approaches have been proposed and have obtained promising results. However, most studies detect vulnerabilities at a coarse-grained level, i.e., the file or method level, so developers still have to spend significant effort localizing the vulnerable statements. Objective: In this paper, we introduce COSTA, a novel context-based approach to localize vulnerable statements. Method: Given a vulnerable function, COSTA identifies vulnerable statements based on their suspiciousness scores. Specifically, the suspiciousness of each statement is measured according to its semantics as captured by four contexts: operation context, dependence context, surrounding context, and vulnerability type. Results: Our experimental results on a large vulnerability dataset show that COSTA outperforms the state-of-the-art approaches by up to 96% in F1-score and 167% in Accuracy. COSTA also surpasses these approaches by up to two times in Top-1 Accuracy. Notably, COSTA achieves about 80% Top-3 Recall. In other words, developers can find about 80% of the vulnerable statements by investigating only the three highest-ranked statements in each function. Conclusion: COSTA effectively addresses the challenge of statement-level vulnerability localization by leveraging multiple contextual features, and our experimental results show that it outperforms existing state-of-the-art approaches. With the ability to identify vulnerable statements accurately and efficiently, developers can better allocate their investigation effort, reduce the risk of potential security threats, and ensure software quality and security in real-world applications.
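As a rough illustration of the Top-3 Recall figure, the sketch below ranks the statements of each function by suspiciousness score and measures what fraction of the known vulnerable statements fall within the top k. It is not COSTA's implementation: the function name `top_k_recall`, the dictionary-based input layout, and the exact metric definition (vulnerable statements found in the top k, divided by all vulnerable statements) are assumptions made for this example, and the scores are invented.

```python
from typing import Dict, Set


def top_k_recall(
    scores_per_function: Dict[str, Dict[int, float]],
    vulnerable_per_function: Dict[str, Set[int]],
    k: int = 3,
) -> float:
    """Fraction of vulnerable statements ranked within the top k of their function.

    Assumed layout (hypothetical, for illustration only):
      scores_per_function[func][line]  -> suspiciousness score of that statement
      vulnerable_per_function[func]    -> set of truly vulnerable statement lines
    """
    found = 0
    total = 0
    for func, scores in scores_per_function.items():
        vulnerable = vulnerable_per_function.get(func, set())
        # Rank statements from most to least suspicious.
        ranked = sorted(scores, key=scores.get, reverse=True)
        top_k = set(ranked[:k])
        found += len(top_k & vulnerable)
        total += len(vulnerable)
    return found / total if total else 0.0


# Made-up scores for two functions; statement ids are line numbers.
scores = {
    "parse_header": {10: 0.91, 11: 0.15, 12: 0.78, 13: 0.66, 14: 0.20},
    "copy_buffer": {30: 0.05, 31: 0.88, 32: 0.40},
}
vulnerable = {
    "parse_header": {12, 14},  # line 14 is ranked outside the top 3
    "copy_buffer": {31},
}
recall_at_3 = top_k_recall(scores, vulnerable, k=3)
```

With these made-up inputs, two of the three vulnerable statements appear among the three highest-ranked statements of their functions, so a developer inspecting only those would find two thirds of them.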
computer science, information systems, software engineering