Effective Vulnerable Function Identification Based on CVE Description Empowered by Large Language Models

Yulun Wu, Ming Wen, Zeliang Yu, Xiaochen Guo, Hai Jin
DOI: https://doi.org/10.1145/3691620.3695013
2024-01-01
Abstract: Open-source software (OSS) has profoundly transformed the software development paradigm by facilitating effortless code reuse. However, recent years have seen an alarming increase in disclosed vulnerabilities within OSS, posing significant security risks to downstream users. Analyzing existing vulnerabilities and precisely assessing their threats to downstream applications have therefore become pivotal. Considerable effort has recently been devoted to this problem, such as vulnerability reachability analysis and vulnerability reproduction. The key to these tasks is identifying the vulnerable function (i.e., the function where the root cause of a vulnerability resides). However, public vulnerability datasets (e.g., NVD) rarely include this information, as pinpointing the exact vulnerable functions remains a longstanding challenge. Existing methods mainly detect vulnerable functions based on vulnerability patches or Proof-of-Concept (PoC) exploits. Such methods face significant limitations due to data availability and the need for extensive manual effort, which hinders scalability. To address this issue, we propose VFFinder, a novel approach that localizes vulnerable functions based on Common Vulnerabilities and Exposures (CVE) descriptions and the corresponding source code, utilizing Large Language Models (LLMs). Specifically, VFFinder adopts a customized in-context learning (ICL) approach based on CVE description patterns to enable the LLM to extract key entities. It then performs priority matching against the source code to localize vulnerable functions. We assess the performance of VFFinder on 75 large open-source projects. The results demonstrate that VFFinder significantly surpasses existing baselines. Notably, the Top-1 and MRR metrics improve substantially, by 4.25X and 2.37X on average, respectively. We also integrate VFFinder with Software Composition Analysis (SCA) tools, and the results show that our tool can significantly reduce the false positive rates of existing SCA tools.
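For readers unfamiliar with the two ranking metrics named in the abstract, the following is a minimal sketch of how Top-k accuracy and Mean Reciprocal Rank (MRR) are conventionally computed. It uses the standard textbook definitions and is not taken from the paper's evaluation scripts; the function names and the example ranks are hypothetical and serve only as illustration.

```python
from typing import Optional, Sequence


def top_k_accuracy(ranks: Sequence[Optional[int]], k: int = 1) -> float:
    """Fraction of cases whose true vulnerable function is ranked within the top k.

    Each entry in `ranks` is the 1-based position of the true vulnerable
    function in a tool's ranked output, or None if it was not returned at all.
    """
    if not ranks:
        return 0.0
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return hits / len(ranks)


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """Average of 1/rank over all cases; a miss (None) contributes 0."""
    if not ranks:
        return 0.0
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


# Hypothetical ranks for four CVEs (illustrative only, not data from the paper):
# found at rank 1, rank 3, not found, and rank 2.
example_ranks = [1, 3, None, 2]
top1 = top_k_accuracy(example_ranks, k=1)
mrr = mean_reciprocal_rank(example_ranks)
print(f"Top-1 = {top1:.2f}, MRR = {mrr:.3f}")
```

Under these definitions, Top-1 rewards only exact first-place hits, while MRR also gives partial credit when the vulnerable function appears lower in the ranking.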