LProtector: An LLM-driven Vulnerability Detection System

Ze Sheng, Fenghua Wu, Xiangwu Zuo, Chao Li, Yuxin Qiao
2024-11-10
Abstract: This paper presents LProtector, an automated vulnerability detection system for C/C++ codebases driven by the large language model (LLM) GPT-4o and Retrieval-Augmented Generation (RAG). As software complexity grows, traditional methods face challenges in detecting vulnerabilities effectively. LProtector leverages GPT-4o's powerful code comprehension and generation capabilities to perform binary classification and identify vulnerabilities within target codebases. We conducted experiments on the Big-Vul dataset, showing that LProtector outperforms two state-of-the-art baselines in terms of F1 score, demonstrating the potential of integrating LLMs with vulnerability detection.
Cryptography and Security, Artificial Intelligence
What problem does this paper attempt to address?
This paper addresses the security of large-scale software systems and frameworks, a problem that has grown more serious as technology has advanced. As software complexity increases, vulnerability detection becomes more challenging. Traditional machine-learning methods have long been applied in network security, but they have not produced a significant breakthrough. The rise of large language models (LLMs) appears to mark a turning point: their strong code understanding and generation capabilities make a fully automatic vulnerability-detection system feasible. Specifically, the paper introduces LProtector, an automatic vulnerability-detection system for C/C++ codebases based on GPT-4o and retrieval-augmented generation (RAG). LProtector identifies vulnerabilities in the target codebase through binary classification. To evaluate its effectiveness, the researchers ran experiments on the Big-Vul dataset. The results show that LProtector outperforms two state-of-the-art baseline methods in F1 score, demonstrating the potential of combining LLMs with vulnerability detection.
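
Because the comparison above rests on the F1 score of a binary classifier, a minimal sketch of how that metric is computed may help. This is not the authors' evaluation code: the function name `binary_f1` and the label convention (1 = vulnerable, 0 = not vulnerable) are assumptions made for illustration, and the sample labels are invented, not taken from Big-Vul.

```python
from typing import Sequence


def binary_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 score for binary labels, where 1 = vulnerable (assumed convention)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # With no true positives, precision and recall are zero or undefined;
    # report 0.0 in that case.
    if tp == 0:
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Illustrative labels only (hypothetical, not experimental data).
example_true = [1, 1, 0, 0, 1]
example_pred = [1, 0, 0, 1, 1]
example_f1 = binary_f1(example_true, example_pred)
```

F1 balances precision against recall, which is why it is a common choice for vulnerability detection, where vulnerable samples are typically a small minority and plain accuracy can be misleading.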