Abstract: Based on our ongoing research, this paper provides a systematic analysis of the opportunities, challenges, and potential solutions of harnessing Large Language Models (LLMs) such as GPT-4 to uncover vulnerabilities in smart contracts. For smart contract vulnerability detection, practical usability hinges on identifying as many true vulnerabilities as possible while minimizing the number of false positives. Our empirical study, however, reveals a tension between these two goals: generating more answers with higher randomness largely boosts the likelihood of producing a correct answer, but inevitably leads to a higher number of false positives. To mitigate this tension, we propose an adversarial framework dubbed GPTLens that breaks the conventional one-stage detection into two synergistic stages, generation and discrimination, for progressive detection and refinement, wherein the LLM plays dual roles, i.e., auditor and critic, respectively. The goal of the auditor is to yield a broad spectrum of vulnerabilities in the hope of encompassing the correct answer, whereas the goal of the critic, which evaluates the validity of the identified vulnerabilities, is to minimize the number of false positives. Experimental results and illustrative examples demonstrate that the auditor and critic work together to yield pronounced improvements over the conventional one-stage detection. GPTLens is intuitive, strategic, and entirely LLM-driven without relying on specialist expertise in smart contracts, showcasing its methodological generality and potential to detect a broad spectrum of vulnerabilities. Our code is available at: https://github.com/git-disl/GPTLens.

What problem does this paper attempt to address?

The paper aims to address several key issues in smart contract vulnerability detection:

1. **Improving Detection Accuracy**: Existing smart contract auditing tools often rely on fixed pattern detectors designed by experts, which limits their ability to detect unknown or unclassified vulnerabilities. Large Language Models (LLMs) can describe any type of vulnerability and can therefore detect a wider range of vulnerabilities.
2. **Reducing False Positives**: When prompted to produce a large number of candidate vulnerabilities, current LLMs generate many false positives, which creates a significant amount of manual verification work. The paper proposes a new framework, GPTLens, which mitigates this issue by decomposing the traditional single-stage detection into two stages: generation and discrimination. In the generation stage, multiple auditors generate possible vulnerabilities together with their reasoning. In the discrimination stage, critics evaluate the validity of these vulnerabilities, thereby reducing false positives.
3. **Enhancing Interpretability and Generality**: LLMs can not only detect vulnerabilities but also provide their intermediate reasoning, improving transparency and credibility. In addition, compared with detection restricted to predefined vulnerability categories, the open-ended prompting approach allows LLMs to identify a broader range of vulnerability types.

The main contribution of the paper is a two-stage adversarial framework, GPTLens, which improves the accuracy and practicality of smart contract vulnerability detection by separating the generation and discrimination processes, while also reducing reliance on domain-specific knowledge. Experimental results show that, compared to traditional single-stage detection methods, GPTLens improves performance in detecting real vulnerabilities.
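The generation and discrimination stages described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Finding` type, the `detect` function, the `top_k` parameter, the scoring scale, and the stub auditors and critic are all hypothetical names introduced here. In practice each auditor and the critic would wrap an LLM call; they are replaced by plain functions so the control flow can be run without any API access.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Finding:
    """One candidate vulnerability (hypothetical structure)."""
    function_name: str
    vulnerability: str
    reasoning: str
    score: float = 0.0  # filled in by the critic


# Hypothetical interfaces: an auditor maps source code to candidate findings,
# a critic maps (source code, finding) to a validity score.
Auditor = Callable[[str], List[Finding]]
Critic = Callable[[str, Finding], float]


def detect(source: str, auditors: List[Auditor], critic: Critic,
           top_k: int = 3) -> List[Finding]:
    # Generation stage: pool candidates from several auditors to raise the
    # chance that the true vulnerability is among them.
    candidates: List[Finding] = []
    for auditor in auditors:
        candidates.extend(auditor(source))

    # Discrimination stage: the critic scores every candidate, and only the
    # highest-scoring ones are kept to cut down false positives.
    for finding in candidates:
        finding.score = critic(source, finding)
    candidates.sort(key=lambda f: f.score, reverse=True)
    return candidates[:top_k]


# Stubs standing in for LLM calls (assumptions for illustration only).
def stub_auditor_a(source: str) -> List[Finding]:
    return [
        Finding("withdraw", "reentrancy", "external call before balance update"),
        Finding("deposit", "integer overflow", "unchecked addition"),
    ]


def stub_auditor_b(source: str) -> List[Finding]:
    return [
        Finding("withdraw", "reentrancy", "state updated after the call"),
        Finding("setOwner", "access control", "missing ownership check"),
    ]


def stub_critic(source: str, finding: Finding) -> float:
    # A real critic would ask an LLM to judge the reasoning; this stub
    # simply favors one vulnerability type.
    return 9.0 if finding.vulnerability == "reentrancy" else 2.0


if __name__ == "__main__":
    kept = detect("contract Bank { /* ... */ }",
                  [stub_auditor_a, stub_auditor_b], stub_critic, top_k=2)
    for f in kept:
        print(f.function_name, f.vulnerability, f.score)
```

Keeping the auditor and critic behind separate callables mirrors the separation of roles in the text: the auditors can be run with settings that favor breadth, while the critic is applied once to the pooled output.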

Large Language Model-Powered Smart Contract Vulnerability Detection: New Perspectives

Smart Contract Vulnerability Detection: The Role of Large Language Model (LLM)

Detect Llama -- Finding Vulnerabilities in Smart Contracts using Large Language Models

Detection Made Easy: Potentials of Large Language Models for Solidity Vulnerabilities

SmartLLMSentry: A Comprehensive LLM Based Smart Contract Vulnerability Detection Framework

Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities

Do you still need a manual smart contract audit?

Retrieval Augmented Generation Integrated Large Language Models in Smart Contract Vulnerability Detection

Large Language Model for Vulnerability Detection: Emerging Results and Future Directions

GPTScan: Detecting Logic Vulnerabilities in Smart Contracts by Combining GPT with Program Analysis

Can Large Language Models Find And Fix Vulnerable Software?

When ChatGPT Meets Smart Contract Vulnerability Detection: How Far Are We?

Automated Smart Contract Vulnerability Detection using Fine-tuned Large Language Models

Smart-LLaMA: Two-Stage Post-Training of Large Language Models for Smart Contract Vulnerability Detection and Explanation

VDDL: A Deep Learning-Based Vulnerability Detection Model for Smart Contracts

Large Language Models for Secure Code Assessment: A Multi-Language Empirical Study

LLM-SmartAudit: Advanced Smart Contract Vulnerability Detection

LProtector: An LLM-driven Vulnerability Detection System

Harnessing Large Language Models for Software Vulnerability Detection: A Comprehensive Benchmarking Study