GPTScan: Detecting Logic Vulnerabilities in Smart Contracts by Combining GPT with Program Analysis

Yuqiang Sun, Daoyuan Wu, Yue Xue, Han Liu, Haijun Wang, Zhengzi Xu, Xiaofei Xie, Yang Liu
DOI: https://doi.org/10.1145/3597503.3639117
2024-05-06
Abstract: Smart contracts are prone to various vulnerabilities, leading to substantial financial losses over time. Current analysis tools mainly target vulnerabilities with fixed control or data-flow patterns, such as re-entrancy and integer overflow. However, a recent study on Web3 security bugs revealed that about 80% of these bugs cannot be audited by existing tools due to the lack of domain-specific property description and checking. Given recent advances in Large Language Models (LLMs), it is worth exploring how Generative Pre-training Transformer (GPT) could aid in detecting logic vulnerabilities. In this paper, we propose GPTScan, the first tool combining GPT with static analysis for smart contract logic vulnerability detection. Instead of relying solely on GPT to identify vulnerabilities, which can lead to high false positives and is limited by GPT's pre-trained knowledge, we utilize GPT as a versatile code understanding tool. By breaking down each logic vulnerability type into scenarios and properties, GPTScan matches candidate vulnerabilities with GPT. To enhance accuracy, GPTScan further instructs GPT to intelligently recognize key variables and statements, which are then validated by static confirmation. Evaluation on diverse datasets with around 400 contract projects and 3K Solidity files shows that GPTScan achieves high precision (over 90%) for token contracts and acceptable precision (57.14%) for large projects like Web3Bugs. It effectively detects ground-truth logic vulnerabilities with a recall of over 70%, including 9 new vulnerabilities missed by human auditors. GPTScan is fast and cost-effective, taking an average of 14.39 seconds and 0.01 USD to scan per thousand lines of Solidity code. Moreover, static confirmation helps GPTScan reduce two-thirds of false positives.
Cryptography and Security, Artificial Intelligence, Software Engineering
What problem does this paper attempt to address?
This paper addresses the detection of logic vulnerabilities in smart contracts. Existing analysis tools mainly target vulnerabilities with fixed control-flow or data-flow patterns, such as re-entrancy and integer overflow. According to a recent study, about 80% of Web3 security bugs cannot be audited by these tools, because the bugs are tied to a contract's business logic and the tools lack domain-specific property description and checking.

### Problem Background

Smart contracts are a core component of decentralized finance (DeFi), and their security vulnerabilities have already caused billions of dollars in financial losses. Current security analysis tools detect some common vulnerability types but fall short on complex vulnerabilities that involve business logic. This threatens DeFi service providers and puts the wider DeFi ecosystem and user assets at risk.

### Limitations of Existing Tools

Existing static and dynamic analysis tools (such as Slither) are not effective at detecting logic vulnerabilities in smart contracts. They do not understand a contract's underlying business logic, and they neither model what functions do nor consider the roles of individual variables and functions. This is the main reason that about 80% of Web3 security bugs are out of their reach: the tools cannot describe or check domain-specific properties.

### Solutions Proposed in the Paper

To address these problems, the paper proposes GPTScan, a tool that combines the Generative Pre-trained Transformer (GPT) with static analysis to detect logic vulnerabilities in smart contracts. The main contributions of GPTScan include:

1. **Combining GPT with Static Analysis**: GPT serves as a general code-understanding tool that identifies candidate vulnerabilities by matching scenarios and properties, while static analysis confirms the key variables and statements GPT reports, improving detection accuracy.
2. **Multi-dimensional Filtering**: A multi-dimensional filtering step narrows the set of candidate functions, so GPTScan does not feed every file directly to GPT, which reduces cost and improves efficiency.
3. **Efficient and Economical**: GPTScan uses the GPT-3.5-turbo model, which costs 20 times less than the more advanced GPT-4 model. On average it takes 14.39 seconds and 0.01 USD to scan one thousand lines of Solidity code.

### Summary

The paper targets hard-to-detect logic vulnerabilities in smart contracts and proposes a method that combines GPT with static analysis. In the reported evaluation, GPTScan reaches over 90% precision on token contracts, 57.14% precision on large projects such as Web3Bugs, and a recall of over 70% on ground-truth logic vulnerabilities, including 9 new vulnerabilities missed by human auditors. Static confirmation removes about two-thirds of the false positives.
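
The three stages described above (filtering, scenario and property matching, static confirmation) can be illustrated with a minimal Python sketch. This is not GPTScan's implementation: the `Function` and `Rule` types, the toy "transfer without allowance check" rule, the keyword filter, and `stub_model` (a stand-in for a real GPT call) are all hypothetical and chosen only to show how the stages might chain together.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Function:
    """Hypothetical, simplified view of a Solidity function."""
    name: str
    visibility: str
    body: str


@dataclass
class Rule:
    """Hypothetical encoding of one logic-vulnerability type."""
    name: str
    keywords: List[str]                  # cheap pre-filter terms
    scenario: str                        # question about what the code does
    prop: str                            # question about the vulnerable property
    confirm: Callable[[Function], bool]  # static check on the reported code


def filter_candidates(functions: List[Function], rule: Rule) -> List[Function]:
    """Stage 1: keep externally reachable functions that mention a keyword."""
    return [
        f for f in functions
        if f.visibility in ("public", "external")
        and any(k in f.name.lower() or k in f.body.lower() for k in rule.keywords)
    ]


def scan(functions: List[Function], rule: Rule,
         ask_model: Callable[[str, str], bool]) -> List[str]:
    """Run filtering, model matching, then static confirmation."""
    findings = []
    for f in filter_candidates(functions, rule):
        # Stage 2: the model must agree on both scenario and property.
        if not ask_model(rule.scenario, f.body):
            continue
        if not ask_model(rule.prop, f.body):
            continue
        # Stage 3: a static check must confirm the match before reporting.
        if rule.confirm(f):
            findings.append(f.name)
    return findings


def stub_model(question: str, code: str) -> bool:
    """Stand-in for a GPT call (assumption): 'yes' if the code edits balances."""
    return "balances[" in code


RULE = Rule(
    name="transfer without allowance check (toy example)",
    keywords=["transfer"],
    scenario="Does this function move tokens on behalf of another account?",
    prop="Does it do so without checking the caller's allowance?",
    confirm=lambda f: "allowance" not in f.body,
)

FUNCTIONS = [
    Function("transferFrom", "public",
             "balances[from] -= amt; balances[to] += amt;"),
    Function("safeTransferFrom", "public",
             "require(allowance[from][msg.sender] >= amt); "
             "balances[from] -= amt; balances[to] += amt;"),
    Function("_transfer", "internal", "balances[from] -= amt;"),
    Function("name", "public", "return _name;"),
]

print(scan(FUNCTIONS, RULE, stub_model))
```

In this toy run, `_transfer` and `name` are dropped by the filter before any model call is made, and `safeTransferFrom` passes the stub model but is rejected by the static check. That last case mirrors, in miniature, the role the paper assigns to static confirmation in reducing false positives.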