Abstract: As Ethereum's economy has grown, frequent security incidents involving smart contracts running on the platform have caused billions of dollars in losses. Consequently, there is a pressing need to identify vulnerabilities in contracts, yet state-of-the-art (SOTA) detection methods remain limited because they cannot overcome three challenges at the same time: (i) detecting vulnerabilities in the source code, bytecode, and opcode of contracts; (ii) reducing the reliance on manually pre-defined rules/patterns and expert involvement; and (iii) helping contract developers complete the contract lifecycle more safely, e.g., through vulnerability repair and anomaly monitoring. Advances in machine learning (ML) have made it possible to address these challenges by applying ML to contract runtime execution sequences (called instances). However, the lack of datasets with fine-grained sequence labels poses a significant obstacle, given the unreadability of bytecode/opcode. To this end, we propose a method named VulHunter that extracts instances by traversing the Control Flow Graph (CFG) built from contract opcodes. Based on hybrid attention and multi-instance learning mechanisms, VulHunter infers the instance labels and designs an optional classifier to automatically capture the subtle features of both normal and defective contracts, thereby identifying the vulnerable instances. It then uses symbolic execution to construct and solve symbolic constraints that validate the feasibility of these instances. Finally, we implement a prototype of VulHunter with 15K lines of code and compare it with 9 SOTA methods on five open-source datasets comprising 52,042 source-code and 184,289 bytecode samples.
The results indicate that VulHunter detects contract vulnerabilities more accurately (90.04% accuracy and 85.60% F1 score), more efficiently (only 4.4 seconds per contract), and more robustly (0% analysis failure rate) than the SOTA methods. It can also prioritize specific metrics such as precision or recall by employing different baseline models and hyperparameters to meet various user requirements, e.g., vulnerability discovery and misreport mitigation. More importantly, compared with previous ML-based approaches, it not only provides classification results, defective contract source code statements, key opcode fragments, and vulnerable execution paths, but also eliminates misreports and facilitates further operations such as vulnerability repair and attack simulation during the contract lifecycle.
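As a rough illustration of the two ideas named in the abstract, extracting instances by traversing a CFG and aggregating per-instance evidence with attention in a multi-instance setting, the following is a minimal sketch and not VulHunter's implementation. The toy CFG, the function names `extract_instances` and `attention_pool`, the `max_len` cutoff, and the placeholder scores and attention logits are all assumptions made for this example; in a real system the scores and attention weights would come from a trained model rather than a hand-written rule.

```python
from math import exp

# Hypothetical toy CFG: block id -> (opcodes in the block, successor block ids).
CFG = {
    0: (["PUSH1", "CALLDATALOAD", "JUMPI"], [1, 2]),
    1: (["CALL", "SSTORE"], [3]),
    2: (["SLOAD", "JUMP"], [3]),
    3: (["STOP"], []),
}


def extract_instances(cfg, entry=0, max_len=32):
    """Depth-first traversal; each acyclic path that reaches a block with no
    unvisited successors yields one instance (a flat opcode sequence)."""
    instances = []
    stack = [(entry, [entry])]
    while stack:
        block, path = stack.pop()
        successors = [s for s in cfg[block][1] if s not in path]
        if not successors:
            opcodes = [op for b in path for op in cfg[b][0]]
            instances.append(opcodes[:max_len])  # assumed length cutoff
            continue
        for s in reversed(successors):
            stack.append((s, path + [s]))
    return instances


def attention_pool(instance_scores, attention_logits):
    """Softmax attention over instances, then a weighted contract-level score."""
    peak = max(attention_logits)
    weights = [exp(a - peak) for a in attention_logits]
    total = sum(weights)
    weights = [w / total for w in weights]
    bag_score = sum(w * s for w, s in zip(weights, instance_scores))
    return bag_score, weights


instances = extract_instances(CFG)

# Placeholder stand-ins for learned outputs (assumption, not a real detector):
# an instance containing CALL gets score 1.0 and a higher attention logit.
scores = [1.0 if "CALL" in inst else 0.0 for inst in instances]
logits = [2.0 * s for s in scores]

bag_score, weights = attention_pool(scores, logits)
most_suspicious = instances[weights.index(max(weights))]
print(f"contract score: {bag_score:.3f}")
print("highest-attention instance:", most_suspicious)
```

The point of the sketch is that only a contract-level label is needed for training in such a setup, while the attention weights indicate which execution path contributed most to the prediction; that path is what a later validation step, such as symbolic execution, would examine.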

Hunting Vulnerable Smart Contracts via Graph Embedding Based Bytecode Matching

Cross-Modality Mutual Learning for Enhancing Smart Contract Vulnerability Detection on Bytecode

Smart Contract Vulnerability Detection Technique: A Survey

A Smart Contract Vulnerability Detection Method Based on Program Dependency Graph

Toward Vulnerability Detection for Ethereum Smart Contracts Using Graph-Matching Network

Optimizing smart contract vulnerability detection via multi-modality code and entropy embedding

Smart Contract Vulnerability Detection: From Pure Neural Network to Interpretable Graph Feature and Expert Pattern Fusion

A New Smart Contract Anomaly Detection Method by Fusing Opcode and Source Code Features for Blockchain Services

An Efficient Code-Embedding-Based Vulnerability Detection Model for Ethereum Smart Contracts

Smart Contract Vulnerability Detection based on Static Analysis and Multi-Objective Search

SmartDagger: a Bytecode-Based Static Analysis Approach for Detecting Cross-Contract Vulnerability

Combining Graph Neural Networks with Expert Knowledge for Smart Contract Vulnerability Detection

VulHunter: Hunting Vulnerable Smart Contracts at EVM bytecode-level via Multiple Instance Learning

Extended Abstract of Combine Sliced Joint Graph with Graph Neural Networks for Smart Contract Vulnerability Detection

Smart Contract Vulnerability Detection Based on Multi Graph Convolutional Neural Networks with Self-attention

An Efficient Smart Contract Vulnerability Detector Based on Semantic Contract Graphs Using Approximate Graph Matching

Particle Swarm Algorithm for Smart Contract Vulnerability Detection Based on Semantic Web

Checking Smart Contracts with Structural Code Embedding

SCVHunter: Smart Contract Vulnerability Detection Based on Heterogeneous Graph Attention Network

Bytecode Similarity Detection of Smart Contract across Optimization Options and Compiler Versions Based on Triplet Network

Graph Neural Networks Enhanced Smart Contract Vulnerability Detection of Educational Blockchain