Smart Contract Vulnerability Detection with Self-Ensemble Pre-Trained Language Models

Chaofan Dai, Huahua Ding, Wubin Ma, Yahui Wu
DOI: https://doi.org/10.1109/cits61189.2024.10607982
2024-01-01
Abstract: Smart contracts are decentralized applications deployed extensively on blockchains. Because of their economic nature, vulnerabilities in smart contracts can cause significant economic and property losses and disrupt the stability of the Ethereum ecosystem. Detecting smart contract vulnerabilities is therefore of paramount importance. Current mainstream detection methods rely on manually designed heuristic algorithms, which are hard to reuse across application scenarios, are time-consuming, and exhibit suboptimal accuracy. To improve detection effectiveness, this paper proposes SESCD, a method for detecting timestamp vulnerabilities in smart contracts based on self-ensembling pre-trained models. The approach first identifies potential data propagation paths for timestamp vulnerabilities, prunes them, and uses self-ensembling pre-trained models to learn from these paths. The training process is further optimized through knowledge distillation to improve the model's ability to determine whether a smart contract contains a timestamp vulnerability. SESCD demonstrates superior vulnerability detection and generalization capabilities and alleviates the performance instability caused by insufficient training data. To validate its effectiveness, comparative experiments against 13 mainstream smart contract vulnerability detection methods are conducted on a real-world dataset of smart contracts. Experimental results show that SESCD achieves precision, recall, and F1 scores of 0.91, 0.93, and 0.92, respectively, in detecting timestamp vulnerabilities. Compared to the 13 mainstream methods, SESCD achieves average relative improvements of 28%, 30%, and 30%, significantly enhancing the detection of timestamp vulnerabilities.
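As a quick consistency check on the reported metrics, the F1 score is the harmonic mean of precision and recall, so it can be recomputed from the two values given in the abstract. The sketch below is only illustrative: the function name `f1_score` is an assumption made for this example and is not taken from the paper or its code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for timestamp-vulnerability detection.
reported_precision = 0.91
reported_recall = 0.93

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 2))  # 0.92, matching the F1 score reported in the abstract
```

The recomputed value agrees with the reported F1 of 0.92 after rounding to two decimal places.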