Smart Contract Bytecode Similarity Detection Based on Self-supervised Learning

Zhongyuan Qin, Yadong Shi, Hui Zuo, Xuxian Jiang
DOI: https://doi.org/10.1109/ICSIP57908.2023.10271080
2023-07-08
Abstract: Code similarity detection is crucial for security audits of smart contracts. It enables important audit tasks such as vulnerability mining and malicious contract detection. Because the majority of smart contracts on Ethereum do not publish their source code, detecting similarity directly from bytecode is of great significance. This paper proposes a bytecode similarity detection method based on self-supervised learning. The method recovers the control flow graph (CFG) from the bytecode symbolically, by simulating the execution of every instruction on the Ethereum Virtual Machine (EVM), and then performs similarity detection at the function level. A self-supervised model extracts features from the bytecode, and these are combined with features obtained from the stack during CFG generation to detect bytecode similarity. Experimental results show that the proposed method outperforms the baseline.
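To make the CFG recovery step more concrete, the following is a minimal sketch of splitting EVM bytecode into basic blocks and connecting them. It is not the authors' implementation: the abstract describes a symbolic simulation of the stack, whereas this sketch resolves a jump target only when it is pushed by the instruction immediately before JUMP or JUMPI. The function names `disassemble` and `build_cfg` are hypothetical; the opcode values are the standard EVM ones.

```python
# Illustrative sketch only: static basic-block splitting for EVM bytecode.
# Assumption: a jump target is resolved only when a PUSH directly precedes the jump.
STOP, JUMP, JUMPI, JUMPDEST = 0x00, 0x56, 0x57, 0x5B
PUSH1, PUSH32 = 0x60, 0x7F
# STOP, JUMP, RETURN, REVERT, INVALID, SELFDESTRUCT never fall through.
TERMINATORS = {STOP, JUMP, 0xF3, 0xFD, 0xFE, 0xFF}


def disassemble(code: bytes):
    """Return (pc, opcode, immediate) tuples; immediate is None for non-PUSH ops."""
    out, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if PUSH1 <= op <= PUSH32:
            n = op - PUSH1 + 1  # number of immediate bytes
            out.append((pc, op, int.from_bytes(code[pc + 1:pc + 1 + n], "big")))
            pc += 1 + n
        else:
            out.append((pc, op, None))
            pc += 1
    return out


def build_cfg(code: bytes):
    """Return (blocks, edges): blocks maps start pc to instructions, edges is a set of pairs."""
    instrs = disassemble(code)
    # A block starts at pc 0, at every JUMPDEST, and after any jump or terminator.
    leaders = {0}
    for i, (pc, op, _) in enumerate(instrs):
        if op == JUMPDEST:
            leaders.add(pc)
        if (op in TERMINATORS or op == JUMPI) and i + 1 < len(instrs):
            leaders.add(instrs[i + 1][0])

    blocks, current = {}, None
    for ins in instrs:
        if ins[0] in leaders:
            current = ins[0]
            blocks[current] = []
        blocks[current].append(ins)

    jumpdests = {pc for pc, op, _ in instrs if op == JUMPDEST}
    starts = sorted(blocks)
    edges = set()
    for idx, start in enumerate(starts):
        body = blocks[start]
        last_op = body[-1][1]
        nxt = starts[idx + 1] if idx + 1 < len(starts) else None
        if last_op in (JUMP, JUMPI):
            # Jump edge: only when the target is a pushed constant that is a JUMPDEST.
            if len(body) >= 2 and body[-2][2] is not None and body[-2][2] in jumpdests:
                edges.add((start, body[-2][2]))
            # JUMPI also falls through when the condition is false.
            if last_op == JUMPI and nxt is not None:
                edges.add((start, nxt))
        elif last_op not in TERMINATORS and nxt is not None:
            edges.add((start, nxt))  # plain fall-through into the next block
    return blocks, edges


# PUSH1 1; PUSH1 7; JUMPI; PUSH1 0; JUMPDEST; STOP
blocks, edges = build_cfg(bytes.fromhex("600160075760005b00"))
print(sorted(blocks))  # [0, 5, 7]
print(sorted(edges))   # [(0, 5), (0, 7), (5, 7)]
```

Jumps whose targets are computed at run time are left without an outgoing jump edge here. Resolving those is exactly where tracking the stack during simulated execution, as the abstract describes, becomes necessary.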
Computer Science