Phish2vec: A Temporal and Heterogeneous Network Embedding Approach for Detecting Phishing Scams on Ethereum

Zhutian Lin, Xi Xiao, Guangwu Hu, Bin Zhang, Qixu Liu, Xiapu Luo
DOI: https://doi.org/10.1109/secon58729.2023.10287480
2023-01-01
Abstract: The exponential growth of Ethereum transactions has resulted in a significant increase in phishing scams, leading to substantial financial losses in recent years. Current machine learning and deep learning approaches to classification have been found inadequate for large-scale, label-imbalanced Ethereum scenarios. To address this issue, we propose Phish2vec, a novel network embedding approach that takes transaction temporality and heterogeneity into account when detecting phishing scams on Ethereum. Our approach begins by producing a transaction sub-network through data collection and preprocessing, which includes a novel Statistics-Based Sampling (SBS) method to address label leakage. To generate sequences that contain more comprehensive information, we then use two types of sequence generators: the Temporal-based Sequences Generator (TSG) and the Heterogeneous-based Sequences Generator (HSG). By concatenating the sequences generated by TSG and HSG and feeding them into Word2vec and a Fully Connected neural network (FC), our approach identifies phishing accounts with an F1-score as high as 82.05%, which significantly outperforms classic schemes such as DeepWalk (67.29%), Trans2vec (74.78%), and Node2vec (70.91%).
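The abstract does not specify how TSG builds its sequences, so the following is only a minimal sketch of one generic way to produce time-respecting account sequences from a transaction list, not the authors' method. It assumes each transaction is a hypothetical `(sender, receiver, timestamp)` tuple and that a walk may only follow edges whose timestamps do not decrease. All function names and the toy data are illustrative assumptions.

```python
import random
from collections import defaultdict


def build_adjacency(transactions):
    """Map each sender to its outgoing (receiver, timestamp) edges."""
    adjacency = defaultdict(list)
    for sender, receiver, timestamp in transactions:
        adjacency[sender].append((receiver, timestamp))
    return adjacency


def temporal_walk(adjacency, start, max_length, rng):
    """Random walk from `start` whose edge timestamps never decrease."""
    walk = [start]
    current, last_time = start, float("-inf")
    while len(walk) < max_length:
        # Keep only outgoing edges that are not earlier than the last one used.
        candidates = [
            (neighbor, t)
            for neighbor, t in adjacency.get(current, [])
            if t >= last_time
        ]
        if not candidates:
            break
        current, last_time = rng.choice(candidates)
        walk.append(current)
    return walk


def generate_sequences(transactions, walks_per_node=2, max_length=5, seed=0):
    """Generate several temporal walks starting from every sending account."""
    rng = random.Random(seed)
    adjacency = build_adjacency(transactions)
    sequences = []
    for node in sorted(adjacency):
        for _ in range(walks_per_node):
            sequences.append(temporal_walk(adjacency, node, max_length, rng))
    return sequences


# Hypothetical toy transactions (assumption, not data from the paper).
toy_transactions = [
    ("A", "B", 1),
    ("B", "C", 2),
    ("C", "A", 3),
    ("B", "D", 0),
]
sequences = generate_sequences(toy_transactions)
```

Sequences of this kind are lists of account identifiers, so they could in principle be concatenated with sequences from a second generator and passed to any skip-gram implementation as "sentences"; how Phish2vec actually combines TSG and HSG output is described only at the level of the abstract above.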