Detecting Financial Bots on the Ethereum Blockchain

Thomas Niedermayer, Pietro Saggese, Bernhard Haslhofer
DOI: https://doi.org/10.1145/3589335.3651959
2024-03-29
Abstract: The integration of bots in Distributed Ledger Technologies (DLTs) fosters efficiency and automation. However, their use is also associated with predatory trading and market manipulation, and can pose threats to system integrity. It is therefore essential to understand the extent of bot deployment in DLTs; despite this, current detection systems are predominantly rule-based and lack flexibility. In this study, we present a novel approach that utilizes machine learning for the detection of financial bots on the Ethereum platform. First, we systematize existing scientific literature and collect anecdotal evidence to establish a taxonomy for financial bots, comprising 7 categories and 24 subcategories. Next, we create a ground-truth dataset consisting of 133 human and 137 bot addresses. Third, we employ both unsupervised and supervised machine learning algorithms to detect bots deployed on Ethereum. The highest-performing clustering algorithm is a Gaussian Mixture Model with an average cluster purity of 82.6%, while the highest-performing model for binary classification is a Random Forest with an accuracy of 83%. Our machine learning-based detection mechanism contributes to understanding the Ethereum ecosystem dynamics by providing additional insights into the current bot landscape.
Cryptography and Security, Machine Learning
### What problems does this paper attempt to solve?

This paper aims to solve the problem of detecting financial bots on the Ethereum blockchain. Specifically, the researchers are concerned with:

1. **Understanding the types and behaviors of bots**: By systematizing existing literature and collecting anecdotal evidence, the researchers established a taxonomy of financial bots with 7 categories and 24 subcategories. This helps to comprehensively understand the financial bot ecosystem on Ethereum.
2. **Creating a real-world dataset**: The researchers created a ground-truth dataset containing 133 human addresses and 137 bot addresses, providing a basis for subsequent model training and evaluation.
3. **Developing a machine-learning-based detection method**: The researchers use unsupervised and supervised machine-learning algorithms to detect financial bots on Ethereum. Specifically:
   - The optimal clustering algorithm is the Gaussian Mixture Model (GMM), with an average cluster purity of 82.6%.
   - The optimal binary classification model is Random Forest, with an accuracy of 83%.
4. **Identifying key features**: Using explainable AI techniques, the researchers found that features based on time, frequency, gas price, and gas limit are most important for model performance.

### Main contributions of the paper

- **Providing a detailed taxonomy of financial bots**, covering various types of bots and their behavior patterns.
- **Creating the first publicly available real-world dataset** for training and evaluating bot-detection models.
- **Developing a machine-learning-based detection mechanism** that is more flexible and effective than existing rule-based systems.
- **Revealing the impact of bots in the Ethereum ecosystem**, emphasizing the importance of monitoring their presence and impact to ensure system security and stability.

Through these efforts, the researchers hope to better understand bot activities in the Ethereum ecosystem and propose a new, machine-learning-based detection method to address the potential risks brought by bots.
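
The two headline metrics, average cluster purity for the clustering step and accuracy for the binary classifier, can be made concrete with a small sketch. The code below is an illustration only and is not taken from the paper: the function names, the toy cluster assignments, and the choice of an unweighted mean over clusters are all assumptions, and the paper may average purity differently (for example, weighted by cluster size).

```python
from collections import Counter, defaultdict


def cluster_purities(cluster_ids, labels):
    """Map each cluster id to its purity: the share of its majority label."""
    members = defaultdict(list)
    for cid, label in zip(cluster_ids, labels):
        members[cid].append(label)
    return {
        cid: Counter(lbls).most_common(1)[0][1] / len(lbls)
        for cid, lbls in members.items()
    }


def average_purity(cluster_ids, labels):
    """Unweighted mean of per-cluster purities (an assumed averaging scheme)."""
    purities = cluster_purities(cluster_ids, labels)
    return sum(purities.values()) / len(purities)


def accuracy(predicted, actual):
    """Fraction of predictions that match the ground-truth labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical toy data: ten addresses assigned to two clusters.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
labels = ["bot", "bot", "bot", "human",
          "human", "human", "human", "human", "bot", "bot"]

# Label every address with the majority label of its cluster,
# then score those labels as if they were classifier predictions.
majority = {0: "bot", 1: "human"}
predicted = [majority[cid] for cid in cluster_ids]

print(cluster_purities(cluster_ids, labels))
print(average_purity(cluster_ids, labels))
print(accuracy(predicted, labels))
```

In practice the cluster assignments would come from a fitted clustering model such as a Gaussian Mixture Model, and the predictions from a trained classifier such as a Random Forest; the metric computations themselves stay the same.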