Address Clustering Heuristics for Ethereum

Friedhelm Victor
DOI: https://doi.org/10.1007/978-3-030-51280-4_33
2020-01-01
Abstract: For many years, address clustering for the identification of entities has been the basis for a variety of graph-based investigations of the Bitcoin blockchain and its derivatives. It has proven especially useful in the field of fraud detection. With the popularization and increasing use of alternative blockchains, the question arises of how to recognize entities in these new systems. Currently, there are no heuristics that can be applied directly to Ethereum’s account balance model. This drawback also applies to other smart contract platforms like EOS or NEO, for which previous transaction network analyses have been limited to address graphs. In this paper, we show how addresses can be clustered in Ethereum, yielding entities that are likely in control of multiple addresses. We propose heuristics that exploit patterns related to deposit addresses, multiple participation in airdrops, and token authorization mechanisms. We quantify the applicability of each individual heuristic over the first four years of the Ethereum blockchain and illustrate identified entities in a sample token network. Our results show that we can cluster 17.9% of all active externally owned account addresses, indicating that there are more than 340,000 entities that are likely in control of multiple addresses. Comparing the heuristics, we conclude that the deposit address heuristic is currently the most effective approach.