Performance Comparison Analysis of ArangoDB, MySQL, and Neo4j: An Experimental Study of Querying Connected Data

Johan Sandell, Einar Asplund, Workneh Yilma Ayele, Martin Duneld
2024-01-31
Abstract: Choosing and developing performant database solutions helps organizations optimize their operational practices and decision-making. As graph data becomes more common, databases that handle big data with complex relationships need to deliver high and consistent performance. However, legacy database technologies such as MySQL are designed for relational data and must run more complex queries to retrieve graph data. Previous research has examined performance aspects such as CPU and memory usage, but measurements of energy usage and server temperature are lacking. This paper therefore evaluates and compares state-of-the-art graph and relational databases on these performance aspects to allow a more informed selection of technologies. Graph-based big data applications benefit from an informed selection of database technologies for data retrieval and analytics problems. The results show that Neo4j is faster at querying connected data than MySQL and ArangoDB; energy, CPU, and memory usage results are also reported.
Databases
What problem does this paper attempt to address?
This paper addresses how to select and develop high-performance database solutions that optimize operational practices and decision-making as graph data becomes increasingly common. The authors experimentally compare three databases, ArangoDB, MySQL, and Neo4j, on queries over connected data, focusing on query time, CPU usage, memory usage, energy consumption, and server temperature. The study finds that Neo4j is faster than MySQL and ArangoDB at querying connected data, and it also reports energy, CPU, and memory usage. The paper points out that although previous research has covered CPU and memory usage, energy consumption and server temperature have not been sufficiently evaluated. The work therefore aims to help developers make a more informed technology choice by comparing the performance of current graph and relational databases.
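To make the idea of timing a connected-data query concrete, the following is a minimal sketch in Python. It is not the authors' benchmark: the in-memory adjacency list, the `friends_within` traversal, and the `benchmark` helper are all hypothetical stand-ins for a real database query and measurement setup, and the sketch covers only query time, not CPU, memory, energy, or temperature.

```python
import statistics
import time
from collections import deque


def friends_within(graph, start, max_depth):
    """Return all nodes reachable from `start` in at most `max_depth` hops.

    Hypothetical stand-in for a connected-data query; a real study would
    send an equivalent query to the database under test.
    """
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    seen.discard(start)  # the start node is not part of the result
    return seen


def benchmark(query, repetitions=30):
    """Run `query` repeatedly and summarize wall-clock latency in seconds."""
    timings = []
    for _ in range(repetitions):
        t0 = time.perf_counter()
        query()
        timings.append(time.perf_counter() - t0)
    return {
        "runs": repetitions,
        "mean_s": statistics.mean(timings),
        "stdev_s": statistics.stdev(timings),  # spread indicates consistency
    }


# Tiny illustrative graph (assumed data, not from the paper).
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
result = friends_within(graph, "a", 2)
stats = benchmark(lambda: friends_within(graph, "a", 2))
```

Reporting the standard deviation alongside the mean reflects the abstract's emphasis on consistent as well as high performance; a real comparison would replace the lambda with a call through each database's client driver.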