Testing Graph Database Systems via Equivalent Query Rewriting
Pinjia He,Qiuyang Mang,Hanfei Chen,Aoyang Fang,Boxi Yu
DOI: https://doi.org/10.1145/3597503.3639200
2024-04-12
Abstract:Graph Database Management Systems (GDBMS), which utilize graph models for data storage and execute queries via graph tra-versals, have seen ubiquitous usage in real-world scenarios such as recommendation systems, knowledge graphs, and social networks. Much like Relational Database Management Systems (RDBMS), GDBMS are not immune to bugs. These bugs typically manifest as logic errors that yield incorrect results (e.g., omitting a node that should be included), performance bugs (e.g., long execution time caused by redundant graph scanning), and exception issues (e.g., unexpected or missing exceptions). This paper adapts Equivalent Query Rewriting (EQR) to GDBMS testing. EQR rewrites a GDBMS query into equivalent ones that trigger distinct query plans, and checks whether they exhibit discrepancies in system behaviors. To facilitate the realization of EQR, we propose a general concept called Abstract Syntax Graph (ASG). Its core idea is to embed the semantics of a base query into the paths of a graph, which can be utilized to generate new queries with customized properties (e.g., equivalence). Given a base query, an ASG is constructed and then an equivalent query can be gener-ated by finding paths collectively carrying the complete semantics of the base query. To this end, we further design Random Walk Covering (RWC), a simple yet effective path covering algorithm. As a practical implementation of these ideas, we develop a tool GRev, which has successfully detected 22 previously unknown bugs across 5 popular GDBMS, with 15 of them being confirmed. In particular, 14 of the detected bugs are related to improper implementation of graph data retrieval in GDBMS, which is challenging to identify for existing techniques.
Computer Science