Finding Cross-rule Optimization Bugs in Datalog Engines

Chi Zhang, Linzhang Wang, Manuel Rigger
DOI: https://doi.org/10.1145/3649815
2024-01-01
Abstract: Datalog is a popular and widely used declarative logic programming language. Datalog engines apply many cross-rule optimizations; bugs in them can cause incorrect results. To detect such optimization bugs, we propose an automated testing approach called Incremental Rule Evaluation (IRE), which synergistically tackles the test oracle and test case generation problems. The core idea behind the test oracle is to compare the results of an optimized program and a program without cross-rule optimization; any difference indicates a bug in the Datalog engine. Our core insight is that, for an optimized, incrementally generated Datalog program, we can evaluate all rules individually by constructing a reference program to disable the optimizations that are performed among multiple rules. Incrementally generating test cases not only allows us to apply the test oracle for every newly generated rule; it also lets us ensure that every newly added rule generates a non-empty result with a given probability and eschew recomputing already-known facts. We implemented IRE as a tool named Deopt, and evaluated Deopt on four mature Datalog engines, namely Soufflé, CozoDB, μZ, and DDlog, and discovered a total of 30 bugs. Of these, 13 were logic bugs, while the remaining were crash and error bugs. Deopt can detect all bugs found by queryFuzz, a state-of-the-art approach. Of the bugs identified by Deopt, queryFuzz might be unable to detect 5. Our incremental test case generation approach is efficient; for example, for test cases containing 60 rules, our incremental approach can produce 1.17× (for DDlog) to 31.02× (for Soufflé) as many valid test cases with non-empty results as the naive random method. We believe that the simplicity and the generality of the approach will lead to its wide adoption in practice.
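The oracle described in the abstract can be illustrated with a toy sketch in Python. This is not Deopt's implementation: the names `evaluate`, `check_new_rule`, `hop2`, and `src` are hypothetical, rules are modeled as plain Python functions rather than Datalog text, and the sketch assumes that each newly added rule defines a fresh head relation. The `engine` argument stands in for the optimizing engine under test; the reference side evaluates only the new rule, with the results of earlier rules supplied as input facts.

```python
from typing import Callable, Dict, List, Set, Tuple

Fact = Tuple[int, ...]
Database = Dict[str, Set[Fact]]
# A rule is (head relation name, function deriving head facts from a database).
Rule = Tuple[str, Callable[[Database], Set[Fact]]]
Engine = Callable[[List[Rule], Database], Database]


def evaluate(rules: List[Rule], facts: Database) -> Database:
    """Naive fixpoint evaluation (toy stand-in for a Datalog engine)."""
    db = {name: set(rows) for name, rows in facts.items()}
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            derived = body(db)
            current = db.setdefault(head, set())
            if not derived <= current:
                current |= derived
                changed = True
    return db


def check_new_rule(rules: List[Rule], new_rule: Rule, facts: Database,
                   known: Database, engine: Engine = evaluate) -> Database:
    """Compare the whole program against the new rule evaluated in isolation.

    `known` holds the input facts plus the results of all earlier rules.
    Returns `known` extended with the new relation; raises on a mismatch.
    """
    head, _ = new_rule
    # Optimized side: all rules so far plus the new one, on the input facts.
    optimized = engine(rules + [new_rule], facts).get(head, set())
    # Reference side: only the new rule, with known results as input facts.
    reference = evaluate([new_rule], known).get(head, set())
    if optimized != reference:
        raise AssertionError(f"mismatch for {head}: {optimized} != {reference}")
    updated = dict(known)
    updated[head] = reference
    return updated


def hop2(db: Database) -> Set[Fact]:
    # hop2(a, c) :- edge(a, b), edge(b, c).
    edges = db.get("edge", set())
    return {(a, c) for (a, b) in edges for (b2, c) in edges if b == b2}


def src(db: Database) -> Set[Fact]:
    # src(a) :- hop2(a, _).
    return {(a,) for (a, _c) in db.get("hop2", set())}


facts: Database = {"edge": {(1, 2), (2, 3)}}
rules: List[Rule] = []
known: Database = dict(facts)
for rule in [("hop2", hop2), ("src", src)]:
    known = check_new_rule(rules, rule, facts, known)
    rules.append(rule)
```

Because `known` already stores the results of earlier rules, the reference side never recomputes them, which mirrors the abstract's point about avoiding recomputation of already-known facts. In a real setting, `engine` would invoke the Datalog engine under test rather than the toy evaluator.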