Practical Reasoning in DatalogMTL

Dingmin Wang, Przemysław A. Wałęga, Pan Hu, Bernardo Cuenca Grau
2024-01-05
Abstract: DatalogMTL is an extension of Datalog with metric temporal operators that has found an increasing number of applications in recent years. Reasoning in DatalogMTL is, however, of high computational complexity, which makes reasoning in modern data-intensive applications challenging. In this paper we present a practical reasoning algorithm for the full DatalogMTL language, which we have implemented in a system called MeTeoR. Our approach effectively combines an optimised (but generally non-terminating) materialisation (a.k.a. forward chaining) procedure, which provides scalable behaviour, with an automata-based component that guarantees termination and completeness. To ensure favourable scalability of the materialisation component, we propose a novel seminaïve materialisation procedure for DatalogMTL enjoying the non-repetition property, which ensures that each specific rule application will be considered at most once throughout the entire execution of the algorithm. Moreover, our materialisation procedure is enhanced with additional optimisations which further reduce the number of redundant computations performed during materialisation by disregarding rules as soon as it is certain that they cannot derive new facts in subsequent materialisation steps. Our extensive evaluation supports the practicality of our approach.
Logic in Computer Science
What problem does this paper attempt to address?
The paper addresses the problem of efficient reasoning in **DatalogMTL**, an extension of Datalog with metric temporal operators. Reasoning in DatalogMTL has high computational complexity, which makes it difficult to apply in modern data-intensive applications. The paper targets the following issues:

1. **High computational complexity**: Reasoning in DatalogMTL is ExpSpace-complete in combined complexity and PSpace-complete in data complexity (that is, with respect to the size of the data). This makes reasoning over large datasets very challenging.
2. **Lack of guarantees for termination and completeness**: Existing DatalogMTL reasoning systems either do not support recursive programs or cannot guarantee termination and completeness. For example, Brandt et al. (2018) implemented a prototype reasoner based on query rewriting, but it applies only to non-recursive DatalogMTL programs; the temporal extension of the Vadalog system implements the full DatalogMTL language but does not guarantee termination.
3. **Lack of practical reasoning algorithms**: Although low-complexity fragments of DatalogMTL and alternative semantics have been studied theoretically, the design and implementation of reasoning algorithms that are efficient in practice remains underexplored.

To address these issues, the paper proposes a practical algorithm that combines optimized forward chaining (i.e., materialization) with automata-based reasoning. The main contributions are:

1. **Optimized semi-naive materialization process**: The procedure tracks the facts newly derived in each materialization step and ensures that each rule instance is considered at most once, which reduces redundant computation.
2. **Further optimized materialization process**: During execution, rules are disregarded as soon as it is certain that they cannot derive new facts in subsequent steps, which further reduces redundant computation.
3. **Combination of materialization and automata-based reasoning**: Most of the computation is delegated to the scalable materialization component, and automata-based techniques are used only when necessary to ensure termination and completeness.

The paper validates the performance and scalability of the algorithm through experimental evaluation on multiple benchmarks. The results show that the algorithm handles non-trivial programs and datasets containing millions of temporal facts. In most cases materialization alone is sufficient to answer queries, and automata-based reasoning is needed only in rare instances.
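
The semi-naive idea described above can be illustrated with a minimal sketch. It is not the MeTeoR implementation: it assumes plain Datalog without metric temporal operators or intervals, and all names (`match`, `join`, `seminaive`, `RULES`, `FACTS`) are hypothetical. It shows only how tracking newly derived facts (`delta`) and restricting earlier body atoms to older facts keeps each rule instance from being considered more than once.

```python
# Illustrative sketch only: semi-naive evaluation for plain Datalog.
# Temporal operators and intervals, which DatalogMTL adds, are NOT modelled.
# Atoms and facts are tuples: (predicate, term1, term2, ...).
# Assumption: terms starting with an uppercase letter are variables.

def is_var(term):
    return isinstance(term, str) and term[:1].isupper()

def match(atom, fact, subst):
    """Extend subst so that atom equals fact, or return None on mismatch."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    subst = dict(subst)  # copy so the caller's substitution is untouched
    for term, value in zip(atom[1:], fact[1:]):
        if is_var(term):
            if subst.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return subst

def join(body, sources, subst=None):
    """Yield substitutions matching body[i] against facts in sources[i]."""
    subst = {} if subst is None else subst
    if not body:
        yield subst
        return
    for fact in sources[0]:
        extended = match(body[0], fact, subst)
        if extended is not None:
            yield from join(body[1:], sources[1:], extended)

def seminaive(rules, facts):
    """Return (all derivable facts, number of rule instances considered)."""
    total = set(facts)   # everything derived so far
    delta = set(facts)   # facts that were new in the previous round
    instances = 0
    while delta:
        old = total - delta
        derived = set()
        for head, body in rules:
            # Atom i must match a new fact; atoms before i match only old
            # facts, atoms after i match any fact. Every rule instance that
            # uses at least one new fact is therefore enumerated exactly once.
            for i in range(len(body)):
                sources = [old] * i + [delta] + [total] * (len(body) - i - 1)
                for subst in join(body, sources):
                    instances += 1
                    derived.add(
                        (head[0],) + tuple(subst.get(t, t) for t in head[1:])
                    )
        delta = derived - total
        total |= delta
    return total, instances

# Example program: reachability over a three-edge chain a -> b -> c -> d.
RULES = [
    (("path", "X", "Y"), [("edge", "X", "Y")]),
    (("path", "X", "Z"), [("path", "X", "Y"), ("edge", "Y", "Z")]),
]
FACTS = {("edge", "a", "b"), ("edge", "b", "c"), ("edge", "c", "d")}

total, instances = seminaive(RULES, FACTS)
```

On this example the program has six distinct rule instances (three for each rule), and the counter reports exactly six, whereas a naive loop that re-evaluates every rule over all facts in each round would revisit instances it has already applied. A DatalogMTL procedure would additionally have to manage the time intervals attached to facts, which this sketch leaves out.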