Using SAT and SQL for Pattern Mining in Relational Databases

E. Coquery, Jean-Marc Petit, L. Sais
Abstract: In this paper, we present ongoing work bridging the gap between pattern mining, SQL and SAT for a particular class of patterns. We extend the work presented in [2], which proposes a logical query language for rule patterns satisfying Armstrong’s axioms. Our contributions are the following. First, we allow a large part of the relational tuple calculus (SQL) to be used in the specification of queries. Second, we propose a Boolean encoding of the query that can be used to compute answers even for queries that are not Armstrong-compliant. Experiments have been performed on top of Derby (an embedded Java DBMS) and a modified version of MiniSat to show the feasibility of the approach.
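As an illustration of the kind of rule pattern the abstract refers to, functional dependencies are a standard example of rules governed by Armstrong’s axioms, and whether one holds in a relation can be checked with a single SQL query. The sketch below is not the paper’s query language or its Boolean encoding; it uses Python’s built-in sqlite3 module instead of Derby, and the table `emp`, its columns, and the helper names `fd_holds` and `build_demo` are hypothetical, chosen only for the example.

```python
import sqlite3


def fd_holds(conn, table, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in `table`.

    Assumptions of this sketch: `lhs` and `rhs` are non-empty, disjoint lists
    of trusted column names (they are interpolated into the SQL text).
    """
    lhs_cols = ", ".join(lhs)
    rhs_cols = ", ".join(rhs)
    # A violation is a group of tuples that agree on lhs but carry
    # more than one distinct combination of rhs values.
    query = (
        f"SELECT 1 FROM (SELECT DISTINCT {lhs_cols}, {rhs_cols} FROM {table}) "
        f"GROUP BY {lhs_cols} HAVING COUNT(*) > 1 LIMIT 1"
    )
    return conn.execute(query).fetchone() is None


def build_demo():
    """Create a small in-memory relation used only for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, city TEXT)")
    conn.executemany(
        "INSERT INTO emp VALUES (?, ?, ?)",
        [
            ("ann", "db", "lyon"),
            ("bob", "db", "lyon"),
            ("cat", "ai", "lens"),
            ("dan", "ai", "lyon"),
        ],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo()
    print(fd_holds(conn, "emp", ["name"], ["dept"]))  # each name has one dept
    print(fd_holds(conn, "emp", ["dept"], ["city"]))  # dept "ai" spans two cities
```

Enumerating every rule that holds would require one such check per candidate rule, which is presumably where a SAT-based search over a Boolean encoding becomes attractive; that part is not sketched here.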
What problem does this paper attempt to address?
The problem this paper attempts to solve is how to explain, within a single model, two observations from ultra-relativistic heavy-ion collisions: the production rate of high-mass lepton pairs (Drell-Yan pairs) at high center-of-mass energy, which suggests that protons retain high energy after multiple collisions inside the nucleus, and the energy loss associated with soft processes such as meson production. Specifically, experiments have observed that in high-energy proton-nucleus (pA) scattering the Drell-Yan production cross-section exhibits a linear nuclear dependence, indicating that these pairs are mainly produced in the initial collision and that the protons lose almost no energy while passing through the nucleus. This contradicts the standard pure strong-interaction two-body cascade model, which predicts that the proton energy gradually decreases as the number of collisions increases. In addition, experiments have found that in collisions of large systems such as lead-lead, the particle spectra, which include a large number of mesons, show significant energy loss, in contrast to the Drell-Yan case. To resolve this contradiction, the paper proposes a new two-stage cascade model. The first stage is a high-energy, fast cascade that handles the collisions between the initial nucleons and any hard partons that may be released, ignoring the energy loss caused by soft processes. The second stage is re-initialized from the result of the first and runs a low-energy, slow cascade to simulate the energy loss and meson production of soft processes.
Through this separation of time scales, the model can describe both the rapid production of Drell-Yan pairs and the subsequent energy loss in soft processes, providing a unified dynamical framework for these seemingly contradictory experimental results. Re-initializing between the two stages also reduces the model's dependence on the choice of reference frame and improves its stability and accuracy.