Independence in Infinite Probabilistic Databases

Martin Grohe, Peter Lindner
DOI: https://doi.org/10.1145/3549525
IF: 2.269
2022-10-27
Journal of the ACM
Abstract: Probabilistic databases (PDBs) model uncertainty in data. The current standard is to view PDBs as finite probability spaces over relational database instances. Since many attributes in typical databases have infinite domains, such as integers, strings, or real numbers, it is often more natural to view PDBs as infinite probability spaces over database instances. In this article, we lay the mathematical foundations of infinite probabilistic databases. Our focus then is on independence assumptions. Tuple-independent PDBs play a central role in the theory and practice of PDBs. Here we study infinite tuple-independent PDBs as well as related models such as infinite block-independent disjoint PDBs. While the standard model of PDBs focuses on a set-based semantics, we also study tuple-independent PDBs with a bag semantics and independence in PDBs over uncountable fact spaces.
computer science, information systems, theory & methods, software engineering, hardware & architecture