Making problems tractable on big data via preprocessing with polylog-size output

Jiannan Yang, Hanpin Wang, Yongzhi Cao
DOI: https://doi.org/10.48550/arXiv.1510.00229
2015-10-01
Abstract: To provide a dichotomy between those queries that can be made feasible on big data after appropriate preprocessing and those for which preprocessing does not help, Fan et al. developed the $\sqcap$-tractability theory. This theory provides a formal foundation for understanding the tractability of query classes in the context of big data. Along this line, we introduce a novel notion of $\sqcap'$-tractability in this paper. Inspired by some technologies used to deal with big data, we place a restriction on the preprocessing function, which limits the function to producing a relatively small database as output, at most polylog-size in the size of the input database. At the same time, we bound the redundant information produced when re-factorizing data and queries for preprocessing. These changes aim to link our theory more closely to practice. We define two complexity classes to denote the classes of Boolean queries that are $\sqcap'$-tractable themselves and that can be made $\sqcap'$-tractable, respectively. Based on a new factorization in our complexity classes, we investigate two reductions, which differ in whether they allow re-factorizing the data and query parts. We verify the transitivity and compatibility properties of the reductions and analyze the complete problems and sizes of the complexity classes. We conclude that all PTIME classes of Boolean queries can be made $\sqcap'$-tractable, as in the $\sqcap$-tractability theory. Somewhat surprisingly, we prove that the set of all $\sqcap'$-tractable queries is strictly smaller than that of all $\sqcap$-tractable queries, and thus the set of $\sqcap'$-tractable queries is properly contained in that of PTIME queries. In this way, we obtain a new complexity class inside the complexity class of PTIME queries.
Computational Complexity