Approximation of Probabilistic Maximal Frequent Itemset Mining over Uncertain Sensed Data

Sheng Chen,Lihai Nie,Xiaoyi Tao,Zhiyang Li,Laiping Zhao
DOI: https://doi.org/10.1109/access.2020.2997409
IF: 3.9
2020-01-01
IEEE Access
Abstract:Event detection by discovering frequent itemsets is very popular in sensor network communities. However, the recorded data is often a probability rather than a determined value in a really productive environment as sensed data is often affected by noise. In this paper, we study to detect events by finding frequent patterns over probabilistic sensor data under the Possible World Semantics. This is technically challenging as probabilistic records can generate an exponential number of possible worlds. Although several efficient algorithms are proposed in the literature, it is still difficult to mine probabilistic maximal frequent items (PMFIs) in large uncertain database due to the high time complexity. To address this issue, we employ approximate idea to further reduce the time complexity from O(nlogn) to O(n) and propose a two-step solution (Aproximation Probabilistic Frequent Itemset-MAX, APFI-MAX in short) including PMFI candidates generation and PMFIs confirmation. We also provide the necessary proofs of our approximation method to make APFI-MAX more solid and convincing. Finally, extensive experiments have been conducted on synthetic and real databases, demonstrating that the proposed APFI-MAX always running faster than state-of-art methods under different parameter settings.
What problem does this paper attempt to address?