Probabilistic model for truth discovery with mean and median check framework

Songtao Ye, Junjie Wang, Hongjie Fan, Zhiqiang Zhang
DOI: https://doi.org/10.1016/j.knosys.2021.107482
2021-12-01
Abstract: In the era of big data, information can be collected from various sources. Unfortunately, information provided by multiple sources on the same entity is inevitably conflicting. Due to the ubiquitous existence of data conflicts, truth discovery has recently attracted considerable attention. Several truth discovery methods focus on providing a point estimate for the truth of each entity and exhibit completely different performances on the same input dataset. Therefore, an appropriate truth discovery method should be adopted to fit the unknown source reliability distributions. To address this, we approach truth discovery from another perspective. We theoretically verify that if the absolute distance between the mean and median value is large, then there must be incorrect claims with large errors in the input dataset. Accordingly, we propose a mean and median check (MMC) framework for truth detection, error claim removal, and iteration-stopping criteria. The experiments demonstrate that MMC can effectively remove incorrect claims provided by unreliable sources. Furthermore, the performance of state-of-the-art truth discovery methods can be significantly improved if MMC is used for input data preprocessing.
computer science, artificial intelligence
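
The mean and median check described in the abstract can be illustrated with a small sketch. This is not the authors' MMC algorithm, whose details are not given here; it only shows the underlying idea that a large gap between the mean and the median signals claims with large errors. The function name `mmc_filter`, the `threshold` parameter, the removal rule (drop the claim farthest from the median), and the sample values are all assumptions made for illustration.

```python
from statistics import mean, median


def mmc_filter(claims, threshold):
    """Illustrative mean/median check (hypothetical, not the paper's method).

    While the absolute gap between mean and median exceeds `threshold`,
    remove the claim farthest from the median. The gap doubles as the
    stopping criterion for the loop.
    """
    kept = list(claims)
    removed = []
    # Keep at least two claims so the check stays meaningful.
    while len(kept) > 2 and abs(mean(kept) - median(kept)) > threshold:
        med = median(kept)
        worst = max(kept, key=lambda v: abs(v - med))
        kept.remove(worst)
        removed.append(worst)
    return kept, removed


# Five sources report a value for the same entity; one is far off.
claims = [20.1, 19.8, 20.3, 20.0, 95.0]
kept, removed = mmc_filter(claims, threshold=0.5)
print("kept:", kept)
print("removed:", removed)
```

With the sample values, the single outlying claim pulls the mean far from the median, so it is removed first; the remaining claims then pass the check and the loop stops. The filtered claims could then be passed to any truth discovery method, in line with the preprocessing use the abstract describes.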