Approximating Global Optimum for Probabilistic Truth Discovery

Shi Li, Jinhui Xu, Minwei Ye
DOI: https://doi.org/10.1007/s00453-020-00715-5
2020-01-01
Abstract: The problem of truth discovery arises in many areas, such as databases, data mining, data crowdsourcing, and machine learning. It seeks trustworthy information from possibly conflicting data provided by multiple sources. Due to its practical importance, the problem has been studied extensively in recent years. Two competing models have been proposed for truth discovery: the weight-based model and the probabilistic model. While (1 + ϵ)-approximations have already been obtained for the weight-based model, no quality-guaranteed solution has yet been found for the probabilistic model. In this paper, we focus on the probabilistic model and formulate it as a geometric optimization problem. Based on a sampling technique and a few other ideas, we achieve the first (1 + ϵ)-approximation solution. Our techniques can also be used to solve the more general multi-truth discovery problem. We validate our method by conducting experiments on both synthetic and real-world datasets (teaching evaluation data) and comparing its performance with that of some existing approaches. In these experiments, our solutions are closer to both the truth and the global optimum. The general technique we developed has the potential to be used to solve other geometric optimization problems.