Minimizing Entropy for Crowdsourcing with Combinatorial Multi-Armed Bandit

Yiwen Song,Haiming Jin
DOI: https://doi.org/10.1109/INFOCOM42981.2021.9488800
2021-01-01
Abstract:Nowadays, crowdsourcing has become an increasingly popular paradigm for large-scale data collection, annotation, and classification. Today's rapid growth of crowdsourcing platforms calls for effective worker selection mechanisms, which oftentimes have to operate with a priori unknown worker reliability. We discover that the empirical entropy of workers' results, which measures the uncertainty in the final aggregated results, naturally becomes a suitable metric to evaluate the outcome of crowdsourcing tasks. Therefore, this paper designs a worker selection mechanism that minimizes the empirical entropy of the results submitted by participating workers. Specifically, we formulate worker selection under sequentially arriving tasks as a combinatorial multi-armed bandit problem, which treats each worker as an arm, and aims at learning the best combination of arms that minimize the cumulative empirical entropy. By information theoretic methods, we carefully derive an estimation of the upper confidence bound for empirical entropy minimization, and leverage it in our minimum entropy upper confidence bound (ME-UCB) algorithm to balance exploration and exploitation. Theoretically, we prove that ME-UCB has a regret upper bound of O(1), which surpasses existing submodular UCB algorithms. Our extensive experiments with both a synthetic and real-world dataset empirically demonstrate that our ME-UCB algorithm outperforms other state-of-the-art approaches.
What problem does this paper attempt to address?