Crowdgame: A Game-Based Crowdsourcing System For Cost-Effective Data Labeling

Tongyu Liu, Jingru Yang, Ju Fan, Zhewei Wei, Guoliang Li, Xiaoyong Du
DOI: https://doi.org/10.1145/3299869.3320221
2019-01-01
Abstract: Large-scale data labeling has become a major bottleneck for many applications, such as machine learning and data integration. This paper presents CROWDGAME, a crowdsourcing system that harnesses the crowd to gather data labels cost-effectively. CROWDGAME focuses on generating high-quality labeling rules that substantially reduce labeling cost while preserving quality. It first generates candidate rules, then uses a game-based crowdsourcing approach to select the rules with high coverage and accuracy, and finally applies the selected rules to label the data. We have implemented CROWDGAME and provide a user-friendly interface for deploying labeling applications. We will demonstrate CROWDGAME in two representative data labeling scenarios: entity matching and relation extraction.
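To make the notions of rule coverage and accuracy concrete, here is a minimal Python sketch for an entity-matching setting. It is an illustration under stated assumptions, not the paper's algorithm: the toy record pairs, the reference labels, the `different_brand_rule` function, and the `coverage_and_accuracy` helper are all hypothetical, and the game-based rule selection itself is not reproduced here.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical toy data: candidate record pairs for entity matching,
# with reference labels (True = the two records describe the same entity).
pairs: List[Tuple[str, str]] = [
    ("apple iphone 8 64gb", "iphone 8 apple 64 gb"),
    ("sony tv 55 inch", "samsung tv 55 inch"),
    ("dell xps 13 laptop", "hp spectre 13 laptop"),
    ("canon eos 80d", "canon eos 80d camera body"),
]
reference: List[bool] = [True, False, False, True]

# A labeling rule returns a label for the pairs it applies to,
# and None (abstain) for pairs it does not cover.
Rule = Callable[[str, str], Optional[bool]]


def different_brand_rule(a: str, b: str) -> Optional[bool]:
    """Hypothetical rule: if the first tokens differ, label the pair as non-matching."""
    if a.split()[0] != b.split()[0]:
        return False
    return None  # abstain when the first tokens agree


def coverage_and_accuracy(
    rule: Rule, pairs: List[Tuple[str, str]], reference: List[bool]
) -> Tuple[float, float]:
    """Coverage: fraction of pairs the rule labels.
    Accuracy: fraction of labeled pairs whose label agrees with the reference."""
    labels = [rule(a, b) for a, b in pairs]
    covered = [(lab, ref) for lab, ref in zip(labels, reference) if lab is not None]
    coverage = len(covered) / len(pairs)
    accuracy = sum(lab == ref for lab, ref in covered) / len(covered) if covered else 0.0
    return coverage, accuracy


cov, acc = coverage_and_accuracy(different_brand_rule, pairs, reference)
print(f"coverage={cov:.2f}, accuracy={acc:.2f}")
```

In this toy example the rule labels three of the four pairs, but it mislabels the first pair because the brand name is not the leading token there. This shows why a rule with high coverage still needs its accuracy checked before it is used for labeling.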