Reverse-auction-based Crowdsourced Labeling for Active Learning.

Hai Tang, Mingjun Xiao, Guoju Gao, Hui Zhao
DOI: https://doi.org/10.1007/s11280-019-00744-3
2019-01-01
World Wide Web
Abstract: In the past few years, Machine Learning (ML) has aroused great interest in both the academic and industrial communities. ML is booming because of its huge application potential in many areas, such as facial recognition, natural language processing, and self-driving cars. Nevertheless, one of the key problems is the scarcity of labeled data. Fortunately, mobile crowdsourcing makes it possible to recruit mobile workers to label large-scale data by offering them small payments. In this paper, we use crowdsourcing to tackle the scarcity of training data in active learning, and propose an approximately truthful, individually rational, privacy-preserving incentive mechanism with a guaranteed approximate performance, based on the single-minded reverse auction for data labeling in active learning. Different from prior works, we take crowd workers' reliability into consideration when selecting data to be labeled, which can improve the labeling quality and the model performance. In addition, we employ differential privacy to preserve workers' bid privacy, because a worker's bid usually contains sensitive information. The simulation results demonstrate that, for the same number of iterations, the learning model is much more accurate than traditional active learning that does not take reliability into consideration.