Crowdsourcing with Self-paced Workers

Xiangping Kang, Guoxian Yu, Carlotta Domeniconi, Jun Wang, Wei Guo, Yazhou Ren, Lizhen Cui
DOI: https://doi.org/10.1109/icdm51629.2021.00038
2021-01-01
Abstract: Crowdsourcing is a popular and relatively economical way to harness human intelligence for tasks that are hard for computers. Due to diverse factors (i.e., task difficulty, worker capability, and incentives), the answers collected from different crowd workers vary in quality. Many approaches have been proposed to obtain high-quality answers and reduce the budget by modelling tasks, workers, or both. However, most existing approaches implicitly assume that a worker’s capability is fixed during the crowdsourcing process. In practice, such capability can improve as a worker gradually completes tasks from easy to hard, akin to the self-paced learning ability intrinsic to human beings. In this paper, we investigate crowdsourcing with self-paced workers, whose capability can be gradually boosted as they scrutinise and complete easy-to-hard tasks. Our proposed SPCrowd (Self-Paced Crowd worker) first asks workers to complete a set of golden tasks with known annotations and provides feedback to help workers capture the raw modes of tasks and to spark self-paced learning, which in turn facilitates the estimation of workers’ quality and tasks’ difficulty. It then introduces a task difficulty model to quantify the difficulty of tasks and rank them from easy to hard, and a benefit maximization criterion for task assignment, which dynamically monitors the quality of self-paced workers and assigns the sorted tasks to capable workers. In this way, a worker can successfully complete hard tasks after completing easier, related tasks. Experimental results on semi-simulated and real crowdsourcing projects show that SPCrowd controls quality better and saves more budget than competitive baselines.
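
The easy-to-hard assignment loop described in the abstract can be illustrated with a minimal Python sketch. It is not the SPCrowd algorithm: the abstract does not give the task difficulty model, the benefit criterion, or the way capability grows, so the logistic success probability, the "success probability minus cost" benefit, and the `gain` learning rate below are all assumptions made for illustration, as are the function and parameter names.

```python
import math


def p_correct(ability, difficulty):
    """Assumed logistic chance that a worker answers a task correctly."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def assign_self_paced(abilities, difficulties, cost=None, gain=0.3):
    """Assign tasks from easy to hard, each to the worker with the highest benefit.

    abilities:    dict worker -> current ability estimate (hypothetical scale)
    difficulties: dict task -> difficulty estimate (same hypothetical scale)
    cost:         dict worker -> per-task cost (hypothetical), defaults to 0
    gain:         hypothetical rate at which finishing a task raises ability
    """
    abilities = dict(abilities)  # work on a copy; the caller's dict is untouched
    cost = cost or {}
    plan = []
    # Rank tasks from easy to hard by estimated difficulty.
    for task in sorted(difficulties, key=difficulties.get):
        d = difficulties[task]
        # Assumed benefit: success probability minus the worker's cost.
        best = max(
            abilities,
            key=lambda w: p_correct(abilities[w], d) - cost.get(w, 0.0),
        )
        plan.append((task, best))
        # Assumed self-paced update: ability grows more when the task was a stretch.
        abilities[best] += gain * (1.0 - p_correct(abilities[best], d))
    return plan, abilities


workers = {"a": 0.0, "b": 1.0}
tasks = {"t_hard": 2.0, "t_easy": -1.0, "t_mid": 0.5}
plan, updated = assign_self_paced(workers, tasks)
plan_with_cost, _ = assign_self_paced(workers, tasks, cost={"b": 0.5})
```

Because each worker's ability is updated after every assignment, the worker chosen for a hard task has already been strengthened by the easier tasks assigned earlier, which is the intuition the abstract describes.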