Crowd & Prejudice: An Impossibility Theorem for Crowd Labelling without a Gold Standard

Nicolás Della Penna, Mark D. Reid
DOI: https://doi.org/10.48550/arXiv.1204.3511
2012-04-16
Abstract: A common use of crowd sourcing is to obtain labels for a dataset. Several algorithms have been proposed to identify uninformative members of the crowd so that their labels can be disregarded and the cost of paying them avoided. One common motivation of these algorithms is to do without any initial set of trusted labeled data. We analyse this class of algorithms as mechanisms in a game-theoretic setting to understand the incentives they create for workers. We find an impossibility result: without any ground truth, and when workers have access to commonly shared 'prejudices' upon which they agree but which are not informative of true labels, there are always equilibria in which all agents report the prejudice. A small amount of gold standard data is found to be sufficient to rule out these equilibria.
Social and Information Networks, Computer Science and Game Theory
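
The equilibrium argument in the abstract can be illustrated with a deliberately simplified toy model. This is a sketch under stated assumptions, not the paper's formal mechanism. Labels are binary and uniformly distributed. Each worker's private signal matches the true label with probability `q`, independently across workers. All workers see the same "prejudice" bit, which is independent of the truth. A worker is paid 1 for agreeing with a random peer on ordinary items, and `gold_reward` for matching the true label on a `gold_fraction` of items. All function and parameter names are hypothetical.

```python
# Toy peer-agreement payment model. Every modelling choice here is an
# assumption made for illustration and is not taken from the paper.

SIGNAL = "signal"        # report your own noisy private signal
PREJUDICE = "prejudice"  # report the shared, uninformative prejudice


def agreement_prob(mine, peer, q):
    """Probability that two reports agree under the toy assumptions."""
    if mine == PREJUDICE and peer == PREJUDICE:
        return 1.0  # both report the same shared bit
    if mine == SIGNAL and peer == SIGNAL:
        # Both signals are right, or both are wrong.
        return q * q + (1 - q) * (1 - q)
    return 0.5  # a signal is independent of the uniform prejudice bit


def correct_prob(mine, q):
    """Probability that a report matches the true label."""
    return q if mine == SIGNAL else 0.5


def expected_payoff(mine, peer, q=0.8, gold_fraction=0.0, gold_reward=1.0):
    """Expected per-item payment when all peers play the strategy `peer`."""
    peer_part = (1 - gold_fraction) * agreement_prob(mine, peer, q)
    gold_part = gold_fraction * gold_reward * correct_prob(mine, q)
    return peer_part + gold_part


def is_equilibrium(strategy, **kwargs):
    """True if no worker gains by deviating when everyone plays `strategy`."""
    other = SIGNAL if strategy == PREJUDICE else PREJUDICE
    stay = expected_payoff(strategy, strategy, **kwargs)
    deviate = expected_payoff(other, strategy, **kwargs)
    return stay >= deviate


if __name__ == "__main__":
    # Without gold data, everyone reporting the prejudice is self-enforcing.
    print(is_equilibrium(PREJUDICE))
    # With some gold items and a large enough gold reward, it is not.
    print(is_equilibrium(PREJUDICE, gold_fraction=0.1, gold_reward=20.0))
```

In this toy, the gold reward must be scaled up for a small gold fraction to break the prejudice equilibrium. That threshold is a property of the sketch and should not be read as a result from the paper.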