Abstract: Supervised learning algorithms generally need a large number of labels to achieve satisfactory performance, and collecting such labeled data is typically time-consuming and expensive. Crowdsourcing services now provide an effective way to collect labeled data at much lower cost. Hence, crowdsourced learning (CL), which performs learning with labeled data collected from crowdsourcing services, has become an active research topic in recent years. Most existing CL methods exploit only the labels from different workers (annotators) for learning while ignoring the attributes of the instances. In many real applications, however, the attributes of the instances are the most discriminative information for learning, so CL methods that use attributes have attracted increasing attention from CL researchers. One representative model of this kind is the personal classifier (PC) model, which has achieved state-of-the-art performance. However, the PC model makes the unreasonable assumption that all workers contribute equally to the final classification. This contradicts the fact that different workers have different quality (ability) in data labeling. In this paper, we propose a novel model, called robust personal classifier (RPC), for robust crowdsourced learning. Our model automatically learns an expertise score for each worker, which reflects the worker's inherent quality. The final classifier of our RPC model gives high weights to good workers and low weights to poor workers or spammers, which is more reasonable than the PC model's equal weighting of all workers. Furthermore, the learned expertise scores can be used to eliminate spammers or low-quality workers. Experiments on simulated datasets and UCI datasets show that the proposed model can dramatically outperform baseline models such as the PC model in terms of classification accuracy and the ability to detect spammers.

Robust Crowdsourced Learning

Learning from Crowds under Experts' Supervision

A Formalized Framework for Incorporating Expert Labels in Crowdsourcing Environment

KD-Crowd: a knowledge distillation framework for learning from crowds

Learning from Crowds with Annotation Reliability

NeuCrowd: Neural Sampling Network for Representation Learning with Crowdsourced Labels

Crowd-Certain: Label Aggregation in Crowdsourced and Ensemble Learning Classification

Reverse-auction-based Crowdsourced Labeling for Active Learning

Deep Active Learning with Crowdsourcing Data for Privacy Policy Classification

An Online Learning Approach to Improving the Quality of Crowd-Sourcing

Improving Learning-from-Crowds Through Expert Validation

PrefCLM: Enhancing Preference-based Reinforcement Learning with Crowdsourced Large Language Models

Obtaining High-Quality Label by Distinguishing Between Easy and Hard Items in Crowdsourcing

Collusion Detection and Ground Truth Inference in Crowdsourcing for Labeling Tasks

Multi-Label Crowdsourcing Learning With Incomplete Annotations

Efficiency of active learning for the allocation of workers on crowdsourced classification tasks

Learning from Crowds in the Presence of Schools of Thought

Unbiased Multi-Label Learning from Crowdsourced Annotations

Active Learning For Crowdsourcing Using Knowledge Transfer

Self-Taught Active Learning from Crowds

Deep Robust Subjective Visual Property Prediction in Crowdsourcing