Abstract: Supervised learning algorithms generally need a large number of labels to achieve satisfactory performance, and collecting such labeled data is typically time-consuming and expensive. Crowdsourcing services have recently provided an effective way to collect labeled data at much lower cost. Hence, crowdsourced learning (CL), which learns from labeled data collected through crowdsourcing services, has become an active research topic in recent years. Most existing CL methods exploit only the labels from different workers (annotators) and ignore the attributes of the instances. In many real applications, however, the attributes of the instances are the most discriminative information for learning, so CL methods that use attributes have attracted increasing attention. One representative model of this kind is the personal classifier (PC) model, which has achieved state-of-the-art performance. However, the PC model makes the unreasonable assumption that all workers contribute equally to the final classification, which contradicts the fact that different workers have different quality (ability) in data labeling. In this paper, we propose a novel model, called robust personal classifier (RPC), for robust crowdsourced learning. Our model automatically learns an expertise score for each worker that reflects the worker's inherent quality. The final classifier of the RPC model gives high weights to good workers and low weights to poor workers or spammers, which is more reasonable than the PC model's equal weighting of all workers. Furthermore, the learned expertise scores can be used to eliminate spammers or low-quality workers. Experiments on simulated datasets and UCI datasets show that the proposed model dramatically outperforms baseline models such as the PC model in terms of classification accuracy and ability to detect spammers.
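
To make the idea of expertise-weighted personal classifiers concrete, the following is a minimal sketch in Python with NumPy. It is not the RPC objective or its optimization procedure, neither of which is specified in the abstract. It assumes one logistic classifier per worker, and it uses a stand-in heuristic for the expertise score: each worker's rate of agreement with the current consensus classifier, rescaled so that chance-level agreement scores zero. The function names (`fit_logistic`, `simulate`, `combine`), the simulated data, and the worker accuracies are all hypothetical.

```python
import numpy as np

def fit_logistic(X, y, lr=0.5, steps=300, l2=1e-2):
    """Gradient-descent logistic regression without bias; y takes values in {0, 1}."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * (X.T @ (p - y) / len(y) + l2 * w)
    return w

def simulate(n=400, d=5, accuracies=(0.95, 0.9, 0.5, 0.5, 0.5), seed=0):
    """Hypothetical data: each worker flips the true label with probability 1 - accuracy."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    y = (X @ rng.normal(size=d) > 0).astype(float)
    flips = rng.random((len(accuracies), n)) > np.array(accuracies)[:, None]
    labels = np.where(flips, 1.0 - y, y)  # shape: (workers, instances)
    return X, y, labels

def combine(X, labels, rounds=5):
    """Weight personal classifiers by a heuristic expertise score (an assumption)."""
    personal = np.array([fit_logistic(X, worker_y) for worker_y in labels])
    n_workers = len(labels)
    weights = np.full(n_workers, 1.0 / n_workers)  # equal weights to start
    for _ in range(rounds):
        consensus = (X @ (weights @ personal) > 0).astype(float)
        agreement = (labels == consensus).mean(axis=1)
        # Chance-level agreement (0.5) maps to a score of 0.
        scores = np.clip(2.0 * agreement - 1.0, 0.0, None)
        total = scores.sum()
        weights = scores / total if total > 0 else np.full(n_workers, 1.0 / n_workers)
    return personal, weights

X, y, labels = simulate()
personal, weights = combine(X, labels)
final_w = weights @ personal                    # expertise-weighted final classifier
accuracy = ((X @ final_w > 0) == (y > 0.5)).mean()
print("worker weights:", np.round(weights, 3))
print("accuracy of weighted classifier:", round(float(accuracy), 3))
```

In this simulation the first two workers are reliable and the last three label at random, so a scheme of this kind should concentrate weight on the first two. Thresholding the learned weights is one simple way to flag likely spammers, though the threshold would need to be chosen per dataset.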

Robust Crowdsourced Learning
