Abstract: Crowdsourcing is a useful and economical approach to data annotation. To obtain high-quality annotations, various aggregation approaches have been developed that take into account different factors affecting the quality of aggregated answers. However, existing methods generally focus on single-label (multi-class and binary) tasks and ignore the correlation between labels, which may compromise the quality of the aggregated answers. In this paper, we introduce a Multi-Label answer aggregation approach based on Joint Matrix Factorization (ML-JMF). ML-JMF selectively and jointly factorizes the sample-label association matrices collected from different annotators into products of individual and shared low-rank matrices. It thereby exploits the robustness of low-rank matrix approximation to noise, and it reduces the impact of unreliable annotators by assigning small or zero weights to their annotation matrices. In addition, it exploits the correlation among labels through the shared low-rank matrix, and the similarity between annotators through the individual low-rank matrices, to guide the factorization. ML-JMF pursues the low-rank matrices via a unified objective function, which it optimizes with an iterative technique, and then uses the optimized low-rank matrices and weights to infer the ground-truth labels. Experimental results on multi-label datasets show that ML-JMF outperforms competitive methods in inferring ground-truth labels, identifies unreliable annotators, and remains robust against their misleading answers.
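The weighted joint factorization outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the objective function, so the squared-error loss, the ridge terms, the L2 penalty on the annotator weights (which lets simplex projection produce exactly zero weights), the SVD initialization, and all names (`joint_factorize`, `project_simplex`, `alpha`, `lam`) are assumptions made for illustration. The annotator-similarity term mentioned in the abstract is omitted.

```python
import numpy as np


def project_simplex(v):
    """Euclidean projection of v onto {mu : mu >= 0, sum(mu) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    ranks = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / ranks > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)


def joint_factorize(annotations, k, alpha, lam=0.1, n_iters=30):
    """Hypothetical sketch: A_w ~ U_w @ V.T with a shared V and weights mu."""
    A = [np.asarray(a, dtype=float) for a in annotations]
    m = len(A)
    # Assumed initialization: top-k right singular vectors of the mean matrix.
    _, _, vt = np.linalg.svd(np.mean(A, axis=0), full_matrices=False)
    V = vt[:k].T                      # shared label factor, shape (q, k)
    mu = np.full(m, 1.0 / m)          # start from uniform annotator weights
    ridge = lam * np.eye(k)
    for _ in range(n_iters):
        # Individual factors: ridge least squares per annotator, V fixed.
        G = np.linalg.inv(V.T @ V + ridge)
        U = [a @ V @ G for a in A]
        # Shared factor: weighted ridge least squares, all U_w fixed.
        num = sum(w * a.T @ u for w, a, u in zip(mu, A, U))
        den = sum(w * u.T @ u for w, u in zip(mu, U)) + ridge
        V = num @ np.linalg.inv(den)
        # Weights: minimize sum(mu * loss) + alpha * ||mu||^2 on the simplex.
        losses = np.array([np.sum((a - u @ V.T) ** 2) for a, u in zip(A, U)])
        mu = project_simplex(-losses / (2.0 * alpha))
    # Consensus: weighted sum of the low-rank reconstructions, thresholded.
    scores = sum(w * u @ V.T for w, u in zip(mu, U))
    return (scores >= 0.5).astype(int), mu


# Synthetic demo: 3 groups of correlated labels, 4 careful annotators
# (5% of entries flipped) and 2 annotators who answer at random.
rng = np.random.default_rng(0)
groups = np.arange(200) % 3
truth = (np.arange(12)[None, :] // 4 == groups[:, None]).astype(int)
annotations = [np.where(rng.random(truth.shape) < 0.05, 1 - truth, truth)
               for _ in range(4)]
annotations += [rng.integers(0, 2, size=truth.shape) for _ in range(2)]

pred, mu = joint_factorize(annotations, k=3, alpha=100.0)
print("annotator weights:", np.round(mu, 3))
print("accuracy:", (pred == truth).mean())
```

In this sketch `alpha` controls how evenly weight is spread: a small value concentrates the weight on the best-fitting annotators, while a large value moves the weights toward uniform. A suitable value depends on the scale of the reconstruction losses.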
