Abstract:Learning from crowds describes that the annotations of training data are obtained with crowd-sourcing services. Multiple annotators each complete their own small part of the annotations, where labeling mistakes that depend on annotators occur frequently. Modeling the label-noise generation process by the noise transition matrix is a power tool to tackle the label noise. In real-world crowd-sourcing scenarios, noise transition matrices are both annotator- and instance-dependent. However, due to the high complexity of annotator- and instance-dependent transition matrices (AIDTM), annotation sparsity, which means each annotator only labels a little part of instances, makes modeling AIDTM very challenging. Prior works simplify the problem by assuming the transition matrix is instance-independent or using simple parametric ways, which lose modeling generality. Motivated by this, we target a more realistic problem, estimating general AIDTM in practice. Without losing modeling generality, we parameterize AIDTM with deep neural networks. To alleviate the modeling challenge, we suppose every annotator shares its noise pattern with similar annotators, and estimate AIDTM via knowledge transfer. We hence first model the mixture of noise patterns by all annotators, and then transfer this modeling to individual annotators. Furthermore, considering that the transfer from the mixture of noise patterns to individuals may cause two annotators with highly different noise generations to perturb each other, we employ the knowledge transfer between identified neighboring annotators to calibrate the modeling. Theoretical analyses are derived to demonstrate that both the knowledge transfer from global to individuals and the knowledge transfer between neighboring individuals can help model general AIDTM. Experiments confirm the superiority of the proposed approach on synthetic and real-world crowd-sourcing data.

Learning from Crowds with Annotation Reliability

Learning from Crowds under Experts' Supervision

A Formalized Framework for Incorporating Expert Labels in Crowdsourcing Environment

Human-centred Design on Crowdsourcing Annotation Towards Improving Active Learning Model Performance

An Interactive Method to Improve Crowdsourced Annotations

Crowd-Certain: Label Aggregation in Crowdsourced and Ensemble Learning Classification

Efficient Online Crowdsourcing with Complex Annotations

Transferring Annotator- and Instance-dependent Transition Matrix for Learning from Crowds

Hierarchical Crowdsourcing for Data Labeling with Heterogeneous Crowd.

Labelling Training Samples Using Crowdsourcing Annotation for Recommendation

Active Learning For Crowdsourcing Using Knowledge Transfer

Rethinking Crowdsourcing Annotation: Partial Annotation with Salient Labels for Multi-Label Image Classification

Attention-Aware Answers of the Crowd

Multi-Label Crowdsourcing Learning With Incomplete Annotations

Crowdsourcing Label Quality: A Theoretical Analysis

Selective Verification Strategy for Learning from Crowds

An Online Learning Approach to Improving the Quality of Crowd-Sourcing

KD-Crowd: a knowledge distillation framework for learning from crowds

Improving Learning-from-Crowds Through Expert Validation.

Obtaining High-Quality Label by Distinguishing Between Easy and Hard Items in Crowdsourcing

Uncovering the Latent Structures of Crowd Labeling.