Abstract: Microtask crowdsourcing is a form of crowdsourcing in which work is decomposed into a set of small, self-contained tasks, each of which can typically be completed in a matter of minutes. Because the voluntary participants on the Internet differ in capability and background knowledge, the answers collected from the crowd are ambiguous, which makes aggregating them into a final answer challenging. The choice of quality control strategy is therefore important for ensuring the quality of the crowdsourcing results. Previous work on answer estimation has mainly used the expectation–maximization (EM) approach. Unfortunately, EM provides only locally optimal solutions, and the estimated results depend on the initial values. In this paper, we extend the locally optimal result of EM and propose an approximate globally optimal algorithm for answer aggregation of crowdsourcing microtasks with binary answers. Our algorithm is expected to improve the accuracy of true answer estimation through further likelihood maximization. First, we present three worker quality evaluation models based on static and dynamic methods, and obtain locally optimal results by maximum likelihood estimation. Then, we propose a dominance ordering model (DOM) that uses the known worker responses and worker categories for the specified crowdsourcing task to reduce the space of potential task-response sequences while retaining the dominant sequences. Subsequently, we design a Cut-point neighbor detection algorithm that operates on the DOM and iteratively searches the reduced space for the approximate globally optimal estimate. We conduct extensive experiments on both simulated and real-world datasets, and the results show that the proposed approach obtains better estimation results and achieves higher performance than regular EM-based algorithms.

Attention-Aware Answers of the Crowd

Learning from Crowds under Experts' Supervision

Exploiting predicted answer in label aggregation to make better use of the crowd wisdom

Adaptive Crowdsourcing Via Self-Supervised Learning

Self-paced annotations of crowd workers

Cleaning Uncertain Data with Crowdsourcing - a General Model with Diverse Accuracy Rates

Crowdsourcing Label Quality: A Theoretical Analysis

Achieving Approximate Global Optimization of Truth Inference for Crowdsourcing Microtasks

Efficient Online Crowdsourcing with Complex Annotations

OnTac: Online Task Assignment for Crowdsourcing

Answer Inference for Crowdsourcing Based Scoring

Uncovering the Latent Structures of Crowd Labeling

Learning from Crowds with Annotation Reliability

Active learning with confidence-based answers for crowdsourcing labeling tasks

A Subjectivity-Aware Algorithm for Label Aggregation in Crowdsourcing

Hierarchical Crowdsourcing for Data Labeling with Heterogeneous Crowd

Collusion Detection and Ground Truth Inference in Crowdsourcing for Labeling Tasks

Globally Optimal Crowdsourcing Quality Management

Learning from Crowds in the Presence of Schools of Thought

Quality-Assured Synchronized Task Assignment in Crowdsourcing

Treating Crowdsourcing as Examination: How to Score Tasks and Online Workers?