The Effect of Class Imbalance and Order on Crowdsourced Relevance Judgments

Rehab K. Qarout, Alessandro Checco, Gianluca Demartini
DOI: https://doi.org/10.48550/arXiv.1609.02171
2016-09-04
Information Retrieval
Abstract: In this paper we study how the dominance of one class in the data affects the efficiency and effectiveness of the crowd workers who process it. We aim to understand whether seeing many negative examples biases workers, positively or negatively, when they identify positive labels. To test our hypothesis, we design an experiment in which crowd workers are asked to judge the relevance of documents presented in different orders. Our findings indicate a significant improvement in the quality of relevance judgments when relevant results are presented before non-relevant ones.