An Expert Validation Framework For Improving The Quality Of Crowdsourced Clustering

Liu Jiang, Zheng Qin, Zhipeng Li, Pengbo Shen, Shaohan Hu
DOI: https://doi.org/10.1007/978-3-030-36802-9_18
2019-01-01
Abstract: Crowdclustering is a cost-effective mechanism that learns a cluster structure from data and crowdsourced human pairwise labels. Although initial efforts have shown that crowdclustering can be effective, reliable crowdclustering is inherently challenging because crowdsourced labels are noisy and uncertain, and clustering quality is not guaranteed to be consistent across datasets. To improve the quality of crowdsourced clustering, we argue that expert validation is needed to post-process clustering results. To this end, we establish a novel expert validation framework comprising two components: a dynamic multi-criteria pair selection component that actively selects the most informative data pairs, and a pairwise label propagation component that amplifies the influence of the expert labels and incorporates them to guide the crowdclustering. Both components serve to minimize expert validation effort. Experimental results on six real-world and synthetic datasets show the effectiveness of both the overall approach and its key components.
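The two components named in the abstract can be illustrated with a minimal sketch. This is not the paper's method: the abstract does not state the selection criteria or the propagation rule, so the sketch assumes a single criterion (crowd uncertainty, measured as distance from 0.5) and assumes propagation by simple transitivity of "same cluster" labels. The names `most_uncertain_pair` and `propagate` are hypothetical.

```python
from itertools import combinations


def most_uncertain_pair(crowd_prob, validated):
    """Assumed criterion: pick the unvalidated pair whose crowd-estimated
    'same cluster' probability is closest to 0.5."""
    candidates = [p for p in crowd_prob if p not in validated]
    return min(candidates, key=lambda p: abs(crowd_prob[p] - 0.5))


def propagate(n_items, expert_labels):
    """Assumed propagation rule: expand expert pairwise labels by transitivity.

    expert_labels maps (i, j) -> True (same cluster) or False (different).
    Returns every pair label implied by the expert labels.
    """
    parent = list(range(n_items))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge items the expert marked as belonging together.
    for (i, j), same in expert_labels.items():
        if same:
            parent[find(i)] = find(j)

    # Record which merged groups the expert marked as different.
    cannot = set()
    for (i, j), same in expert_labels.items():
        if not same:
            cannot.add(frozenset((find(i), find(j))))

    implied = {}
    for i, j in combinations(range(n_items), 2):
        ri, rj = find(i), find(j)
        if ri == rj:
            implied[(i, j)] = True
        elif frozenset((ri, rj)) in cannot:
            implied[(i, j)] = False
    return implied


# Hypothetical crowd estimates for four items.
crowd_prob = {(0, 1): 0.9, (1, 2): 0.55, (2, 3): 0.1, (0, 3): 0.4}
next_pair = most_uncertain_pair(crowd_prob, validated={(1, 2)})
implied = propagate(4, {(0, 1): True, (1, 2): True, (2, 3): False})
```

In this toy setting, three expert answers determine all six pair labels, which is the sense in which propagation could reduce expert effort.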