Homophily-aware correction framework for crowdsourced labels using heterogeneous information network

Qingren Wang, Jian Lu, Wei Li, Jing Zhang, Victor S. Sheng
DOI: https://doi.org/10.1016/j.eswa.2022.116896
IF: 8.5
2022-08-01
Expert Systems with Applications
Abstract: Crowdsourcing provides a cost-effective and convenient way to collect labels, but it cannot guarantee their quality. On the one hand, the anonymous nature of crowdsourcing makes it impossible to obtain accurate and detailed information about the labelers who participate in tasks. On the other hand, most existing methods focus on the characteristics of individuals while ignoring the explicit and implicit interactive information among them. Moreover, existing approaches based on homogeneous information networks cannot distinguish the heterogeneity among labelers or among their corresponding relationships, which results in an irreversible loss of interactive information among labelers. Extensive observations show that labelers often provide the same (different) answers for tasks belonging to the same category if they hold highly similar (contrary) views. Therefore, in this paper, we first define such similar (contrary) views over the same task category as homophily among labelers. We then propose a novel Homophily-aware Correction Framework (HaCF), based on a heterogeneous information network, to model multiple explicit and implicit interactive relations among labelers, tasks, and categories. In addition, we propose a novel homophily-based label classifier to strengthen the impact of positive labels while reducing the influence of negative ones. Experimental results on seven real-world datasets show both the effectiveness of HaCF in improving the quality of crowdsourced labels and its extensibility when combined with inference algorithms.
computer science, artificial intelligence; engineering, electrical & electronic; operations research & management science
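
The abstract describes homophily as labelers giving the same (or different) answers on tasks of the same category. As a rough illustration only, and not the HaCF method itself, the sketch below computes a per-category agreement rate for each pair of labelers. The function name `category_agreement`, the input layout, and the sample labels are all hypothetical and chosen for this example; the paper's actual homophily measure and network construction are not given in the abstract.

```python
from collections import defaultdict
from itertools import combinations


def category_agreement(labels, task_category):
    """Per-category agreement rate for every pair of labelers.

    labels:        dict mapping (labeler, task) -> label   (assumed layout)
    task_category: dict mapping task -> category           (assumed layout)
    Returns a dict mapping (labeler_a, labeler_b, category) -> rate in [0, 1].
    """
    # Collect the answers each task received, keyed by labeler.
    by_task = defaultdict(dict)
    for (labeler, task), label in labels.items():
        by_task[task][labeler] = label

    agree = defaultdict(int)  # tasks where the pair gave the same label
    total = defaultdict(int)  # tasks both members of the pair labeled
    for task, answers in by_task.items():
        category = task_category[task]
        # Sort labeler names so each pair has one canonical key.
        for a, b in combinations(sorted(answers), 2):
            key = (a, b, category)
            total[key] += 1
            agree[key] += int(answers[a] == answers[b])

    return {key: agree[key] / total[key] for key in total}


# Hypothetical toy data: two labelers, three tasks, two categories.
labels = {
    ("w1", "t1"): 1, ("w2", "t1"): 1,
    ("w1", "t2"): 0, ("w2", "t2"): 0,
    ("w1", "t3"): 1, ("w2", "t3"): 0,
}
task_category = {"t1": "animals", "t2": "animals", "t3": "plants"}

rates = category_agreement(labels, task_category)
for key in sorted(rates):
    print(key, rates[key])
```

In this toy data the two labelers agree on every "animals" task and disagree on the single "plants" task, which corresponds to similar views in one category and contrary views in the other.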