Leveraging Attributes And Crowdsourcing For Join

Jianhong Feng, Jianhua Feng, Huiqi Hu
DOI: https://doi.org/10.1007/978-3-319-08010-9_47
2014-01-01
Abstract: High-quality join results are usually hard to achieve with machines alone. We adopt crowdsourcing to improve the quality of joins. However, depending on the number of generated pairs, hiring workers to verify them can be expensive. We propose a hybrid approach, called CSCER, that generates pairs by leveraging attributes and combines category, sorting, and clustering techniques. We also propose an adaptive attribute-selection strategy to efficiently generate pairs based on attributes. Experiments on a real crowdsourcing platform using real datasets indicate that our approaches reduce the overall cost compared with existing methods while achieving high-quality join results.
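As a rough illustration of why the number of generated pairs drives cost, the sketch below compares enumerating every record pair with pairing only records that share a categorical attribute value. It is not the CSCER algorithm, whose details the abstract does not give; the function names, the toy records, and the attribute name "category" are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def all_pairs(records):
    """Every unordered pair of record ids: n * (n - 1) / 2 candidates."""
    return list(combinations(sorted(records), 2))


def pairs_within_category(records, attribute):
    """Only pair records that share the same value of `attribute`."""
    groups = defaultdict(list)
    for record_id, fields in records.items():
        groups[fields[attribute]].append(record_id)
    pairs = []
    for ids in groups.values():
        pairs.extend(combinations(sorted(ids), 2))
    return sorted(pairs)


# Hypothetical toy records; the attribute name "category" is an assumption.
records = {
    1: {"name": "iPhone 5 16GB", "category": "phone"},
    2: {"name": "Apple iPhone 5", "category": "phone"},
    3: {"name": "ThinkPad X240", "category": "laptop"},
    4: {"name": "Lenovo ThinkPad X240", "category": "laptop"},
}

naive = all_pairs(records)
grouped = pairs_within_category(records, "category")
print(len(naive), len(grouped))
```

In this toy case, fewer pairs would be sent to workers for verification when records are grouped by the assumed attribute first. The trade-off is that a wrong or missing attribute value could hide a true match.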