Online Social Network Profile Linkage

Haochen Zhang, Min-Yen Kan, Yiqun Liu, Shaoping Ma
DOI: https://doi.org/10.1007/978-3-319-12844-3_17
2014-01-01
Abstract: Piecing together social signals from people in different online social networks is key for downstream analytics. However, users may have different usernames in different social networks, making the linkage task difficult. To address this problem of online social network user profile linkage, we explore a probabilistic approach that uses domain-specific prior knowledge. At scale, linkage approaches based on naive pairwise comparison, which has quadratic complexity, become prohibitively expensive. Our proposed threshold-based canopying framework, named OPL, reduces these pairwise comparisons and guarantees a theoretical upper bound of linear complexity with respect to the dataset size. We evaluate our approaches on real-world, large-scale datasets obtained from Twitter and LinkedIn. Our probabilistic classifier, which integrates prior knowledge into Naive Bayes, achieves over 85% F1-measure for pairwise linkage, comparable to state-of-the-art approaches.