A data-intensive approach for discovering user similarities in social behavioral interactions based on the Bayesian network.

Kun Yue, Hao Wu, Xiaodong Fu, Juan Xu, Zidu Yin, Weiyi Liu
DOI: https://doi.org/10.1016/j.neucom.2016.09.042
IF: 6
2017-01-01
Neurocomputing
Abstract: Discovering user similarities from social media establishes a basis for user targeting, product recommendation, and understanding the evolution of user relationships. User similarities depend not only on the topological structure but also on the dependence degrees between users. In this paper, we adopt the Bayesian network (BN), an important and popular probabilistic graphical model, as the underlying framework and propose a data-intensive approach for discovering user similarities. First, from massive social behavioral interactions, we give a method for measuring direct similarities between users and a MapReduce-based algorithm for constructing a BN that describes these similarities, called the user Bayesian network (UBN). We also outline how large-scale UBNs can be stored in a distributed file system. Then, to measure indirect similarities between users, we give a method for measuring the closeness of user connections in terms of the properties of the UBN's graphical structure. Further, we give a MapReduce-based algorithm for measuring the dependence degrees by means of the UBN's probabilistic inferences. Combining these two measures yields the indirect similarity degree between users, with its applicability guaranteed theoretically. Finally, we give experimental results that show the efficiency and effectiveness of our method.
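The abstract does not state the formula used for direct similarity, so the following is only an illustrative sketch of the general map/reduce counting pattern such a step could rely on. It assumes, hypothetically, that an interaction is a directed pair (source user, target user) and that the direct similarity of a pair is estimated as the share of the source user's interactions aimed at the target user. The function names and the sample data are invented for illustration and are not taken from the paper.

```python
from collections import defaultdict


def map_phase(interactions):
    """Emit count records for each directed pair and for each source user."""
    for src, dst in interactions:
        yield ("pair", src, dst), 1   # one interaction from src to dst
        yield ("total", src), 1       # one interaction initiated by src


def reduce_phase(records):
    """Sum the values of all records that share a key."""
    counts = defaultdict(int)
    for key, value in records:
        counts[key] += value
    return dict(counts)


def direct_similarity(interactions):
    """Assumed measure: count(src -> dst) / count(src -> anyone)."""
    counts = reduce_phase(map_phase(interactions))
    sims = {}
    for key, n in counts.items():
        if key[0] == "pair":
            _, src, dst = key
            sims[(src, dst)] = n / counts[("total", src)]
    return sims


# Hypothetical interaction log: (source user, target user)
interactions = [("a", "b"), ("a", "b"), ("a", "c"), ("b", "c")]
sims = direct_similarity(interactions)
for pair, value in sorted(sims.items()):
    print(pair, round(value, 3))
```

In a real MapReduce deployment the map and reduce functions would run on separate workers over partitions of the log; here they run in a single process only to show the data flow.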