A hybrid collaborative filtering model for social influence prediction in event-based social networks

Xiao Li, Xiang Cheng, Sen Su, Shuchen Li, Jianyu Yang
2018-01-01
Abstract: Event-based social networks (EBSNs) provide convenient online platforms for users to organize, attend, and share social events. Understanding users' social influence in social networks can benefit many applications, such as social recommendation and social marketing. In this paper, we focus on the problem of predicting users' social influence on upcoming events in EBSNs. We formulate this prediction problem as the estimation of the unobserved entries of a constructed user-event social influence matrix, where each entry represents the influence value of a user on an event. In particular, we define a user's social influence on a given event as the proportion of the user's friends who are influenced by him/her to attend the event. To solve this problem, we present a hybrid collaborative filtering model, namely the Matrix Factorization with Event-User Neighborhood (MF-EUN) model, which incorporates both event-based and user-based neighborhood methods into matrix factorization. Because the constructed social influence matrix is very sparse and few of its entries overlap, it is challenging to find reliable similar neighbors using the widely adopted similarity measures (e.g., Pearson correlation and cosine similarity). To address this challenge, we propose an additional information based neighborhood discovery (AID) method that considers both event-specific and user-specific features in EBSNs. The parameters of our MF-EUN model are determined by minimizing the associated regularized squared error function through stochastic gradient descent. We conduct a comprehensive performance evaluation on real-world datasets collected from DoubanEvent. Experimental results show that our proposed hybrid collaborative filtering model is superior to several alternatives, with RMSE and MAE reaching 0.248 and 0.1266 respectively in the 90% training data of 10 000
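
The two ingredients named in the abstract, the influence value as a proportion of friends and the regularized squared error minimized by stochastic gradient descent, can be illustrated with a minimal sketch. It is plain matrix factorization only, not the MF-EUN model: the event-based and user-based neighborhood terms and the AID method are omitted because the abstract does not specify them. The function names (`influence_value`, `train_mf`, `predict`, `rmse`, `mae`), the hyperparameters, and the input format of `(user, event, value)` triples are all assumptions made for illustration.

```python
import math
import random


def influence_value(n_influenced_friends, n_friends):
    """Proportion of a user's friends influenced to attend an event.

    How a friend is judged "influenced" is not given in the abstract,
    so the count is taken as an input here (assumption).
    """
    if n_friends == 0:
        return 0.0
    return n_influenced_friends / n_friends


def train_mf(entries, n_users, n_events, k=2, lr=0.05, reg=0.02,
             epochs=500, seed=0):
    """Plain matrix factorization fitted by SGD on observed entries.

    entries: list of (user_index, event_index, influence_value).
    Minimizes squared error plus L2 regularization on the factors.
    Hyperparameter defaults are illustrative, not taken from the paper.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_events)]
    mu = sum(v for _, _, v in entries) / len(entries)  # global mean offset
    samples = list(entries)
    for _ in range(epochs):
        rng.shuffle(samples)
        for u, e, v in samples:
            err = v - (mu + sum(P[u][f] * Q[e][f] for f in range(k)))
            for f in range(k):
                pu, qe = P[u][f], Q[e][f]
                # Gradient step on the regularized squared error.
                P[u][f] += lr * (err * qe - reg * pu)
                Q[e][f] += lr * (err * pu - reg * qe)
    return mu, P, Q


def predict(model, u, e):
    """Predicted influence value of user u on event e."""
    mu, P, Q = model
    return mu + sum(p * q for p, q in zip(P[u], Q[e]))


def rmse(model, entries):
    """Root mean squared error over the given entries."""
    return math.sqrt(
        sum((v - predict(model, u, e)) ** 2 for u, e, v in entries)
        / len(entries)
    )


def mae(model, entries):
    """Mean absolute error over the given entries."""
    return sum(abs(v - predict(model, u, e)) for u, e, v in entries) / len(entries)


# Toy data (made up for illustration): 3 users, 3 events, 6 observed entries.
toy = [(0, 0, 0.8), (0, 1, 0.6), (1, 0, 0.2),
       (1, 2, 0.1), (2, 1, 0.5), (2, 2, 0.3)]
model = train_mf(toy, n_users=3, n_events=3)
print("train RMSE:", rmse(model, toy), "train MAE:", mae(model, toy))
print("predicted influence of user 0 on event 2:", predict(model, 0, 2))
```

The final call estimates an unobserved entry of the toy user-event matrix, which mirrors the prediction task described above. A faithful implementation would add the neighborhood terms to the prediction rule and evaluate on held-out entries rather than on the training data.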