A New Similarity in Clustering Through Users' Interest and Social Relationship
Jianxiong Guo, Zhehao Zhu, Yucen Gao, Xiaofeng Gao
DOI: https://doi.org/10.1016/j.tcs.2024.114833
IF: 1.002
2024-01-01
Theoretical Computer Science
Abstract: Clustering is a fundamental technique in data mining, and similarity measurement plays a crucial role in it. Existing clustering algorithms, especially those for social networks, focus on users' properties while ignoring global measurement across social relationships. In this paper, we propose a new clustering algorithm that considers not only the distance between users' properties but also users' social influence. Social influence is further divided into mutual influence and self influence. For mutual influence, we model users' interests and measure their similarities by introducing areas and activities, thereby weighing the influence between users indirectly. Separately, to model self influence, we formulate a new propagation model, PR-Threshold++, by merging the PageRank algorithm with the Linear Threshold model. Based on these components, we design a novel similarity that exploits users' distance, mutual influence, and self influence. Finally, we adapt K-medoids to our similarity and evaluate its performance on real-world datasets in intensive simulations.
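
The last step of the abstract, running K-medoids on a custom similarity, can be illustrated with a minimal sketch. K-medoids needs only a matrix of pairwise dissimilarities, so any measure can be plugged in. The code below is not the paper's algorithm: the function name `k_medoids`, its parameters, and the toy 1-D data are assumptions made for illustration, and the dissimilarity matrix is a stand-in for the similarity the paper derives from distance, mutual influence, and self influence.

```python
import random
from typing import List, Sequence, Tuple


def k_medoids(
    dissim: Sequence[Sequence[float]],
    k: int,
    max_iter: int = 100,
    seed: int = 0,
) -> Tuple[List[int], List[int]]:
    """Alternating K-medoids on a precomputed dissimilarity matrix.

    Hypothetical helper: `dissim[i][j]` is any pairwise dissimilarity
    (a placeholder here, not the paper's measure).
    Returns (medoid indices, cluster label per point).
    """
    n = len(dissim)
    rng = random.Random(seed)
    medoids = rng.sample(range(n), k)  # random initial medoids

    for _ in range(max_iter):
        # Assignment step: each point joins its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i in range(n):
            nearest = min(medoids, key=lambda m: dissim[i][m])
            clusters[nearest].append(i)

        # Update step: within each cluster, pick the member with the
        # smallest total dissimilarity to the other members.
        new_medoids = []
        for m, members in clusters.items():
            if not members:  # keep the old medoid if its cluster is empty
                new_medoids.append(m)
                continue
            best = min(members, key=lambda c: sum(dissim[c][j] for j in members))
            new_medoids.append(best)

        if set(new_medoids) == set(medoids):
            break  # medoids stopped changing
        medoids = new_medoids

    # Label each point with the position of its nearest final medoid.
    labels = [
        min(range(len(medoids)), key=lambda idx: dissim[i][medoids[idx]])
        for i in range(n)
    ]
    return medoids, labels


# Toy data (assumed): six users on a line, forming two obvious groups.
positions = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
dissim = [[abs(a - b) for b in positions] for a in positions]
medoids, labels = k_medoids(dissim, k=2)
```

Because the routine never looks at raw features, replacing the toy matrix with one built from a different similarity changes the clustering without touching the K-medoids code.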