Multi-granularity Attribute Similarity Model for User Alignment Across Social Platforms under Pre-Aligned Data Sparsity

Yongqiang Peng, Xiaoliang Chen, Duoqian Miao, Xiaolin Qin, Xu Gu
DOI: https://doi.org/10.1016/j.ipm.2024.103866
2024-01-01
Abstract: Cross-platform User Alignment (UA) aims to identify accounts belonging to the same individual across multiple social network platforms. This study seeks to improve UA performance while reducing the amount of labeled sample data required. Previous research has focused heavily on model design and has not optimized the process as a whole, which makes it difficult to achieve good performance without heavy reliance on labeled data. This paper proposes a semi-supervised Multi-Granularity Attribute Similarity Model (MGASM). First, MGASM optimizes the embedding process through multi-granularity modeling at the levels of characters, words, articles, structures, and labels, and compensates for missing data by leveraging adjacent text attributes. Next, MGASM quantifies the correlation between attributes of the same granularity by constructing Multi-Granularity Attribute Cosine Distance Distribution Vectors (MA-CDDVs). These vectors form the basis for a binary classification similarity model trained to calculate similarity scores for user pairs. Additionally, an attribute reappearance score correction (ARSC) mechanism is introduced to further refine the ranking of candidate users. Extensive experiments on the Weibo-Douban and DBLP17-DBLP19 datasets show that, compared to state-of-the-art methods, the hit-precision of the MGASM series improves by 68.15% and 27.02%, respectively, reaching almost 100% precision, and the F1 score increases by 37.6% and 21.4%.
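The abstract does not say how an MA-CDDV is built, so the following is only a minimal sketch of one plausible reading: for a single granularity, compute the cosine distance between every pair of attribute embeddings from two accounts, then summarize those distances as a normalized histogram. The function names, the bin count, the equal-width binning over [0, 2], and the handling of zero vectors are all assumptions made for illustration, not the authors' method.

```python
import math


def cosine_distance(u, v):
    """Cosine distance 1 - cos(u, v), which lies in [0, 2]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Assumption: a zero vector carries no signal, so treat it as orthogonal.
        return 1.0
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return 1.0 - cos


def distance_distribution(embs_a, embs_b, num_bins=10):
    """Hypothetical distribution vector for one granularity.

    embs_a, embs_b: lists of embedding vectors for the two accounts.
    Returns a histogram of pairwise cosine distances, normalized to sum to 1.
    """
    counts = [0] * num_bins
    total = 0
    for u in embs_a:
        for v in embs_b:
            d = cosine_distance(u, v)
            # Map a distance in [0, 2] onto equal-width bins; clamp d == 2.
            idx = min(int(d / 2.0 * num_bins), num_bins - 1)
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * num_bins
    return [c / total for c in counts]


def multi_granularity_vector(user_a, user_b, granularities, num_bins=10):
    """Concatenate per-granularity distributions into one feature vector.

    user_a, user_b: dicts mapping a granularity name to a list of embeddings.
    """
    features = []
    for g in granularities:
        features.extend(
            distance_distribution(user_a.get(g, []), user_b.get(g, []), num_bins)
        )
    return features
```

Under this reading, the concatenated vector would serve as the input features of the binary classifier that scores a user pair; a granularity missing for either account contributes an all-zero block rather than being dropped, which keeps the feature length fixed.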