DENA: display name embedding method for Chinese social network alignment

Yao Li, Huilin Liu
DOI: https://doi.org/10.1007/s00521-022-08014-6
2022-12-27
Neural Computing and Applications
Abstract: Social network alignment, which aims to find node correspondences between social networks, is the cornerstone of fusing big data from different social networks. Most social network alignment solutions are designed for English-language environments. Hence, existing attribute-based solutions, which rely on features unique to English, are not suitable for Chinese social networks. Although structure-based methods are general, they suffer from the sparsity problem. To solve the Chinese social network alignment problem, this paper proposes a novel display name embedding method called DENA. It utilizes the morphological and phonetic information of Chinese characters to improve alignment accuracy. Specifically, DENA introduces a hierarchical n-gram processing framework to generate features from display names and their related morphological information (i.e., strokes) and phonetic information (i.e., pinyin). Then, a new graph called the display name graph is proposed to organize these features into an undirected, unweighted graph. By learning this graph, all features are embedded into low-dimensional vectors. The closeness between the embedding vectors of two display names therefore represents the probability that they are aligned. Experiments on real-world datasets show that DENA outperforms traditional classification-based methods and state-of-the-art word embedding methods in social network alignment.
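The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the n-gram ranges, and the node tagging scheme are assumptions made for illustration, and the stroke codes in the toy tables are placeholders rather than authoritative stroke data. The abstract says the graph is learned into embedding vectors; here a Jaccard overlap of graph neighbourhoods is swapped in as a simple stand-in for embedding closeness.

```python
def ngrams(seq, n_min, n_max):
    """Return the set of contiguous n-grams of seq for n_min <= n <= n_max."""
    return {seq[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(seq) - n + 1)}


def build_display_name_graph(names, pinyin, strokes, char_n=(1, 2), sub_n=(2, 3)):
    """Build an undirected, unweighted graph stored as adjacency sets.

    Hypothetical layout: name -> character n-grams -> pinyin/stroke n-grams.
    Nodes are tagged tuples such as ("name", s), ("char", g), ("pinyin", g).
    char_n should start at 1 so single-character nodes link back to the name.
    """
    graph = {}

    def add_edge(u, v):
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)

    for name in names:
        # Level 1: character n-grams of the display name itself.
        for gram in ngrams(name, *char_n):
            add_edge(("name", name), ("char", gram))
        # Level 2: phonetic and morphological n-grams of each character.
        for ch in set(name):
            for kind, table in (("pinyin", pinyin), ("stroke", strokes)):
                for gram in ngrams(table.get(ch, ""), *sub_n):
                    add_edge(("char", ch), (kind, gram))
    return graph


def features(graph, name):
    """Feature nodes within two hops of a name node (other names excluded)."""
    first = graph.get(("name", name), set())
    second = set().union(*(graph[v] for v in first))
    return {v for v in first | second if v[0] != "name"}


def jaccard(a, b):
    """Jaccard overlap, used here only as a stand-in for embedding closeness."""
    return len(a & b) / len(a | b) if a | b else 0.0


# Toy lookup tables. The stroke codes are illustrative placeholders.
PINYIN = {"小": "xiao", "晓": "xiao", "明": "ming", "张": "zhang", "伟": "wei"}
STROKES = {"小": "234", "晓": "2511153135", "明": "25113511",
           "张": "5153154", "伟": "321152"}

graph = build_display_name_graph(["小明", "晓明", "张伟"], PINYIN, STROKES)
sim_close = jaccard(features(graph, "小明"), features(graph, "晓明"))
sim_far = jaccard(features(graph, "小明"), features(graph, "张伟"))
print(sim_close, sim_far)
```

In this toy run, "小明" and "晓明" differ in their first character but share its pinyin, so they overlap far more than "小明" and "张伟" do. A character-only comparison would miss that phonetic similarity, which is the motivation the abstract gives for adding pinyin and stroke features.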
computer science, artificial intelligence