Abstract: User identity linkage across social networks is an essential problem for cross-network data mining. Since network structure, profile and content information describe different aspects of users, it is critical to learn effective user representations that integrate heterogeneous information. This paper proposes a novel framework with INformation FUsion and Neighborhood Enhancement (INFUNE) for user identity linkage. The information fusion component adopts a group of encoders and decoders to fuse heterogeneous information and generate discriminative node embeddings for preliminary matching. Then, these embeddings are fed to the neighborhood enhancement component, a novel graph neural network, to produce adaptive neighborhood embeddings that reflect the overlapping degree of neighborhoods of varying candidate user pairs. The importance of node embeddings and neighborhood embeddings are weighted for final prediction. The proposed method is evaluated on real-world social network data. The experimental results show that INFUNE significantly outperforms existing state-of-the-art methods.

What problem does this paper attempt to address?

This paper addresses user identity linkage across social networks. Because a user may hold accounts on several social networks, identifying which accounts belong to the same person is crucial for cross-network data mining. However, the correspondence between user accounts on different networks is usually unavailable, which makes user identity linkage an important research topic.

### Main contributions of the paper

1. **Information fusion component**: An information fusion component is proposed that simultaneously fuses users' structure, profile and content information. The paper presents this as the first embedding-based method to fuse all three types of information at once.
2. **Neighborhood enhancement component**: To exploit the information carried by potentially matching neighbors, a new graph neural network model is proposed that learns a neighborhood representation which varies with the candidate user pair.
3. **Experimental verification**: The effectiveness of INFUNE is verified through extensive experiments, and the results show that INFUNE significantly outperforms existing state-of-the-art methods.

### Method overview

The INFUNE framework contains two main components:

- **Information fusion component**: A set of encoders and decoders fuses heterogeneous information and generates node embeddings for preliminary matching.
- **Neighborhood enhancement component**: Based on the node embeddings, the potentially matching neighbors of each candidate user pair are identified, and dynamically adapted neighborhood embeddings are learned through a new graph neural network model.

### Formulas and technical details

1. **Feature embedding**:

   \[ z_\alpha = \text{ENC}_\alpha(x) = W_{\alpha 2}\tanh(W_{\alpha 1}x + b_{\alpha 1}) + b_{\alpha 2} \]

   where \(z_\alpha\) is the feature embedding and \(\text{ENC}_\alpha\) is the feature-specific encoder.

2. **Similarity metric**:

   \[ g_{ij}^\alpha = \text{sim}_\alpha(u_i, u_j) \]

   where \(g_{ij}^\alpha\) is the true similarity between users \(u_i\) and \(u_j\) on feature \(\alpha\).

3. **Reconstructed similarity**:

   \[ r_{ij}^\alpha = \text{DEC}_\alpha(z_i^\alpha, z_j^\alpha) \]

   where \(r_{ij}^\alpha\) is the reconstructed similarity and \(\text{DEC}_\alpha\) is the feature-specific decoder.

4. **Loss function**:

   \[ L_\alpha = \frac{1}{N_1 N_2}\sum_{u_i\in U_1}\sum_{u_j\in U_2}\ell_\alpha(r_{ij}^\alpha, g_{ij}^\alpha) \]

   where \(\ell_\alpha\) is the squared-loss function and \(N_1\), \(N_2\) are the numbers of users in \(U_1\) and \(U_2\).

5. **Total objective function**:

   \[ L_{\text{all}} = L_{\text{label}} + \sum_{\alpha\in\{s, p, c\}} L_\alpha \]

   where the sum runs over the structure (\(s\)), profile (\(p\)) and content (\(c\)) features.

6. **Neighborhood enhancement component**:
   - **Potentially matching neighbors**:

     \[ N_i^+ = \{u_n\in N_i \mid \text{Potentially matching}\} \]

   - **Neighborhood embedding**:

     \[ h_i^+ = \text{GCN}(N_i^+) = \frac{1}{|N_i^+|}\sum_{u_n\in N_i^+} z_n \]

   - **Total neighborhood**
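The formulas above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The encoder, the squared-loss average and the mean-pooled neighborhood embedding follow the equations as written. The summary does not specify the decoder or the rule for "potentially matching" neighbors, so the cosine-similarity decoder (`decode_cosine`) and the threshold rule (`potentially_matching`, with its `threshold` parameter) are assumptions introduced purely for illustration, as are all function names.

```python
import numpy as np


def encode(x, W1, b1, W2, b2):
    """Feature-specific encoder: z = W2 tanh(W1 x + b1) + b2 (one row per user)."""
    return np.tanh(x @ W1.T + b1) @ W2.T + b2


def decode_cosine(Z1, Z2, eps=1e-12):
    """ASSUMPTION: the decoder DEC is taken to be cosine similarity.

    Returns an (N1, N2) matrix of reconstructed similarities r_ij.
    """
    Z1n = Z1 / (np.linalg.norm(Z1, axis=1, keepdims=True) + eps)
    Z2n = Z2 / (np.linalg.norm(Z2, axis=1, keepdims=True) + eps)
    return Z1n @ Z2n.T


def reconstruction_loss(R, G):
    """L_alpha: squared loss averaged over all N1 * N2 user pairs."""
    return float(np.mean((R - G) ** 2))


def potentially_matching(Z1, Z2, nbrs_i, nbrs_j, threshold=0.5):
    """HYPOTHETICAL rule: keep a neighbor of u_i if its similarity to some
    neighbor of u_j exceeds `threshold`. The source does not define the rule.
    """
    if len(nbrs_i) == 0 or len(nbrs_j) == 0:
        return []
    S = decode_cosine(Z1[nbrs_i], Z2[nbrs_j])
    return [n for n, row in zip(nbrs_i, S) if row.max() > threshold]


def neighborhood_embedding(Z, neighbor_ids):
    """h_i^+: mean of the embeddings z_n over the selected neighbors."""
    if len(neighbor_ids) == 0:
        return np.zeros(Z.shape[1])
    return Z[list(neighbor_ids)].mean(axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data: 4 users in network 1, 5 users in network 2, 6 raw features each.
    X1, X2 = rng.normal(size=(4, 6)), rng.normal(size=(5, 6))
    W1, b1 = rng.normal(size=(8, 6)), np.zeros(8)
    W2, b2 = rng.normal(size=(3, 8)), np.zeros(3)

    Z1 = encode(X1, W1, b1, W2, b2)
    Z2 = encode(X2, W1, b1, W2, b2)
    R = decode_cosine(Z1, Z2)          # reconstructed similarities
    G = rng.uniform(size=R.shape)      # stand-in for the true similarities g_ij
    print("loss:", reconstruction_loss(R, G))

    kept = potentially_matching(Z1, Z2, [1, 2, 3], [0, 4])
    print("neighborhood embedding:", neighborhood_embedding(Z1, kept))
```

In this sketch the same encoder weights are shared across both networks for brevity. Whether INFUNE shares encoder parameters across networks is not stated in the summary above.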

A Novel Framework with Information Fusion and Neighborhood Enhancement for User Identity Linkage

FEUI: Fusion Embedding for User Identification Across Social Networks

User Identity Linkage Across Social Networks Via Linked Heterogeneous Network Embedding

Cross-Network Social User Embedding with Hybrid Differential Privacy Guarantees

Retrofitting Embeddings for Unsupervised User Identity Linkage

MFLink: User Identity Linkage Across Online Social Networks Via Multimodal Fusion and Adversarial Learning

DSANE: A Dual Structure-Aware Network Embedding Approach for User Identity Linkage

Embedding Based Cross-network User Identity Association Technology

DeepLink: A Deep Learning Approach for User Identity Linkage

User Identity Linkage Across Social Networks by Heterogeneous Graph Attention Network Modeling

TransLink: User Identity Linkage across Heterogeneous Social Networks via Translating Embeddings

CoLink: an Unsupervised Framework for User Identity Linkage

Unsupervised User Identity Linkage Via Graph Neural Networks

User Identity Linkage Via Graph Convolutional Network Across Location-Based Social Networks

A Practical Approach to Construct Profile Linkage Framework

PsyLink: User Identity Linkage via Psychological Characteristic Modeling

HFUL: a Hybrid Framework for User Account Linkage Across Location-Aware Social Networks

User Identity Linkage by Latent User Space Modelling

Strengthening Social Networks Analysis by Networks Fusion

A Unified Framework For Link Prediction Based On Non-Negative Matrix Factorization With Coupling Multivariate Information

DualLink: Dual Domain Adaptation for User Identity Linkage Across Social Networks