Budget-Constrained Ego Network Extraction with Maximized Willingness

Bay-Yuan Hsu, Chia-Hsun Lu, Ming-Yi Chang, Chih-Ying Tseng, Chih-Ya Shen
DOI: https://doi.org/10.1109/tkde.2024.3446169
IF: 9.235
2024-01-01
IEEE Transactions on Knowledge and Data Engineering
Abstract: Many large-scale machine learning approaches and graph algorithms have recently been proposed to address a variety of problems in online social networks (OSNs). To evaluate and validate these algorithms and models, data from ego-centric networks (ego networks) are widely adopted. Effectively extracting large-scale ego networks from OSNs has therefore become an important issue, particularly as privacy policies grow increasingly strict. In this paper, we study the problem of extracting ego network data by jointly considering user willingness, crawling cost, and network structure. We formulate a new research problem, named Structure and Willingness Aware Ego Network Extraction (SWAN), and analyze its NP-hardness. We first propose a (1 − 1/e)-approximation algorithm, named Tristar-Optimized Ego Network Identification with Maximum Willingness (TOMW). In addition to this deterministic approximation algorithm, we propose to automatically learn an effective heuristic with machine learning, avoiding the substantial human effort needed to devise a good algorithm by hand. The learning approach is named Willingness-maximized and Structure-aware Ego Network Extraction with Reinforcement Learning (WSRL), in which we propose a novel contrastive learning strategy, named Contrastive Learning with Performance-boosting Graph Augmentation. We recruited 1,810 real-world participants and conducted an evaluation study to validate our problem formulation and proposed approaches. Moreover, experimental results on real social network datasets show that the proposed approaches significantly outperform the baselines.