Diversity, Topology, and the Risk of Node Re-identification in Labeled Social Graphs

Sameera Horawalavithana, Clayton Gandy, Juan Arroyo Flores, John Skvoretz, Adriana Iamnitchi
DOI: https://doi.org/10.48550/arXiv.1808.10837
2018-08-31
Social and Information Networks
Abstract: Real network datasets provide significant benefits for understanding phenomena such as information diffusion or network evolution. Yet the privacy risks raised by sharing real graph datasets, even when stripped of user identity information, are significant. When nodes have associated attributes, the privacy risks increase. In this paper we quantitatively study the impact of binary node attributes on node privacy by employing machine-learning-based re-identification attacks and exploring the interplay between graph topology and attribute placement. Our experiments show that the population's diversity on the binary attribute consistently degrades anonymity.