New methods to generate massive synthetic networks

Malay Chakrabarti, Lenwood Heath, Naren Ramakrishnan
DOI: https://doi.org/10.48550/arXiv.1705.08473
2017-05-24
Abstract: One of the biggest needs in network science research is access to large, realistic datasets. As data analytics methods permeate a range of diverse disciplines (e.g., computational epidemiology, sustainability, social media analytics, biology, and transportation), network datasets that exhibit the characteristics encountered in each of these disciplines become paramount. The key technical issue is generating synthetic topologies with pre-specified, arbitrary degree distributions. Existing methods are limited in their ability to faithfully reproduce macro-level characteristics of networks while at the same time respecting particular degree distributions. We present a suite of three algorithms that exploit the principle of residual degree attenuation to generate synthetic topologies that adhere to macro-level real-world characteristics. By evaluating these algorithms against several real-world datasets, we demonstrate their ability to faithfully reproduce network characteristics such as node degree, clustering coefficient, hop length, and k-core structure distributions.
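To make the notion of a pre-specified degree distribution concrete, here is a minimal Python sketch that builds a simple graph from a given degree sequence. It is not one of the paper's three algorithms, which the abstract does not specify. It is the classic Havel-Hakimi construction, swapped in purely for illustration because it also tracks how much degree each node still has left to fill. The function name `havel_hakimi_graph` and the `residual` list are assumptions of this sketch.

```python
def havel_hakimi_graph(degrees):
    """Return a set of edges (u, v), u < v, realizing `degrees` as a simple graph.

    Illustrative only: classic Havel-Hakimi, NOT the paper's method.
    Assumes `degrees` holds non-negative integers.
    Raises ValueError if the sequence cannot be realized (is not graphical).
    """
    residual = list(degrees)  # degree each node still has to fill
    edges = set()
    if not residual:
        return edges
    while True:
        # Take the node with the largest remaining degree.
        u = max(range(len(residual)), key=lambda i: residual[i])
        d = residual[u]
        if d == 0:
            return edges  # every node's degree is satisfied
        # Other nodes that still need edges, largest remaining degree first.
        others = sorted(
            (i for i in range(len(residual)) if i != u and residual[i] > 0),
            key=lambda i: -residual[i],
        )
        if len(others) < d:
            raise ValueError("degree sequence is not graphical")
        residual[u] = 0
        for v in others[:d]:
            residual[v] -= 1
            edges.add((min(u, v), max(u, v)))


if __name__ == "__main__":
    target = [3, 3, 2, 2, 2]
    graph = havel_hakimi_graph(target)
    realized = [sum(1 for e in graph if n in e) for n in range(len(target))]
    print(sorted(graph))
    print(realized == target)
```

A construction like this matches the degree sequence exactly but makes no attempt to reproduce clustering, hop length, or k-core structure, which is the gap the abstract says existing methods leave open.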
Social and Information Networks