Abstract:

Background: The recent explosion in biological and other real-world network data has created the need for improved tools for large network analyses. In addition to well-established global network properties, several new mathematical techniques for analyzing local structural properties of large networks have been developed. Small over-represented subgraphs, called network motifs, have been introduced to identify simple building blocks of complex networks. Small induced subgraphs, called graphlets, have been used to develop "network signatures" that summarize network topologies. Based on these network signatures, two new highly sensitive measures of local structural similarity between networks were designed: the relative graphlet frequency distance (RGF-distance) and the graphlet degree distribution agreement (GDD-agreement). Finding adequate null models for biological networks is important in many research domains, and network properties are used to assess how well a network model fits the data. Various network models have been proposed. To date, no software tool measures the above-mentioned local network properties. Moreover, no existing tool compares real-world networks against a series of network models with respect to these local properties as well as a multitude of global network properties.

Results: We introduce GraphCrunch, a software tool that finds well-fitting network models by comparing large real-world networks against random graph models according to various measures of network structural similarity. It is unique in computing the computationally expensive RGF-distance and GDD-agreement measures. It also computes several standard global network measures and therefore supports the largest variety of network measures to date. In addition, it is the first software tool that compares real-world networks against a series of network models and that has built-in parallel computing capabilities, which let the user specify a list of machines on which to run compute-intensive searches for local network properties. Furthermore, GraphCrunch is easily extendible to include additional network measures and models.

Conclusion: GraphCrunch is a software tool that implements the latest research on biological network models and properties: it compares real-world networks against a series of random graph models with respect to a multitude of local and global network properties. We present GraphCrunch as a comprehensive, parallelizable, and easily extendible software tool for analyzing and modeling large biological networks. The software is open-source and freely available at http://www.ics.uci.edu/~bio-nets/graphcrunch/. It runs under Linux, MacOS, and Windows Cygwin. In addition, it has an easy-to-use online web user interface that is available from the above web page.
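The comparison workflow summarized above (count small induced subgraphs in a data network, do the same for random model networks, then score how far apart the counts are) can be illustrated with a deliberately small sketch. This is not GraphCrunch's code or interface. It counts only the two 3-node graphlets, uses a uniform random graph with the same number of nodes and edges as the only model, and scores the difference with a negative-log relative-frequency distance in the spirit of the RGF-distance. The abstract does not give the exact definition, so the formula, the add-one smoothing, the toy edge list, and all function names here are assumptions made for illustration.

```python
import itertools
import math
import random


def build_adjacency(n, edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def count_three_node_graphlets(adj):
    """Return (induced 2-edge paths, triangles) for an undirected graph."""
    paths = 0
    triangle_corners = 0
    for v, nbrs in adj.items():
        for a, b in itertools.combinations(sorted(nbrs), 2):
            if b in adj[a]:
                triangle_corners += 1  # each triangle is seen once per corner
            else:
                paths += 1  # induced path with v as the middle node
    return paths, triangle_corners // 3


def relative_frequency_distance(counts_g, counts_h):
    """Sum of absolute differences of -log relative graphlet frequencies.

    Add-one smoothing avoids log(0); it is an assumption of this sketch.
    """
    def scores(counts):
        total = sum(counts) + len(counts)
        return [-math.log((c + 1) / total) for c in counts]

    return sum(abs(x - y) for x, y in zip(scores(counts_g), scores(counts_h)))


def random_graph(n, m, rng):
    """Uniform random graph with n nodes and exactly m edges."""
    all_pairs = list(itertools.combinations(range(n), 2))
    return build_adjacency(n, rng.sample(all_pairs, m))


# Hypothetical toy "data" network: two triangles joined by one edge.
toy_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
data = build_adjacency(6, toy_edges)
data_counts = count_three_node_graphlets(data)

# Compare against several random model instances and average the distance.
rng = random.Random(0)
distances = [
    relative_frequency_distance(
        data_counts, count_three_node_graphlets(random_graph(6, 7, rng))
    )
    for _ in range(10)
]
print("data graphlet counts (paths, triangles):", data_counts)
print("mean distance to random model:", sum(distances) / len(distances))
```

A smaller mean distance would suggest that the model reproduces the local structure of the data more closely. A realistic analysis would use larger graphlets, several model families, and many more random instances, which is where the cost that motivates parallel execution comes from.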

GraphCrunch: a tool for large network analyses