Abstract: Motivation: Epistatic miniarray profile (EMAP) studies have enabled the mapping of large-scale genetic interaction networks and generated large amounts of data in model organisms. They provide a rich set of molecular tools and advanced technologies that should help to efficiently elucidate the relationship between the genotypes and phenotypes of individuals. However, the network information gained from EMAP cannot be fully exploited using traditional statistical network models, because a genetic network is always heterogeneous: for example, the structural features of one subset of nodes differ from those of the remaining nodes. Exponential-family random graph models (ERGMs) are a family of statistical models that provide a principled and flexible way to describe the structural features (e.g., density, centrality, and assortativity) of an observed network. However, a single ERGM is not enough to capture this heterogeneity. In this paper, we consider mixture ERGM (MixtureERGM) networks, which model a network with several communities, where each community is described by a single ERGM. Results: The EM algorithm is a classical method for solving mixture problems; however, it becomes very slow when the data size is large, as it is in many applications. We adopt an efficient novel online graph clustering algorithm to classify the graph nodes and estimate the ERGM parameters for the MixtureERGM. In comparison studies, the MixtureERGM outperforms role analysis for network clustering, in which a mixture of exponential-family random graph models is developed for many ego-networks according to their roles. One genetic interaction network of yeast and two real social networks (provided as supplemental materials, which can be found on the Computer Society Digital Library at http://doi.ieeecomputersociety.org/10.1109/TCBB.2017.2743711) show the wide potential application of the MixtureERGM.
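
To illustrate why a single ERGM can miss heterogeneity, the sketch below fits the simplest possible ERGM, one with only an edge-count term, to a toy graph. For that special case the maximum-likelihood edge parameter is the log-odds of the edge density. This is not the MixtureERGM or the online clustering algorithm of the paper: the toy graph, the community labels (assumed known here), and the helper name `edge_theta` are all hypothetical and chosen only for illustration.

```python
import math


def edge_theta(nodes, edges):
    """MLE of the edge parameter of an edges-only ERGM on the induced subgraph.

    For an ERGM whose only statistic is the edge count, the model is a
    Bernoulli graph and the MLE is log(#edges / #non-edges).
    """
    nodes = set(nodes)
    n_pairs = len(nodes) * (len(nodes) - 1) // 2
    n_edges = sum(1 for u, v in edges if u in nodes and v in nodes)
    if n_edges == 0 or n_edges == n_pairs:
        raise ValueError("MLE is not finite for an empty or complete subgraph")
    return math.log(n_edges / (n_pairs - n_edges))


# Hypothetical toy graph: nodes 0-3 are densely connected, nodes 4-7 sparsely.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (4, 5), (6, 7), (3, 4)]
communities = {"A": [0, 1, 2, 3], "B": [4, 5, 6, 7]}  # assumed known labels

# One parameter per community versus one pooled parameter for the whole graph.
thetas = {name: edge_theta(members, edges) for name, members in communities.items()}
pooled = edge_theta(range(8), edges)

print(thetas, pooled)
```

On this toy graph the two per-community parameters have opposite signs, while the pooled parameter describes neither group well, which is the kind of heterogeneity a mixture of ERGMs is meant to capture. In the setting of the abstract the community labels are unknown and must be estimated together with the parameters.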

Optimal Clustering of Discrete Mixtures: Binomial, Poisson, Block Models, and Multi-layer Networks

Model-based clustering and classification using mixtures of multivariate skewed power exponential distributions

Universal Lower Bounds and Optimal Rates: Achieving Minimax Clustering Error in Sub-Exponential Mixture Models

Clustering multivariate count data via Dirichlet-multinomial network fusion

Discrimination universally determines reconstruction of multiplex networks

Optimal Clustering by Lloyd Algorithm for Low-Rank Mixture Model

Discrimination Reveals Reconstructability of Multiplex Networks from Partial Observations

Network clustering analysis using mixture exponential-family random graph models and its application in genetic interaction data.

Spectral clustering in the Gaussian mixture block model

Network Analysis of Count Data from Mixed Populations

Optimal Partition and Effective Dynamics of Complex Networks

Probabilistic Framework For Network Partition

Clustering Mixtures of Bounded Covariance Distributions Under Optimal Separation

Discrete Optimal Graph Clustering

Optimal Bayesian estimators for latent variable cluster models

Clustering Mixtures with Almost Optimal Separation in Polynomial Time

A Novel Probabilistic Clustering Model for Heterogeneous Networks

A Multivariate Poisson-Log Normal Mixture Model for Clustering Transcriptome Sequencing Data

Clustering of heterogeneous populations of networks

Exact Clustering in Tensor Block Model: Statistical Optimality and Computational Limit

The parsimonious Gaussian mixture models with partitioned parameters and their application in clustering