Network Enhancement: a general method to denoise weighted biological networks

Bo Wang, Armin Pourshafeie, Marinka Zitnik, Junjie Zhu, Carlos D. Bustamante, Serafim Batzoglou, Jure Leskovec
DOI: https://doi.org/10.1038/s41467-018-05469-x
2018-06-02
Abstract: Networks are ubiquitous in biology where they encode connectivity patterns at all scales of organization, from molecular to the biome. However, biological networks are noisy due to the limitations of measurement technology and inherent natural variation, which can hamper discovery of network patterns and dynamics. We propose Network Enhancement (NE), a method for improving the signal-to-noise ratio of undirected, weighted networks. NE uses a doubly stochastic matrix operator that induces sparsity and provides a closed-form solution that increases spectral eigengap of the input network. As a result, NE removes weak edges, enhances real connections, and leads to better downstream performance. Experiments show that NE improves gene function prediction by denoising tissue-specific interaction networks, alleviates interpretation of noisy Hi-C contact maps from the human genome, and boosts fine-grained identification accuracy of species. Our results indicate that NE is widely applicable for denoising biological networks.
Molecular Networks, Machine Learning, Social and Information Networks
What problem does this paper attempt to address?
This paper addresses the pervasive noise in biological networks. Because of the limitations of measurement technology and inherent natural variation, biological network data are often noisy, which can impede the discovery of network patterns and dynamics. Specifically, technical and biological noise may produce spuriously strong edges (representing non-existent interactions) or obscure real but weak edges that represent important biological connections. Noise can also alter edge strengths within and between the underlying biological pathways, degrading the performance of downstream analysis. These problems are not limited to protein-protein interaction (PPI) networks; they also affect many other types of biological networks, such as Hi-C contact maps and cell-cell interaction networks.
To address this challenge, the authors propose "Network Enhancement" (NE), a diffusion-based algorithm for reducing noise in undirected, weighted networks. NE defines a diffusion process using random walks and regularized information flow, with the aim of removing weak edges and enhancing real connections, thereby improving performance on downstream tasks. The key observation behind NE is that nodes connected through paths of high-weight edges are more likely to be directly connected by a high-weight edge. Consequently, in the network produced by NE, nodes with strong similarity or interaction are connected by high-weight edges, while nodes with weak similarity or interaction are connected by low-weight edges. Mathematically, this means that the eigenvectors associated with the input network are preserved while the eigengap is increased.
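The diffusion idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the update rule `W <- alpha * T @ W @ T + (1 - alpha) * T`, the parameter `alpha`, the function names `doubly_stochastic` and `diffuse`, and the toy two-module network are all assumptions made for illustration. The normalization shown is one simple construction that yields a symmetric doubly stochastic matrix; whether it matches the paper's exact operator is also an assumption.

```python
import numpy as np

def doubly_stochastic(W):
    """Map a symmetric, non-negative weight matrix to a symmetric
    doubly stochastic matrix T = P diag(1/col) P^T (illustrative choice)."""
    P = W / W.sum(axis=1, keepdims=True)   # row-normalized transition matrix
    col = P.sum(axis=0)                    # column sums of P
    return (P / col) @ P.T                 # rows and columns of T sum to 1

def diffuse(T, alpha=0.9, n_iter=200):
    """Iterate the assumed regularized diffusion update.
    alpha is a hypothetical regularization parameter in (0, 1)."""
    W = T.copy()
    for _ in range(n_iter):
        W = alpha * T @ W @ T + (1 - alpha) * T
    return W

# Toy input: two modules of three nodes each, plus symmetric noise.
rng = np.random.default_rng(0)
n = 6
A = np.zeros((n, n))
A[:3, :3] = 1.0
A[3:, 3:] = 1.0
noise = rng.uniform(0.0, 0.3, size=(n, n))
A = A + (noise + noise.T) / 2              # keep the network undirected
np.fill_diagonal(A, 0.0)

T = doubly_stochastic(A)
W_enh = diffuse(T)
```

Under these assumptions the iterate stays symmetric and doubly stochastic at every step, so `W_enh` can be compared with `T` entry by entry to see how within-module and between-module weights change.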
NE down-weights small eigenvalues more aggressively than large ones, and this re-weighting is advantageous when the noise is spread across the eigen-directions corresponding to small eigenvalues. In addition, the increased eigengap of the enhanced network is an attractive property because it helps to detect modules/clusters accurately and supports higher-order network analysis. NE also provides an effective and easy-to-implement closed-form solution for the diffusion process, together with a mathematical guarantee of convergence.
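The eigenvalue re-weighting can be made concrete with a short sketch. It assumes a diffusion update of the form `W <- alpha * T @ W @ T + (1 - alpha) * T` on a symmetric doubly stochastic matrix `T`; this form and the parameter `alpha` are assumptions, not stated in this summary. If the eigenvectors are preserved, the fixed point of that update maps each eigenvalue `lam` of `T` to `(1 - alpha) * lam / (1 - alpha * lam**2)`. The function name `enhanced_eigenvalue` and the sample eigenvalues are hypothetical.

```python
import numpy as np

def enhanced_eigenvalue(lam, alpha=0.9):
    """Fixed point of w = alpha * lam**2 * w + (1 - alpha) * lam,
    i.e. the assumed update applied along one eigen-direction."""
    return (1 - alpha) * lam / (1 - alpha * lam**2)

# Hypothetical spectrum; the leading eigenvalue of a doubly stochastic matrix is 1.
lams = np.array([1.0, 0.8, 0.5, 0.1])
new_lams = enhanced_eigenvalue(lams)

# Shrink factor per eigenvalue: smaller eigenvalues are shrunk more.
shrink = new_lams / lams

# Eigengap between the two leading eigenvalues, before and after.
gap_before = lams[0] - lams[1]
gap_after = new_lams[0] - new_lams[1]
```

In this sketch the leading eigenvalue stays at 1 while every eigenvalue below 1 shrinks, and the shrink factor decreases as the eigenvalue gets smaller, which is the behaviour the paragraph above describes: a larger eigengap with the eigenvectors unchanged.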