A New Preconditioner for the GeneRank Problem

Davod Khojasteh Salkuyeh, Vahid Edalatpour, Davod Hezari
DOI: https://doi.org/10.48550/arXiv.1403.3925
2014-03-16
Abstract: Identifying key genes involved in a particular disease is an important problem in biomedical research. The GeneRank model is based on the PageRank algorithm and preserves many of its mathematical properties. The model brings together gene expression information with a network structure and ranks genes based on the results of microarray experiments combined with gene expression information, for example from gene annotations (GO). In the present study, we present a new preconditioned conjugate gradient algorithm to solve the GeneRank problem and study its properties. Some numerical experiments are given to show the effectiveness of the suggested preconditioner.
Numerical Analysis
What problem does this paper attempt to address?
The paper addresses the following problem: in biomedical research, how to efficiently identify genes related to a specific disease. Specifically, the authors propose a new preconditioned conjugate gradient algorithm for solving the GeneRank problem and study its properties.

### Background and Problem Description

1. **Gene Prioritization Problem**:
   - In post-genomic medical research, identifying genes related to a specific disease is a major challenge. Doing so helps to clarify disease mechanisms and is usually the first step in finding treatments.
   - Modern techniques typically report hundreds or even thousands of genes related to a specific disease, so effective gene-disease prioritization methods are needed to single out the genes most likely to be involved.
2. **GeneRank Model**:
   - The GeneRank model is based on the PageRank algorithm. It combines gene expression information with a network structure and ranks genes using microarray experiment results together with other gene information, such as Gene Ontology (GO) annotations.
   - The goal of the model is to find, among a large number of candidate genes, those most relevant to a specific disease and to rank them at the top.
3. **Existing Methods and Their Limitations**:
   - Existing methods fall roughly into two categories: one mainly uses microarray expression data, and the other combines multiple data sources (such as sequence information and protein-protein interaction data).
   - These methods become computationally inefficient when solving large-scale linear systems, especially when \(\alpha\) is close to 1.

### The Solution in the Paper

To solve the GeneRank problem more efficiently, the authors propose a new preconditioner and apply it within the Conjugate Gradient (CG) algorithm. The specific steps are as follows:

1. **Constructing the Linear System**:
   - The GeneRank problem can be written as a large-scale nonsymmetric linear system: \[ (I - \alpha W D^{-1})x=(1 - \alpha)e_x \] where \(I\) is the identity matrix, \(\alpha\) is the damping factor (\(0 < \alpha < 1\)), \(W\) is the adjacency matrix of the gene network, \(D\) is a diagonal matrix (in GeneRank, built from the node degrees of the network), and \(e_x\) is the vector of the absolute values of gene expression changes.
2. **Preconditioner Design**:
   - With the change of variables \(\bar{x}=D^{-\frac{1}{2}}x\), the system takes the symmetric form \(S_\alpha\bar{x}=b_\alpha\), where \(S_\alpha = I - J_\alpha\), \(J_\alpha=\alpha D^{-\frac{1}{2}}W D^{-\frac{1}{2}}\), and \(b_\alpha=(1 - \alpha)D^{-\frac{1}{2}}e_x\).
   - The authors propose the new preconditioner \(M_\alpha = I + J_\alpha\). Multiplying the symmetric system by it gives the preconditioned system: \[ M_\alpha S_\alpha\bar{x}=M_\alpha b_\alpha \]
3. **Theoretical Analysis**:
   - The authors prove that \(M_\alpha S_\alpha\) is a symmetric positive definite M-matrix and that its spectral condition number is no larger than that of the original coefficient matrix \(S_\alpha\).
   - The preconditioned system is therefore at least as well conditioned as the original one, which is the basis for expecting faster convergence of the CG algorithm.
4. **Numerical Experiments**:
   - The effectiveness of the new preconditioner is checked in several numerical experiments. The results show that the new method outperforms the existing Jacobi preconditioner and the Chebyshev iterative method in both the number of iterations and CPU time.

### Conclusion

The paper proposes a new preconditioner for solving the GeneRank problem and supports its effectiveness with theoretical analysis and numerical experiments. The method improves computational efficiency and offers a more reliable tool for gene prioritization.
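The solution steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes that \(D\) holds the node degrees of \(W\), that \(W\) is symmetric, and that every gene has at least one link. The function names (`generank`, `cg`, `matvec`, `dot`) are hypothetical, and dense pure-Python lists are used only to keep the example dependency-free. Since \(M_\alpha\) and \(S_\alpha\) commute, \(M_\alpha S_\alpha = I - J_\alpha^2\), so plain CG can be applied directly to the preconditioned system.

```python
import math


def matvec(A, v):
    """Dense matrix-vector product for a list-of-lists matrix."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cg(apply_A, b, tol=1e-12, max_iter=1000):
    """Plain conjugate gradient for an SPD operator given as a function.

    Returns the approximate solution and the number of iterations used.
    """
    x = [0.0] * len(b)
    r = list(b)          # residual for the zero initial guess
    p = list(r)          # first search direction
    rs = dot(r, r)
    b_norm = math.sqrt(dot(b, b))
    for k in range(max_iter):
        if math.sqrt(rs) <= tol * b_norm:
            return x, k
        Ap = apply_A(p)
        step = rs / dot(p, Ap)
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x, max_iter


def generank(W, ex, alpha, precondition=True):
    """Solve (I - alpha W D^{-1}) x = (1 - alpha) ex via the symmetrized system.

    Assumption: D = diag(row sums of W), all row sums positive.
    """
    d = [1.0 / math.sqrt(sum(row)) for row in W]   # diagonal of D^{-1/2}

    def J(v):  # J_alpha v = alpha D^{-1/2} W D^{-1/2} v
        w = matvec(W, [di * vi for di, vi in zip(d, v)])
        return [alpha * di * wi for di, wi in zip(d, w)]

    def S(v):  # S_alpha v = (I - J_alpha) v
        return [vi - ji for vi, ji in zip(v, J(v))]

    def M(v):  # M_alpha v = (I + J_alpha) v
        return [vi + ji for vi, ji in zip(v, J(v))]

    b = [(1.0 - alpha) * di * ei for di, ei in zip(d, ex)]  # b_alpha
    if precondition:
        x_bar, its = cg(lambda v: M(S(v)), M(b))   # M S x_bar = M b
    else:
        x_bar, its = cg(S, b)                      # S x_bar = b
    x = [xb / di for xb, di in zip(x_bar, d)]      # x = D^{1/2} x_bar
    return x, its
```

For example, `generank(W, ex, alpha=0.85)` returns the score vector and the CG iteration count, while `precondition=False` solves the unpreconditioned symmetrized system for comparison. For realistic gene networks a sparse matrix representation would be the natural choice; the dense lists here are only meant to make the algebra visible.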