TRENDY: Gene Regulatory Network Inference Enhanced by Transformer

Xueying Tian, Yash Patel, Yue Wang
2024-10-14
Abstract: Gene regulatory networks (GRNs) play a crucial role in the control of cellular functions. Numerous methods have been developed to infer GRNs from gene expression data, including mechanism-based approaches, information-based approaches, and more recent deep learning techniques, the last of which often overlooks the underlying gene expression mechanisms. In this work, we introduce TRENDY, a novel GRN inference method that integrates transformer models to enhance the mechanism-based WENDY approach. Through testing on both simulated and experimental datasets, TRENDY demonstrates superior performance compared to existing methods. Furthermore, we apply this transformer-based approach to three additional inference methods, showcasing its broad potential to enhance GRN inference.
Molecular Networks
### What problem does this paper attempt to solve?

This paper addresses the challenge of inferring gene regulatory networks (GRNs). Specifically, the authors propose a new method named TRENDY, which combines the Transformer model with the mechanism-based WENDY method to improve the accuracy of inferring GRNs from gene expression data.

#### Main problems:

1. **Lack of experimentally verified GRN data**: It is very difficult to determine the GRN structure directly through experiments, so effective computational methods are needed to infer GRNs.
2. **Limitations of existing methods**:
   - **Mechanism-based methods**, such as WENDY and NonlinearODEs: although they consider the dynamic mechanisms of gene expression, their performance is limited when dealing with complex data.
   - **Information-based methods**, such as GENIE3 and SINCERITIES: they rely on the information association between genes but ignore biological mechanisms, resulting in poor interpretability.
   - **Deep learning methods**: although they perform well on many tasks, their large number of parameters makes it difficult to explain how they work. They are usually used as black boxes and lack integration with biological mechanisms.

#### Solutions:

- **TRENDY method**: Enhance the existing mechanism-based GRN inference method WENDY by introducing the Transformer model. The specific steps are as follows:
  1. Use the Transformer model to generate a pseudo-covariance matrix that is closer to the real covariance matrix.
  2. Use the Transformer model again to directly enhance the inferred GRN.

In addition, the authors apply the Transformer model to three other GRN inference methods (GENIE3, SINCERITIES, NonlinearODEs), further demonstrating the broad potential of the Transformer model for improving GRN inference performance.
#### Data generation and training:

To train the Transformer model, the authors use a nonlinear and stochastic system to generate a synthetic dataset, ensuring that there is sufficient data from different GRNs for training. The specific formula is:

\[ dX_j(t) = V \left( \beta \prod_{i = 1}^{n} \left[ 1 + (A_{\text{true}})_{i,j} \frac{X_i(t)}{X_i(t) + 1} \right] - \theta X_j(t) \right) dt + \sigma X_j(t) \, dW_j(t) \]

where \( A_{\text{true}} \) is a randomly generated true GRN matrix, \( X_i(t) \) is the expression level of gene \( i \) at time \( t \), and \( W_j(t) \) is a standard Brownian motion.

#### Experimental results:

In tests on two simulated datasets and two experimental datasets, TRENDY and the Transformer-enhanced versions of the other methods (GENIE3-rev, SINCERITIES-rev, NonlinearODEs-rev) all show better performance than the existing methods, particularly in terms of the AUROC and AUPRC metrics.

In summary, this paper proposes a new GRN inference method, TRENDY, that combines deep learning with biological mechanisms and improves the accuracy and interpretability of inference.
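To make the data-generating equation in the "Data generation and training" part concrete, the following is a minimal sketch of how such a system could be simulated with the Euler-Maruyama scheme. It is not the authors' code. The function name `simulate_expression`, the parameter values for \( V, \beta, \theta, \sigma \), the uniform initial condition, the step size, the clipping at zero, and the example matrix with entries in {-1, 0, 1} are all assumptions chosen for illustration; the summary above does not specify any of them.

```python
import numpy as np


def simulate_expression(A_true, n_cells=100, t_end=1.0, dt=0.01,
                        V=30.0, beta=1.0, theta=0.2, sigma=0.1, seed=0):
    """Euler-Maruyama simulation of
        dX_j = V * (beta * prod_i [1 + A[i, j] * X_i / (X_i + 1)] - theta * X_j) dt
               + sigma * X_j dW_j
    for n_cells independent cells. Returns an array of shape (n_cells, n_genes)
    holding the expression levels at time t_end.
    All default parameter values are illustrative placeholders (assumptions).
    """
    rng = np.random.default_rng(seed)
    A = np.asarray(A_true, dtype=float)
    n_genes = A.shape[0]
    # Assumed initial condition: independent uniform expression levels in [0, 1).
    X = rng.uniform(0.0, 1.0, size=(n_cells, n_genes))
    n_steps = int(round(t_end / dt))
    for _ in range(n_steps):
        saturation = X / (X + 1.0)                       # shape (cells, genes)
        # factors[c, i, j] = 1 + A[i, j] * X_i / (X_i + 1) for cell c
        factors = 1.0 + saturation[:, :, None] * A[None, :, :]
        production = beta * factors.prod(axis=1)         # product over regulators i
        drift = V * (production - theta * X)
        dW = rng.normal(0.0, np.sqrt(dt), size=X.shape)  # Brownian increments
        X = X + drift * dt + sigma * X * dW
        # Numerical safeguard (assumption): keep expression levels non-negative.
        X = np.maximum(X, 0.0)
    return X


# Hypothetical 3-gene network: A[i, j] is the effect of gene i on gene j.
A_example = np.array([[0.0, 1.0, 0.0],
                      [0.0, 0.0, -1.0],
                      [1.0, 0.0, 0.0]])
snapshot = simulate_expression(A_example, n_cells=200, t_end=1.0)
print(snapshot.shape)         # one row per simulated cell, one column per gene
print(snapshot.mean(axis=0))  # average expression level of each gene
```

Calling the function with different values of `t_end` gives expression snapshots at different time points, which is the kind of input from which covariance matrices could then be estimated. The explicit Euler-Maruyama scheme is used here only because it is the simplest choice; a smaller `dt` or a different integrator may be needed for other parameter values.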