Interactive molecular causal networks of hypertension using a fast machine learning algorithm MRdualPC

Jack Kelly,Xiaoguang Xu,James M Eales,Bernard Keavney,Carlo Berzuini,Maciej Tomaszewski,Hui Guo
DOI: https://doi.org/10.1186/s12874-024-02229-y
2024-08-02
Abstract:Background: Understanding the complex interactions between genes and their causal effects on diseases is crucial for developing targeted treatments and gaining insight into biological mechanisms. However, the analysis of molecular networks, especially in the context of high-dimensional data, presents significant challenges. Methods: This study introduces MRdualPC, a computationally tractable algorithm based on the MRPC approach, to infer large-scale causal molecular networks. We apply MRdualPC to investigate the upstream causal transcriptomics influencing hypertension using a comprehensive dataset of kidney genome and transcriptome data. Results: Our algorithm proves to be 100 times faster than MRPC on average in identifying transcriptomics drivers of hypertension. Through clustering, we identify 63 modules with causal driver genes, including 17 modules with extensive causal networks. Notably, we find that genes within one of the causal networks are associated with the electron transport chain and oxidative phosphorylation, previously linked to hypertension. Moreover, the identified causal ancestor genes show an over-representation of blood pressure-related genes. Conclusions: MRdualPC has the potential for broader applications beyond gene expression data, including multi-omics integration. While there are limitations, such as the need for clustering in large gene expression datasets, our study represents a significant advancement in building causal molecular networks, offering researchers a valuable tool for analyzing big data and investigating complex diseases.
What problem does this paper attempt to address?