Uncovering critical transitions and molecule mechanisms in disease progressions using Gaussian graphical optimal transport

Wenbo Hua, Ruixia Cui, Heran Yang, Jingyao Zhang, Chang Liu, Jian Sun
DOI: https://doi.org/10.1101/2024.04.24.590914
2024-04-28
Abstract: Understanding disease progression is crucial for detecting critical transitions and finding trigger molecules, facilitating early diagnosis and intervention. However, the high dimensionality of data and the lack of aligned samples across disease stages have posed challenges in addressing these tasks. We present a novel framework, Gaussian Graphical Optimal Transport (GGOT), for analyzing disease progressions. The proposed GGOT uses Gaussian graphical models, incorporating protein interaction networks, to characterize the data distributions at different disease stages. Then we use population-level optimal transport to calculate the Wasserstein distances and transport maps between stages, enabling us to detect critical transitions. By analyzing the per-molecule transport distance, we quantify the importance of each molecule and identify trigger molecules. Moreover, GGOT predicts the occurrence of critical transitions in unseen samples and visualizes the disease progression process. We apply GGOT to a simulation dataset and six disease datasets with varying disease progression rates, to show its effectiveness for detecting critical transitions and identifying trigger molecules.
Bioinformatics
What problem does this paper attempt to address?
### Problems the Paper Aims to Solve

This paper aims to detect critical transitions during disease progression and to identify the trigger molecules behind them. Specifically:

1. **Dynamic Analysis of Disease Progression**:
   - Disease progression is a dynamic process involving multiple pathological stages.
   - Critical transitions are sudden, irreversible deterioration points in the disease state.
2. **Challenges and Difficulties**:
   - The data are high-dimensional, and samples from different disease stages are difficult to align.
   - Patient states may change little near the critical point, so transitions are sudden and hard to detect.
   - Noise, patient heterogeneity, sample imbalance, and model inaccuracy all affect the reliable detection of critical transitions.
   - Disease progression is driven by numerous molecules, with gene networks exhibiting complex nonlinear behavior.
3. **Proposed Method**:
   - A new framework, Gaussian Graphical Optimal Transport (GGOT), is proposed for analyzing disease progression.
   - GGOT uses a Gaussian graphical model that incorporates a protein-protein interaction (PPI) network to describe the data distribution at each disease stage.
   - It employs population-level optimal transport to compute the Wasserstein distance and the transport map between stages, which is used to detect critical transitions.
   - By analyzing the transport distance of each molecule, it quantifies the importance of each molecule and identifies trigger molecules.
   - GGOT can also predict the occurrence of critical transitions in unseen samples and visualize the disease progression process.

In this way, the paper addresses limitations of existing methods in handling complex disease data, such as sample imbalance and noise, and validates GGOT on simulated data and six real disease datasets.
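To make the optimal-transport step more concrete, the sketch below uses the standard closed-form 2-Wasserstein distance and transport map between two Gaussian distributions. It is an illustration only: the summary does not state that GGOT uses exactly these formulas, the estimation of the Gaussian graphical model with the PPI network is not shown, and all function names (`sqrtm_psd`, `gaussian_w2_squared`, `gaussian_transport_map`, `per_coordinate_cost`) are hypothetical. The per-coordinate split of the transport cost is one plausible reading of "per-molecule transport distance", not necessarily the authors' definition.

```python
import numpy as np


def sqrtm_psd(mat):
    """Square root of a symmetric positive semi-definite matrix."""
    mat = (mat + mat.T) / 2.0             # remove numerical asymmetry
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)       # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T


def gaussian_w2_squared(mean_a, cov_a, mean_b, cov_b):
    """Squared 2-Wasserstein distance between N(mean_a, cov_a) and N(mean_b, cov_b)."""
    root_a = sqrtm_psd(cov_a)
    cross = sqrtm_psd(root_a @ cov_b @ root_a)
    mean_term = np.sum((mean_a - mean_b) ** 2)
    cov_term = np.trace(cov_a + cov_b - 2.0 * cross)
    return float(mean_term + cov_term)


def gaussian_transport_map(mean_a, cov_a, mean_b, cov_b):
    """Return (A, b) such that T(x) = A @ x + b pushes stage a onto stage b.

    Assumes cov_a is positive definite so that its square root is invertible.
    """
    root_a = sqrtm_psd(cov_a)
    inv_root_a = np.linalg.inv(root_a)
    A = inv_root_a @ sqrtm_psd(root_a @ cov_b @ root_a) @ inv_root_a
    b = mean_b - A @ mean_a
    return A, b


def per_coordinate_cost(mean_a, cov_a, mean_b, cov_b):
    """Expected squared displacement E[(T(x)_i - x_i)^2] for each coordinate i.

    Illustrative assumption: each coordinate stands for one molecule, and the
    entries sum to the squared 2-Wasserstein distance.
    """
    A, _ = gaussian_transport_map(mean_a, cov_a, mean_b, cov_b)
    shift = A - np.eye(len(mean_a))
    return (mean_b - mean_a) ** 2 + np.diag(shift @ cov_a @ shift.T)


# Toy example with two "molecules" and two stages (made-up numbers).
stage1_mean, stage1_cov = np.array([0.0, 0.0]), np.diag([1.0, 4.0])
stage2_mean, stage2_cov = np.array([3.0, 0.0]), np.diag([4.0, 1.0])

total = gaussian_w2_squared(stage1_mean, stage1_cov, stage2_mean, stage2_cov)
costs = per_coordinate_cost(stage1_mean, stage1_cov, stage2_mean, stage2_cov)
print(total, costs)
```

In this toy example the first coordinate accounts for most of the total cost because its mean shifts between stages, so under this reading it would rank as the more important molecule. A stage-to-stage distance that is unusually large relative to its neighbours would be the kind of signal used to flag a candidate critical transition.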