Identification of candidate biomarkers and molecular networks associated with Pulmonary Arterial Hypertension using machine learning and plasma multi-Omics analysis

Vitaly Oleg Kheyfets, Andrew J Sweatt, Hui Zhang, Travis Nemkov, Paul Heerdt, Monika Dzieciatkowska, Daniel Stephenson, Ian S LaCroix, Angelo D'Alessandro, William M Oldham, Kirk Hansen, Roham T Zamanian, Kurt R Stenmark
DOI: https://doi.org/10.1101/2024.12.17.24319117
2024-12-20
Abstract:
Background: Pulmonary arterial hypertension (PAH) is a rare but severe and life-threatening condition that primarily affects the pulmonary blood vessels and the right ventricle of the heart. The limited availability of human tissue for research, most of which represents only end-stage disease, has led to a reliance on preclinical animal models. However, these models often fail to capture the heterogeneity and complexity of the human condition. Analyzing the molecular signatures in patient plasma provides a unique opportunity to gain insights into PAH pathobiology, explore disease heterogeneity absent in animal models, and identify potential therapeutic targets.
Objective: This study aims to characterize the circulating peptides, metabolites, and lipids most relevant to PAH by leveraging unbiased mass spectrometry and advanced computational tools. Building on prior research that identified individual circulating factors, this work seeks to integrate these molecular layers to better understand their interactions and collective contribution to PAH pathobiology.
Methods: Peripheral blood samples were collected from 402 patients with PAH and 76 healthy individuals. Various types of molecules in the blood (peptides, metabolites, and lipids) were measured. Statistical and machine learning methods were used to identify differences between PAH patients and healthy individuals, and further to understand how these molecules might interact with each other. A survival model was also trained to examine the association between the blood molecular signature and patient outcomes.
Results: Differential abundance analysis revealed 832 peptides (from 291 proteins), 45 metabolites, and 222 lipids significantly altered in PAH compared to controls. Machine learning-based feature selection identified 11 key molecules, including 2-hydroxyglutarate, that together achieved a classification accuracy of 98.6% for PAH in a multivariate model tested on a withheld cohort. Latent network discovery uncovered 7 distinct networks, highlighting interacting molecules from pathways (such as hypoxia, glycolysis, fatty acid metabolism, and complement activation) that we and others have previously linked to vascular lesions in PAH patients. A survival model incorporating 155 molecular features predicted outcomes in PAH patients with a c-index of 0.762, independent of traditional clinical parameters. This model stratified patients into risk categories consistent with established markers of cardiac function, exercise tolerance, and the REVEAL 2.0 risk score.
Conclusion: This study underscores the utility of integrated omics in unraveling PAH pathobiology in human subjects. Our findings highlight the central role of hypoxia signaling pathways interacting with disrupted fatty acid metabolism, complement activation, inflammation, and mitochondrial dysfunction. These interactions, revealed through latent network analysis, emphasize the metabolic and immune dysregulation underlying PAH. Furthermore, many of the molecules identified in the circulation were consistent with pathways enriched in pulmonary vascular lesions, reinforcing their biological relevance. Circulating plasma molecules from these networks demonstrated strong prognostic capabilities, comparable to current clinical risk scores, offering insights into disease progression and potential for future clinical application.
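For readers unfamiliar with the c-index reported for the survival model, the following is a minimal sketch of how Harrell's concordance index can be computed from follow-up times, event indicators, and model risk scores. It is not the authors' implementation: the function name `concordance_index`, the toy data, and the simplified handling of tied times (such pairs are skipped) are all assumptions made for illustration. A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable patient pairs ranked correctly by risk.

    times       -- follow-up time per patient
    events      -- 1 if the outcome was observed, 0 if censored
    risk_scores -- model output; higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter follow-up.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only if the shorter follow-up ended in an
        # observed event. Tied times are skipped (a simplification).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event had the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort of five patients (illustrative values only).
toy_times = [5, 8, 12, 20, 24]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.7, 0.3, 0.1]
print(concordance_index(toy_times, toy_events, toy_risk))
```

In this toy cohort, 7 of the 8 comparable pairs are ordered correctly, so the function returns 0.875. Established libraries offer tested implementations with fuller tie handling, which would be preferable in practice.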
Cardiovascular Medicine