Machine learning analysis of population-wide plasma proteins identifies hormonal biomarkers of Parkinson's Disease

Fayzan Chaudhry, Doron Betel, Olivier Elemento, Tae Wan Kim
DOI: https://doi.org/10.1101/2024.12.21.24313256
2024-12-24
Abstract: As the number of Parkinson's disease (PD) patients is expected to increase with the growth of the aging population, there is a growing need to identify new diagnostic markers that can be used cheaply and routinely to monitor the population, stratify patients toward treatment paths, and provide new therapeutic leads. Genetic predisposition and familial forms account for only around 10% of PD cases [1], leaving a large fraction of the population with few effective markers for identifying high-risk individuals. The establishment of population-wide omics and longitudinal health monitoring studies provides an opportunity to apply machine learning approaches to these unbiased cohorts to identify novel PD markers. Here we present the application of three machine learning models to identify plasma protein biomarkers of PD, using plasma proteomics measurements from 43,408 UK Biobank subjects as the training and test set and an additional 103 samples from the Parkinson's Progression Markers Initiative (PPMI) as external validation. We identified a group of highly predictive plasma protein markers, including known markers such as DDC and CALB2 as well as new markers involved in the JAK-STAT and PI3K-AKT pathways and in hormonal signaling. We further demonstrate that these features correlate well with UPDRS severity scores, and we stratify them into protective and adversarial features that potentially contribute to the pathogenesis of PD.