Identification of Putative Early Atherosclerosis Biomarkers by Unsupervised Deconvolution of Heterogeneous Vascular Proteomes.

Sarah J. Parker, Lulu Chen, Weston Spivia, Georgia Saylor, Chunhong Mao, Vidya Venkatraman, Ronald J. Holewinski, Mitra Mastali, Rakhi Pandey, Grace Athas, Guoqiang Yu, Qin Fu, Dana Troxlair, Richard Vander Heide, David Herrington, Jennifer E. Van Eyk, Yue Wang
DOI: https://doi.org/10.1021/acs.jproteome.0c00118
2020-01-01
Journal of Proteome Research
Abstract: Coronary artery disease remains a leading cause of death in industrialized nations, and early detection of disease is a critical intervention target to effectively treat patients and manage risk. Proteomic analysis of mixed tissue homogenates may obscure subtle protein changes that occur uniquely in underlying tissue subtypes. The unsupervised 'convex analysis of mixtures' (CAM) tool has previously been shown to effectively segregate cellular subtypes from mixed expression data. In this study, we hypothesized that CAM would identify proteomic information specifically informative to early atherosclerosis lesion involvement that could lead to potential markers of early disease detection. We quantified the proteome of 99 paired abdominal aorta (AA) and left anterior descending coronary artery (LAD) specimens (N = 198 specimens total) acquired during autopsy of young adults free of diagnosed cardiac disease. The CAM tool was then used to segregate protein subsets uniquely associated with different underlying tissue types, yielding markers of normal and fibrous plaque (FP) tissues in LAD and AA (N = 62 lesion markers). CAM-derived FP marker expression was validated against pathologist-estimated luminal surface involvement of FP, as well as in an orthogonal cohort of "pure" fibrous plaque, fatty streak, and normal vascular specimens. A targeted mass spectrometry (MS) assay quantified 39 of 62 CAM-FP markers in plasma from women with angiographically verified coronary artery disease (CAD, N = 46) or free from apparent CAD (control, N = 40). Elastic net variable selection with logistic regression reduced this list to 10 proteins capable of classifying CAD status in this cohort with <6% misclassification error, and a mean area under the receiver operating characteristic curve of 0.992 (confidence interval 0.968-0.998) after cross-validation.
The proteomics-CAM workflow identified lesion-specific molecular biomarker candidates by distilling the most representative molecules from heterogeneous tissue types.
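
The abstract summarizes classifier performance with two metrics: misclassification error and area under the receiver operating characteristic curve (AUC). As a minimal sketch of how these two numbers are typically computed from a classifier's predicted probabilities, the Python below uses made-up toy labels and scores, not the study's data. The function names `roc_auc` and `misclassification_error` and the 0.5 decision threshold are assumptions for illustration only, and nothing here reproduces the authors' elastic net model or cross-validation scheme.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def misclassification_error(labels, scores, threshold=0.5):
    """Fraction of cases whose thresholded prediction disagrees with the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    wrong = sum(1 for y, p in zip(labels, preds) if y != p)
    return wrong / len(labels)


# Hypothetical toy data: 1 = CAD, 0 = control (NOT values from the paper).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
err = misclassification_error(labels, scores)
print(f"AUC = {auc:.4f}, misclassification error = {err:.2%}")
```

In a cross-validated setting such as the one the abstract describes, these metrics would be computed on held-out folds and then averaged, which is why the reported AUC is a mean with a confidence interval.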