Machine learning-based analysis identifies and validates serum exosomal proteomic signatures for the diagnosis of colorectal cancer

Haofan Yin, Jinye Xie, Shan Xing, Xiaofang Lu, Yu Yu, Yong Ren, Jian Tao, Guirong He, Lijun Zhang, Xiaopeng Yuan, Zheng Yang, Zhijian Huang
DOI: https://doi.org/10.1016/j.xcrm.2024.101689
2024-08-20
Abstract: The potential of serum extracellular vesicles (EVs) as non-invasive biomarkers for diagnosing colorectal cancer (CRC) remains elusive. We employed an in-depth 4D-DIA proteomics and machine learning (ML) pipeline to identify key proteins, PF4 and AACT, for CRC diagnosis in serum EV samples from a discovery cohort of 37 cases. PF4 and AACT outperformed the traditional biomarkers CEA and CA19-9 when detected by ELISA in 912 individuals. Furthermore, we developed an EV-related random forest (RF) model with the highest diagnostic efficiency, achieving AUC values of 0.960 and 0.963 in the training and test sets, respectively. Notably, this model demonstrated reliable diagnostic performance for early-stage CRC and for distinguishing CRC from benign colorectal diseases. Additionally, multi-omics approaches were employed to predict the functions and potential sources of serum EV-derived proteins. Collectively, our study identified the crucial proteomic signatures in serum EVs and established a promising EV-related RF model for CRC diagnosis in the clinic.
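The AUC values quoted in the abstract summarize how well a model's scores rank CRC cases above controls. As a minimal sketch of that metric only, the function below computes AUC through the pairwise (Mann-Whitney) formulation. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations; they are not data or code from the study, and the paper's own RF pipeline is not reproduced here.

```python
def roc_auc(labels, scores):
    """Return the ROC AUC as the fraction of (positive, negative) pairs
    in which the positive sample receives the higher score.
    Tied pairs count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 1 = case, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 1.0 would mean every case is scored above every control, while 0.5 corresponds to random ranking. Reporting the metric separately on training and test sets, as the abstract does, is a common way to check that a model's performance is not limited to the data it was fitted on.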