Machine Learning Enables Comprehensive Prediction of the Relative Protein Abundance of Multiple Proteins on the Protein Corona

Xiuhao Fu,Chao Yang,Yunyun Su,Chunling Liu,Haoye Qiu,Yanyan Yu,Gaoxing Su,Qingchen Zhang,Leyi Wei,Feifei Cui,Quan Zou,Zilong Zhang
DOI: https://doi.org/10.34133/research.0487
2024-09-25
Abstract:Understanding protein corona composition is essential for evaluating their potential applications in biomedicine. Relative protein abundance (RPA), accounting for the total proteins in the corona, is an important parameter for describing the protein corona. For the first time, we comprehensively predicted the RPA of multiple proteins on the protein corona. First, we used multiple machine learning algorithms to predict whether a protein adsorbs to a nanoparticle, which is dichotomous prediction. Then, we selected the top 3 performing machine learning algorithms in dichotomous prediction to predict the specific value of RPA, which is regression prediction. Meanwhile, we analyzed the advantages and disadvantages of different machine learning algorithms for RPA prediction through interpretable analysis. Finally, we mined important features about the RPA prediction, which provided effective suggestions for the preliminary design of protein corona. The service for the prediction of RPA is available at http://www.bioai-lab.com/PC_ML.
What problem does this paper attempt to address?