Prediction of Liquid-Liquid Phase Separation Proteins Using Machine Learning

Chu Xiaoquan, Sun Tanlin, Li Qian, Xu Youjun, Zhang Zhuqing, Lai Luhua, Pei Jianfeng
DOI: https://doi.org/10.1186/s12859-022-04599-w
2020-01-01
SSRN Electronic Journal
Abstract: The liquid-liquid phase separation (LLPS) of biomolecules in cells underpins the formation of membraneless organelles, which are condensates of protein, nucleic acid, or both, and play critical roles in cellular functions. Dysregulation of LLPS may be implicated in a number of diseases. Although the LLPS of biomolecules has been investigated intensively in recent years, knowledge of the prevalence and distribution of phase separation proteins (PSPs) still lags behind. Developing computational methods to predict PSPs is therefore of great importance for a comprehensive understanding of the biological function of LLPS. Here, a sequence-based machine learning tool for predicting LLPS proteins (PSPredictor) was developed. The model achieves a maximum 10-fold cross-validation (10-CV) accuracy of 96.03% and performs much better at identifying new PSPs than previously reported PSP prediction tools. As far as we know, this is the first attempt to make a direct and more general prediction of LLPS proteins based only on sequence information.