Prediction of protein solubility based on sequence physicochemical patterns and distributed representation information with DeepSoluE

Chao Wang, Quan Zou
DOI: https://doi.org/10.1186/s12915-023-01510-8
IF: 7.364
2023-01-27
BMC Biology
Abstract: Protein solubility is a precondition for efficient heterologous protein expression, which underlies most industrial applications, and for functional interpretation in basic research. However, the recurrent formation of inclusion bodies remains an unavoidable roadblock in protein science and industry: only about a quarter of proteins can be successfully expressed in soluble form. Although numerous solubility prediction models have been developed over time, their performance remains unsatisfactory given the rapid increase in available protein sequences. Hence, it is imperative to develop novel and highly accurate predictors that enable the prioritization of highly soluble proteins and thereby reduce the cost of experimental work.
biology