Deep-learning enables proteome-scale identification of phase-separated protein candidates from immunofluorescence images

Chunyu Yu, Boyan Shen, Qi Huang, Minglei Shi, Kaiqiang You, Congying Wu, Yang Chen, Tingting Li
DOI: https://doi.org/10.1101/636738
IF: 9.5
2019-01-01
Briefings in Bioinformatics
Abstract: Intrinsically disordered region (IDR) analysis has been widely used to screen for phase-separated proteins. However, the precise sequences that determine phase separation remain unclear. Furthermore, a large number of phase-separated proteins with relatively low IDR content remain uncharacterized. Phase-separated proteins appear as spherical droplet structures in immunofluorescence (IF) images, which makes them distinguishable from non-phase-separated proteins. Here, we recast phase-separated protein recognition as a binary image classification problem and established a method named IDeepPhase, based on convolutional neural networks, to identify IF images with spherical droplet structures. Applying IDeepPhase to proteome-scale IF images from the Human Protein Atlas database, we generated a comprehensive list of phase-separated candidates that display spherical droplet structures in IF images, allowing nomination of proteins, antibodies and cell lines for subsequent phase separation studies.
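
As an illustration of the last step described in the abstract, the sketch below shows one way image-level classifier outputs could be rolled up into a candidate list of proteins with their supporting antibodies and cell lines. It is not IDeepPhase's published procedure: the function name `nominate_candidates`, the 0.5 probability threshold, the majority-fraction rule, and all protein and antibody identifiers are hypothetical placeholders chosen for the example.

```python
from collections import defaultdict


def nominate_candidates(image_scores, threshold=0.5, min_fraction=0.5):
    """Group image-level droplet probabilities by protein.

    image_scores: iterable of (protein, antibody, cell_line, probability),
    where probability is a binary classifier's score for "droplet-like".
    Both threshold and min_fraction are assumed values, not from the paper.
    """
    per_protein = defaultdict(list)
    for protein, antibody, cell_line, prob in image_scores:
        # Binary decision per image: droplet-like or not.
        per_protein[protein].append((antibody, cell_line, prob >= threshold))

    candidates = {}
    for protein, records in per_protein.items():
        positives = [(ab, cl) for ab, cl, is_pos in records if is_pos]
        # Keep a protein when enough of its images are called positive.
        if len(positives) / len(records) >= min_fraction:
            candidates[protein] = sorted(set(positives))
    return candidates


# Hypothetical scores; identifiers are placeholders, not real results.
scores = [
    ("PROT_A", "AB1", "U-2 OS", 0.93),
    ("PROT_A", "AB1", "A-431", 0.81),
    ("PROT_B", "AB2", "U-2 OS", 0.12),
    ("PROT_B", "AB2", "A-431", 0.64),
    ("PROT_B", "AB2", "MCF7", 0.20),
    ("PROT_C", "AB3", "U-2 OS", 0.05),
]
result = nominate_candidates(scores)
print(result)
```

With these placeholder scores only `PROT_A` is nominated, together with the antibody and cell line pairs whose images were called positive. A stricter or looser `min_fraction` trades candidate-list size against confidence.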