Identification of sub-Golgi protein localization by use of deep representation learning features

Zhibin Lv, Pingping Wang, Quan Zou, Qinghua Jiang
DOI: https://doi.org/10.1093/bioinformatics/btaa1074
IF: 5.8
2020-01-01
Bioinformatics
Abstract: Motivation: The Golgi apparatus has a key functional role in protein biosynthesis within the eukaryotic cell, and its malfunction results in various neurodegenerative diseases. For a better understanding of the Golgi apparatus, it is essential to identify the sub-Golgi localization of proteins. Although some machine learning methods have been used to identify sub-Golgi localization proteins by sequence representation fusion, more accurate sub-Golgi protein identification remains challenging with existing methodology. Results: We developed a protein sub-Golgi localization identification protocol using deep representation learning features with 107 dimensions. With this protocol, we demonstrated that, instead of the multi-type protein sequence feature representation fusion used in previous state-of-the-art sub-Golgi protein localization classifiers, it is sufficient to exploit only one type of feature representation to identify sub-Golgi proteins more accurately. Judged by independent testing results on benchmark datasets, our protocol performs generally, reliably and robustly for sub-Golgi protein localization prediction. Availability and implementation: A user-friendly webserver is freely accessible at http://isGP-DRLF.aibiochem.net and the prediction code is accessible at https://github.com/zhibinlv/isGP-DRLF. Contact: zouquan@nclab.net or qhjiang@hit.edu.cn. Supplementary information: Supplementary data are available at Bioinformatics online.
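The abstract specifies only the shape of the classifier input: a single feature representation with 107 dimensions per protein. It does not name the classifier or the feature extractor here. As a rough illustration of that input/output shape, the sketch below trains a nearest-centroid classifier on synthetic 107-dimensional vectors. The nearest-centroid rule is a stand-in chosen for brevity, not the authors' method, and the class labels (`class_a`, `class_b`), the synthetic data, and all function names are hypothetical.

```python
import random

FEATURE_DIM = 107  # feature dimensionality reported in the abstract


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit(features, labels):
    """Build a model holding one centroid per class label."""
    by_label = {}
    for vec, label in zip(features, labels):
        if len(vec) != FEATURE_DIM:
            raise ValueError(f"expected {FEATURE_DIM} features, got {len(vec)}")
        by_label.setdefault(label, []).append(vec)
    return {label: centroid(vecs) for label, vecs in by_label.items()}


def predict(model, vec):
    """Return the label whose centroid lies closest to vec."""
    return min(model, key=lambda label: squared_distance(model[label], vec))


def make_synthetic(rng, shift, n):
    """Hypothetical stand-in for real features: Gaussian noise around `shift`."""
    return [[rng.gauss(shift, 1.0) for _ in range(FEATURE_DIM)] for _ in range(n)]


# Synthetic training set: two hypothetical localization classes.
rng = random.Random(0)
train_x = make_synthetic(rng, 0.0, 20) + make_synthetic(rng, 1.0, 20)
train_y = ["class_a"] * 20 + ["class_b"] * 20
model = fit(train_x, train_y)
```

In practice the synthetic vectors would be replaced by real per-protein features and the stand-in rule by whatever classifier the released prediction code uses; the only detail taken from the abstract is the fixed 107-dimensional input.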