AnnoPRO: an Innovative Strategy for Protein Function Annotation Based on Image-like Protein Representation and Multimodal Deep Learning
Lingyan Zheng, Shuiyang Shi, Fang Pan, Hongning Zhang, Zhengqiang Pan, Zhenhua Huang, Weiqi Xia, Honglin Li, Zhenyu Zeng, Shun Zhang, Yu Zong Chen, Mingkun Lu, Zhaorong Li, Feng Zhu
DOI: https://doi.org/10.1101/2023.05.13.540619
2023-01-01
Abstract: Protein function annotation is a longstanding challenge that is key to discovering drug targets and understanding physiological and pathological processes. A variety of computational methods have therefore been constructed to facilitate research in this direction. However, computational annotation of protein function suffers from a serious "long-tail problem": most functional labels are associated with only a few proteins, and it remains extremely challenging for existing methods to improve prediction accuracy for protein families at the tail label levels. In this study, an innovative strategy for protein function annotation, entitled "AnnoPRO", was therefore constructed. First, a novel method enabling image-like protein representations was proposed. This method is unique in capturing the intrinsic correlations among protein features, which greatly favors the application of state-of-the-art deep learning methods popular in image classification. Second, a multimodal framework integrating a multichannel convolutional neural network and a long short-term memory neural network was constructed to realize deep learning-based protein function annotation. Since this framework was inspired by a reputable method used in image classification for dealing with its own "long-tail problem", AnnoPRO was expected to significantly improve annotation performance for protein families at the tail label levels. Multiple case studies based on benchmark datasets were also conducted, which confirmed the superior performance of AnnoPRO compared with existing methods. All source code and models of AnnoPRO are freely available to all users at https://github.com/idrblab/AnnoPRO and are expected to be an essential complement to existing methods.
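To make the idea of an image-like protein representation concrete, the following is a minimal illustrative sketch and not the AnnoPRO procedure itself, which the abstract does not specify. It assumes a hypothetical matrix `X` of numeric protein descriptors (rows are proteins, columns are features), orders the features with a simple greedy rule so that strongly correlated features sit next to each other, and then lays each protein's reordered vector onto a square grid. The function names `feature_order_by_correlation` and `to_image`, the greedy ordering, and the random data are all assumptions made for illustration.

```python
import numpy as np


def feature_order_by_correlation(X):
    """Greedy ordering (an illustrative assumption, not AnnoPRO's method):
    start at feature 0 and repeatedly append the unvisited feature that is
    most correlated (in absolute value) with the last one chosen."""
    corr = np.abs(np.corrcoef(X, rowvar=False))  # feature-by-feature correlations
    n = corr.shape[0]
    order = [0]
    remaining = set(range(1, n))
    while remaining:
        last = order[-1]
        nxt = max(remaining, key=lambda j: corr[last, j])
        order.append(nxt)
        remaining.remove(nxt)
    return np.array(order)


def to_image(x, order, side):
    """Place one protein's reordered feature vector on a side x side grid,
    padding unused cells with zeros."""
    img = np.zeros(side * side, dtype=float)
    img[: len(order)] = x[order]
    return img.reshape(side, side)


# Hypothetical data: 50 proteins described by 10 numeric descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 10))

order = feature_order_by_correlation(X)
side = int(np.ceil(np.sqrt(X.shape[1])))  # smallest square grid that fits all features
image = to_image(X[0], order, side)
```

The resulting 2D array can be fed to a convolutional network in the same way as a single-channel image. Because neighboring cells hold correlated features, local convolution filters have meaningful structure to pick up, which is the general motivation the abstract gives for image-like representations.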