DeepciRGO: Functional Prediction of Circular RNAs Through Hierarchical Deep Neural Networks Using Heterogeneous Network Features

Deng Lei, Lin Wei, Wang Jiacheng, Zhang Jingpu
DOI: https://doi.org/10.1186/s12859-020-03748-3
IF: 3.307
2020-01-01
BMC Bioinformatics
Abstract:
Background: Circular RNAs (circRNAs) are special noncoding RNA molecules with closed loop structures. Compared with traditional linear RNA, circRNA is more stable and not easily degraded. Many studies have shown that circRNAs are involved in the regulation of various diseases and cancers. Determining the functions of circRNAs in mammalian cells is of great significance for revealing their mechanism of action in physiological and pathological processes, and for the diagnosis and treatment of diseases. However, determining the functions of circRNAs on a large scale is a challenging task because of the high experimental costs.
Results: In this paper, we present a hierarchical deep learning model, DeepciRGO, which can effectively predict gene ontology functions of circRNAs. We build a heterogeneous network containing circRNA co-expressions, protein-protein interactions and protein-circRNA interactions. The topology features of proteins and circRNAs are calculated across the heterogeneous network using HIN2Vec, a novel representation learning approach. Then, a deep multi-label hierarchical classification model is trained with the topology features to predict the biological process function in the gene ontology for each circRNA. In particular, we manually curated a benchmark dataset, circRNA2GO-62, containing 185 GO annotations for 62 circRNAs. DeepciRGO achieves promising performance on the circRNA2GO-62 dataset with a maximum F-measure of 0.412, a recall score of 0.400, and an accuracy of 0.425, which are significantly better than other state-of-the-art RNA function prediction methods. In addition, we demonstrate the considerable potential of integrating multiple interactions and association networks.
Conclusions: DeepciRGO will be a useful tool for accurately annotating circRNAs. The experimental results show that integrating multi-source data can help to improve the predictive performance of DeepciRGO. Moreover, the model can also combine RNA structure and sequence information to further optimize predictive performance.
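The abstract reports a maximum F-measure but does not define it. As a rough illustration, the sketch below assumes the sample-averaged definition commonly used in GO function prediction benchmarks: for each score threshold, precision is averaged over samples with at least one predicted term, recall over samples with at least one true term, and the best resulting F-score is kept. The function name `f_max`, its inputs, and the threshold grid are hypothetical and are not taken from the DeepciRGO implementation.

```python
def f_max(y_true, y_score, thresholds=None):
    """Return (best F-measure, threshold at which it is reached).

    y_true:  list of rows, each a list of 0/1 labels (one column per GO term).
    y_score: list of rows of predicted scores in [0, 1], same shape as y_true.
    Assumption: sample-averaged precision and recall, as described above.
    """
    if thresholds is None:
        thresholds = [i / 100 for i in range(1, 101)]  # 0.01, 0.02, ..., 1.00

    best_f, best_t = 0.0, None
    for t in thresholds:
        precisions, recalls = [], []
        for truth, scores in zip(y_true, y_score):
            predicted = {j for j, s in enumerate(scores) if s >= t}
            actual = {j for j, v in enumerate(truth) if v}
            tp = len(predicted & actual)
            if predicted:  # precision only counts samples with a prediction
                precisions.append(tp / len(predicted))
            if actual:  # recall only counts samples with a true annotation
                recalls.append(tp / len(actual))
        if not precisions or not recalls:
            continue
        p = sum(precisions) / len(precisions)
        r = sum(recalls) / len(recalls)
        if p + r > 0:
            f = 2 * p * r / (p + r)
            if f > best_f:  # strict '>' keeps the lowest best threshold
                best_f, best_t = f, t
    return best_f, best_t


if __name__ == "__main__":
    # Toy example: one circRNA, three candidate GO terms.
    labels = [[1, 1, 0]]
    scores = [[0.9, 0.4, 0.6]]
    print(f_max(labels, scores))
```

Because the threshold is chosen to maximise the score, a maximum F-measure is an optimistic summary; the accompanying recall and accuracy figures in the abstract should be read as values at a particular operating point.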