Identifying piRNA targets on mRNAs in C. elegans using a deep multi-head attention network

Tzu-Hsien Yang, Sheng-Cian Shiue, Kuan-Yu Chen, Yan-Yuan Tseng, Wei-Sheng Wu
DOI: https://doi.org/10.1186/s12859-021-04428-6
IF: 3.307
2021-10-16
BMC Bioinformatics
Abstract:
Background: Piwi-interacting RNAs (piRNAs) are small non-coding RNAs (ncRNAs) that silence genomic transposable elements. Researchers have also found that piRNAs regulate various endogenous transcripts. However, there is no systematic understanding of piRNA binding patterns or of how piRNAs target genes. While various prediction methods have been developed for other similar ncRNAs (e.g., miRNAs), piRNAs have distinctive characteristics and require their own computational model for binding target prediction.
Results: Recently, transcriptome-wide piRNA binding events in C. elegans were probed by PRG-1 CLASH experiments. Based on the probed piRNA–messenger RNA (mRNA) binding pairs, we devised the first deep learning architecture based on multi-head attention to computationally identify piRNA target sites on mRNAs. In the devised deep network, the given piRNA and mRNA segment sequences are first one-hot encoded and then undergo a combined operation of convolution and squeezing-extraction to unravel motif patterns. We then incorporate a novel multi-head attention sub-network to extract hidden piRNA binding rules that can simulate the biological piRNA target recognition process. Finally, true piRNA–mRNA binding pairs are identified by a deep fully connected sub-network. Our model achieves a high discriminatory power of AUC = 93.3% on an independent test set and successfully extracts the verified binding pattern of a synthetic piRNA. These results demonstrate that the devised model achieves high prediction performance and suggests testable potential biological piRNA binding rules.
Conclusions: In this research, we developed the first deep learning method to identify piRNA targeting sites on C. elegans mRNAs. The developed method is shown to be highly accurate and can provide biological insights into piRNA–mRNA binding patterns.
The piRNA binding target identification network can be downloaded from http://cosbi2.ee.ncku.edu.tw/data_download/piRNA_mRNA_binding .
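To make the pipeline described in the abstract more concrete, the following is a minimal sketch of two of its ingredients: one-hot encoding of RNA sequences and multi-head scaled dot-product attention. It is not the authors' network. The function names, the example sequences, the use of piRNA positions as queries over mRNA positions, the head count, the head dimension, and the random (untrained) projection weights are all assumptions made for illustration. The convolution, squeezing-extraction, and fully connected stages are omitted.

```python
import math
import random

ALPHABET = "ACGU"  # assumed channel order for the one-hot encoding


def one_hot(seq):
    """Encode an RNA sequence as a list of 4-dimensional one-hot vectors.

    DNA-style 'T' is mapped to 'U'; any other symbol becomes an all-zero row.
    """
    seq = seq.upper().replace("T", "U")
    return [[1.0 if base == ref else 0.0 for ref in ALPHABET] for base in seq]


def matmul(a, b):
    """Plain-Python matrix product of a (n x k) and b (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    top = max(row)
    exps = [math.exp(v - top) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(queries, keys, values, w_q, w_k, w_v):
    """Scaled dot-product attention for a single head.

    Returns the attended output and the attention weight matrix.
    """
    q = matmul(queries, w_q)
    k = matmul(keys, w_k)
    v = matmul(values, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned projection weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def multi_head_attention(pirna, mrna, num_heads=2, head_dim=3, seed=0):
    """Attend from each piRNA position to all mRNA positions (an assumption).

    Head outputs are concatenated per piRNA position.
    """
    rng = random.Random(seed)
    q_in, kv_in = one_hot(pirna), one_hot(mrna)
    outputs, all_weights = [], []
    for _ in range(num_heads):
        w_q = random_matrix(len(ALPHABET), head_dim, rng)
        w_k = random_matrix(len(ALPHABET), head_dim, rng)
        w_v = random_matrix(len(ALPHABET), head_dim, rng)
        out, weights = attention_head(q_in, kv_in, kv_in, w_q, w_k, w_v)
        outputs.append(out)
        all_weights.append(weights)
    concat = [sum((head[i] for head in outputs), []) for i in range(len(q_in))]
    return concat, all_weights


if __name__ == "__main__":
    # Hypothetical toy sequences, far shorter than real piRNA or mRNA segments.
    features, weights = multi_head_attention("UGAC", "ACGUAC")
    print(len(features), len(features[0]), len(weights))
```

In a trained model the projection matrices would be learned from labeled binding pairs, and the per-head weight matrices are the kind of quantity one could inspect for candidate binding patterns. With random weights, as here, they carry no biological meaning.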
biochemical research methods, biotechnology & applied microbiology, mathematical & computational biology