m5UMCB: Prediction of RNA 5-methyluridine sites using multi-scale convolutional neural network with BiLSTM

Yingshan Ji, Jianqiang Sun, Jingxuan Xie, Wei Wu, Stella C. Shuai, Qi Zhao, Wei Chen
DOI: https://doi.org/10.1016/j.compbiomed.2023.107793
IF: 7.7
2023-12-04
Computers in Biology and Medicine
Abstract: As a prevalent RNA modification, 5-methyluridine (m5U) plays a critical role in diverse biological processes and disease pathogenesis. High-throughput identification of m5U typically relies on labor-intensive biochemical experiments using various sequencing-based techniques, which are both time-consuming and expensive. Consequently, there is a pressing need for more efficient and cost-effective computational methods to complement these high-throughput techniques. In this study, we present m5UMCB, a novel approach that combines a multi-scale convolutional neural network (CNN) with bidirectional long short-term memory (BiLSTM) to recognize m5U sites. Our method segments RNA sequences into 3-mer fragments and maps each fragment to a low-dimensional vector representation using the global vectors for word representation (GloVe) technique. Through a series of multi-scale convolution and pooling operations, local features are extracted from the RNA sequences and transformed into abstract, high-level features. The feature matrix is then fed into a BiLSTM network, which captures contextual information and long-term dependencies within the sequence. Finally, a fully connected layer classifies m5U sites. Results from 5-fold cross-validation (5-fold CV) indicate that m5UMCB outperforms existing state-of-the-art predictive methods, with a 1.98% increase in the area under the ROC curve (AUC) and significant improvements in relevant evaluation metrics. We are confident that m5UMCB will serve as a valuable tool for m5U prediction.
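The 3-mer segmentation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say whether fragments overlap, so the sliding window with `stride=1` is an assumption, and the function names `kmer_tokens`, `build_vocab`, and `encode` are hypothetical. The integer indices only mark where a learned GloVe vector would be looked up; no embedding is trained here.

```python
from itertools import product


def kmer_tokens(seq, k=3, stride=1):
    """Split an RNA sequence into k-mers with a sliding window.

    Assumption: overlapping windows (stride=1); the abstract does not
    specify the stride. DNA-style input is converted to RNA (T -> U).
    """
    seq = seq.upper().replace("T", "U")
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def build_vocab(k=3, alphabet="ACGU"):
    """Enumerate all len(alphabet)**k possible k-mers and index them."""
    return {"".join(p): idx for idx, p in enumerate(product(alphabet, repeat=k))}


def encode(seq, vocab, k=3, stride=1):
    """Map each k-mer to its vocabulary index.

    k-mers containing symbols outside the alphabet (e.g. N) map to -1.
    In a full pipeline each index would select a row of an embedding
    matrix (GloVe vectors in the paper).
    """
    return [vocab.get(tok, -1) for tok in kmer_tokens(seq, k, stride)]


vocab = build_vocab(k=3)
tokens = kmer_tokens("AUGGCUA")
ids = encode("AUGGCUA", vocab)
print(tokens)
print(ids)
```

With a stride of 1, a sequence of length n yields n - 2 overlapping 3-mers, and the vocabulary over A, C, G, U contains 64 entries. Setting `stride=3` would give non-overlapping fragments instead.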
engineering, biomedical; computer science, interdisciplinary applications; mathematical & computational biology; biology